linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/18] add new features for FPGA DFL drivers
@ 2019-04-29  8:55 Wu Hao
  2019-04-29  8:55 ` [PATCH v2 01/18] fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address Wu Hao
                   ` (17 more replies)
  0 siblings, 18 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao

This patchset adds more features support for FPGA Device Feature List
(DFL) drivers, including PR enhancement, virtualization support based
on PCIe SRIOV, private features to Port, private features to FME, and
enhancement to DFL framework. Please refer to details in below list.

Main changes from v1:
 - split the clean up code in a separated patch (patch #2)
 - add cpu_feature_enabled check for AVX512 code (patch #4)
 - improve sysfs return values and also sysfs doc (patch #12 #17)
 - create a hwmon for thermal management sysfs interfaces (patch #15)
 - create a hwmon for power management sysfs interfaces (patch #16)
 - update docmentation according to above changes (patch #5)
 - improve sysfs doc for performance reporting support (patch #18)

Wu Hao (18):
  fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address.
  fpga: dfl: fme: remove copy_to_user() in ioctl for PR
  fpga: dfl: fme: align PR buffer size per PR datawidth
  fpga: dfl: fme: support 512bit data width PR
  Documentation: fpga: dfl: add descriptions for virtualization and new
    interfaces.
  fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support.
  fpga: dfl: pci: enable SRIOV support.
  fpga: dfl: afu: add AFU state related sysfs interfaces
  fpga: dfl: afu: add userclock sysfs interfaces.
  fpga: dfl: add id_table for dfl private feature driver
  fpga: dfl: afu: export __port_enable/disable function.
  fpga: dfl: afu: add error reporting support.
  fpga: dfl: afu: add STP (SignalTap) support
  fpga: dfl: fme: add capability sysfs interfaces
  fpga: dfl: fme: add thermal management support
  fpga: dfl: fme: add power management support
  fpga: dfl: fme: add global error reporting support
  fpga: dfl: fme: add performance reporting support

 Documentation/ABI/testing/sysfs-platform-dfl-fme  | 322 ++++++++
 Documentation/ABI/testing/sysfs-platform-dfl-port | 104 +++
 Documentation/fpga/dfl.txt                        | 114 +++
 drivers/fpga/Kconfig                              |   2 +-
 drivers/fpga/Makefile                             |   4 +-
 drivers/fpga/dfl-afu-error.c                      | 225 +++++
 drivers/fpga/dfl-afu-main.c                       | 335 +++++++-
 drivers/fpga/dfl-afu.h                            |   7 +
 drivers/fpga/dfl-fme-error.c                      | 385 +++++++++
 drivers/fpga/dfl-fme-main.c                       | 583 ++++++++++++-
 drivers/fpga/dfl-fme-mgr.c                        | 117 ++-
 drivers/fpga/dfl-fme-perf.c                       | 950 ++++++++++++++++++++++
 drivers/fpga/dfl-fme-pr.c                         |  65 +-
 drivers/fpga/dfl-fme.h                            |   9 +-
 drivers/fpga/dfl-pci.c                            |  40 +
 drivers/fpga/dfl.c                                | 170 +++-
 drivers/fpga/dfl.h                                |  56 +-
 include/uapi/linux/fpga-dfl.h                     |  32 +
 18 files changed, 3439 insertions(+), 81 deletions(-)
 create mode 100644 drivers/fpga/dfl-afu-error.c
 create mode 100644 drivers/fpga/dfl-fme-error.c
 create mode 100644 drivers/fpga/dfl-fme-perf.c

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 01/18] fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR Wu Hao
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao

FME_PR_INTFC_ID is used as compat_id for fpga manager and region,
but high 64 bits and low 64 bits of the compat_id are swapped by
mistake. This patch fixes this problem by fixing register address.

Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
---
 drivers/fpga/dfl-fme-mgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c
index 76f3770..b3f7eee 100644
--- a/drivers/fpga/dfl-fme-mgr.c
+++ b/drivers/fpga/dfl-fme-mgr.c
@@ -30,8 +30,8 @@
 #define FME_PR_STS		0x10
 #define FME_PR_DATA		0x18
 #define FME_PR_ERR		0x20
-#define FME_PR_INTFC_ID_H	0xA8
-#define FME_PR_INTFC_ID_L	0xB0
+#define FME_PR_INTFC_ID_L	0xA8
+#define FME_PR_INTFC_ID_H	0xB0
 
 /* FME PR Control Register Bitfield */
 #define FME_PR_CTRL_PR_RST	BIT_ULL(0)  /* Reset PR engine */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
  2019-04-29  8:55 ` [PATCH v2 01/18] fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-07 17:26   ` Moritz Fischer
  2019-04-29  8:55 ` [PATCH v2 03/18] fpga: dfl: fme: align PR buffer size per PR datawidth Wu Hao
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

This patch removes copy_to_user() code in partial reconfiguration
ioctl, as it's useless as user never needs to read the data
structure after ioctl.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: clean up code split from patch 2 in v1 patchset.
---
 drivers/fpga/dfl-fme-pr.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
index d9ca955..6ec0f09 100644
--- a/drivers/fpga/dfl-fme-pr.c
+++ b/drivers/fpga/dfl-fme-pr.c
@@ -159,9 +159,6 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 	mutex_unlock(&pdata->lock);
 free_exit:
 	vfree(buf);
-	if (copy_to_user((void __user *)arg, &port_pr, minsz))
-		return -EFAULT;
-
 	return ret;
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 03/18] fpga: dfl: fme: align PR buffer size per PR datawidth
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
  2019-04-29  8:55 ` [PATCH v2 01/18] fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address Wu Hao
  2019-04-29  8:55 ` [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-07 17:27   ` Moritz Fischer
  2019-04-29  8:55 ` [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR Wu Hao
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

Current driver checks if input bitstream file size is aligned or
not per PR data width (default 32bits). It requires one additional
step for end user when they generate the bitstream file, padding
extra zeros to bitstream file to align its size per PR data width,
but they don't have to as hardware will drop extra padding bytes
automatically.

In order to simplify the user steps, this patch aligns PR buffer
size per PR data width in driver, to allow user to pass unaligned
size bitstream files to driver.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
---
 drivers/fpga/dfl-fme-pr.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
index 6ec0f09..3c71dc3 100644
--- a/drivers/fpga/dfl-fme-pr.c
+++ b/drivers/fpga/dfl-fme-pr.c
@@ -74,6 +74,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 	struct dfl_fme *fme;
 	unsigned long minsz;
 	void *buf = NULL;
+	size_t length;
 	int ret = 0;
 	u64 v;
 
@@ -85,9 +86,6 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 	if (port_pr.argsz < minsz || port_pr.flags)
 		return -EINVAL;
 
-	if (!IS_ALIGNED(port_pr.buffer_size, 4))
-		return -EINVAL;
-
 	/* get fme header region */
 	fme_hdr = dfl_get_feature_ioaddr_by_id(&pdev->dev,
 					       FME_FEATURE_ID_HEADER);
@@ -103,7 +101,13 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 		       port_pr.buffer_size))
 		return -EFAULT;
 
-	buf = vmalloc(port_pr.buffer_size);
+	/*
+	 * align PR buffer per PR bandwidth, as HW ignores the extra padding
+	 * data automatically.
+	 */
+	length = ALIGN(port_pr.buffer_size, 4);
+
+	buf = vmalloc(length);
 	if (!buf)
 		return -ENOMEM;
 
@@ -140,7 +144,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 	fpga_image_info_free(region->info);
 
 	info->buf = buf;
-	info->count = port_pr.buffer_size;
+	info->count = length;
 	info->region_id = port_pr.port_id;
 	region->info = info;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (2 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 03/18] fpga: dfl: fme: align PR buffer size per PR datawidth Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-16 17:35   ` Alan Tull
  2019-04-29  8:55 ` [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces Wu Hao
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Ananda Ravuri, Xu Yilun

In early partial reconfiguration private feature, it only
supports 32bit data width when writing data to hardware for
PR. 512bit data width PR support is an important optimization
for some specific solutions (e.g. XEON with FPGA integrated),
it allows driver to use AVX512 instruction to improve the
performance of partial reconfiguration. e.g. programming one
100MB bitstream image via this 512bit data width PR hardware
only takes ~300ms, but 32bit revision requires ~3s per test
result.

Please note now this optimization is only done on revision 2
of this PR private feature which is only used in integrated
solution that AVX512 is always supported. This revision 2
hardware doesn't support 32bit PR.

Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: check AVX512 support using cpu_feature_enabled()
    fix other comments from Scott Wood <swood@redhat.com>
---
 drivers/fpga/dfl-fme-main.c |   3 ++
 drivers/fpga/dfl-fme-mgr.c  | 113 +++++++++++++++++++++++++++++++++++++-------
 drivers/fpga/dfl-fme-pr.c   |  43 +++++++++++------
 drivers/fpga/dfl-fme.h      |   2 +
 drivers/fpga/dfl.h          |   5 ++
 5 files changed, 135 insertions(+), 31 deletions(-)

diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 086ad24..076d74f 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -21,6 +21,8 @@
 #include "dfl.h"
 #include "dfl-fme.h"
 
+#define DRV_VERSION	"0.8"
+
 static ssize_t ports_num_show(struct device *dev,
 			      struct device_attribute *attr, char *buf)
 {
@@ -277,3 +279,4 @@ static int fme_remove(struct platform_device *pdev)
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS("platform:dfl-fme");
+MODULE_VERSION(DRV_VERSION);
diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c
index b3f7eee..d1a4ba5 100644
--- a/drivers/fpga/dfl-fme-mgr.c
+++ b/drivers/fpga/dfl-fme-mgr.c
@@ -22,14 +22,18 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/fpga/fpga-mgr.h>
 
+#include "dfl.h"
 #include "dfl-fme-pr.h"
 
+#define DRV_VERSION	"0.8"
+
 /* FME Partial Reconfiguration Sub Feature Register Set */
 #define FME_PR_DFH		0x0
 #define FME_PR_CTRL		0x8
 #define FME_PR_STS		0x10
 #define FME_PR_DATA		0x18
 #define FME_PR_ERR		0x20
+#define FME_PR_512_DATA		0x40 /* Data Register for 512bit datawidth PR */
 #define FME_PR_INTFC_ID_L	0xA8
 #define FME_PR_INTFC_ID_H	0xB0
 
@@ -67,8 +71,43 @@
 #define PR_WAIT_TIMEOUT   8000000
 #define PR_HOST_STATUS_IDLE	0
 
+#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512)
+
+#include <linux/cpufeature.h>
+#include <asm/fpu/api.h>
+
+static inline int is_cpu_avx512_enabled(void)
+{
+	return cpu_feature_enabled(X86_FEATURE_AVX512F);
+}
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+	kernel_fpu_begin();
+
+	asm volatile("vmovdqu64 (%0), %%zmm0;"
+		     "vmovntdq %%zmm0, (%1);"
+		     :
+		     : "r"(src), "r"(dst)
+		     : "memory");
+
+	kernel_fpu_end();
+}
+#else
+static inline int is_cpu_avx512_enabled(void)
+{
+	return 0;
+}
+
+static inline void copy512(const void *src, void __iomem *dst)
+{
+	WARN_ON_ONCE(1);
+}
+#endif
+
 struct fme_mgr_priv {
 	void __iomem *ioaddr;
+	unsigned int pr_datawidth;
 	u64 pr_error;
 };
 
@@ -169,7 +208,7 @@ static int fme_mgr_write(struct fpga_manager *mgr,
 	struct fme_mgr_priv *priv = mgr->priv;
 	void __iomem *fme_pr = priv->ioaddr;
 	u64 pr_ctrl, pr_status, pr_data;
-	int delay = 0, pr_credit, i = 0;
+	int ret = 0, delay = 0, pr_credit;
 
 	dev_dbg(dev, "start request\n");
 
@@ -181,9 +220,9 @@ static int fme_mgr_write(struct fpga_manager *mgr,
 
 	/*
 	 * driver can push data to PR hardware using PR_DATA register once HW
-	 * has enough pr_credit (> 1), pr_credit reduces one for every 32bit
-	 * pr data write to PR_DATA register. If pr_credit <= 1, driver needs
-	 * to wait for enough pr_credit from hardware by polling.
+	 * has enough pr_credit (> 1), pr_credit reduces one for every pr data
+	 * width write to PR_DATA register. If pr_credit <= 1, driver needs to
+	 * wait for enough pr_credit from hardware by polling.
 	 */
 	pr_status = readq(fme_pr + FME_PR_STS);
 	pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
@@ -192,7 +231,8 @@ static int fme_mgr_write(struct fpga_manager *mgr,
 		while (pr_credit <= 1) {
 			if (delay++ > PR_WAIT_TIMEOUT) {
 				dev_err(dev, "PR_CREDIT timeout\n");
-				return -ETIMEDOUT;
+				ret = -ETIMEDOUT;
+				goto done;
 			}
 			udelay(1);
 
@@ -200,21 +240,27 @@ static int fme_mgr_write(struct fpga_manager *mgr,
 			pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
 		}
 
-		if (count < 4) {
-			dev_err(dev, "Invalid PR bitstream size\n");
-			return -EINVAL;
+		WARN_ON(count < priv->pr_datawidth);
+
+		switch (priv->pr_datawidth) {
+		case 4:
+			pr_data = FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
+					     *(u32 *)buf);
+			writeq(pr_data, fme_pr + FME_PR_DATA);
+			break;
+		case 64:
+			copy512(buf, fme_pr + FME_PR_512_DATA);
+			break;
+		default:
+			WARN_ON_ONCE(1);
 		}
-
-		pr_data = 0;
-		pr_data |= FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
-				      *(((u32 *)buf) + i));
-		writeq(pr_data, fme_pr + FME_PR_DATA);
-		count -= 4;
+		buf += priv->pr_datawidth;
+		count -= priv->pr_datawidth;
 		pr_credit--;
-		i++;
 	}
 
-	return 0;
+done:
+	return ret;
 }
 
 static int fme_mgr_write_complete(struct fpga_manager *mgr,
@@ -279,6 +325,36 @@ static void fme_mgr_get_compat_id(void __iomem *fme_pr,
 	id->id_h = readq(fme_pr + FME_PR_INTFC_ID_H);
 }
 
+static u8 fme_mgr_get_pr_datawidth(struct device *dev, void __iomem *fme_pr)
+{
+	u8 revision = dfl_feature_revision(fme_pr);
+
+	if (revision < 2) {
+		/*
+		 * revision 0 and 1 only support 32bit data width partial
+		 * reconfiguration, so pr_datawidth is 4 (Byte).
+		 */
+		return 4;
+	} else if (revision == 2) {
+		/*
+		 * revision 2 hardware has optimization to support 512bit data
+		 * width partial reconfiguration with AVX512 instructions. So
+		 * pr_datawidth is 64 (Byte). As revision 2 hardware is only
+		 * used in integrated solution, CPU supports AVX512 instructions
+		 * for sure, but it still needs to check here as AVX512 could be
+		 * disabled in kernel (e.g. using clearcpuid boot option).
+		 */
+		if (is_cpu_avx512_enabled())
+			return 64;
+
+		dev_err(dev, "revision 2: AVX512 is disabled\n");
+		return 0;
+	}
+
+	dev_err(dev, "revision %d is not supported yet\n", revision);
+	return 0;
+}
+
 static int fme_mgr_probe(struct platform_device *pdev)
 {
 	struct dfl_fme_mgr_pdata *pdata = dev_get_platdata(&pdev->dev);
@@ -302,6 +378,10 @@ static int fme_mgr_probe(struct platform_device *pdev)
 			return PTR_ERR(priv->ioaddr);
 	}
 
+	priv->pr_datawidth = fme_mgr_get_pr_datawidth(dev, priv->ioaddr);
+	if (!priv->pr_datawidth)
+		return -ENODEV;
+
 	compat_id = devm_kzalloc(dev, sizeof(*compat_id), GFP_KERNEL);
 	if (!compat_id)
 		return -ENOMEM;
@@ -342,3 +422,4 @@ static int fme_mgr_remove(struct platform_device *pdev)
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS("platform:dfl-fme-mgr");
+MODULE_VERSION(DRV_VERSION);
diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
index 3c71dc3..cd94ba8 100644
--- a/drivers/fpga/dfl-fme-pr.c
+++ b/drivers/fpga/dfl-fme-pr.c
@@ -83,7 +83,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 	if (copy_from_user(&port_pr, argp, minsz))
 		return -EFAULT;
 
-	if (port_pr.argsz < minsz || port_pr.flags)
+	if (port_pr.argsz < minsz || port_pr.flags || !port_pr.buffer_size)
 		return -EINVAL;
 
 	/* get fme header region */
@@ -101,15 +101,25 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 		       port_pr.buffer_size))
 		return -EFAULT;
 
+	mutex_lock(&pdata->lock);
+	fme = dfl_fpga_pdata_get_private(pdata);
+	/* fme device has been unregistered. */
+	if (!fme) {
+		ret = -EINVAL;
+		goto unlock_exit;
+	}
+
 	/*
 	 * align PR buffer per PR bandwidth, as HW ignores the extra padding
 	 * data automatically.
 	 */
-	length = ALIGN(port_pr.buffer_size, 4);
+	length = ALIGN(port_pr.buffer_size, fme->pr_datawidth);
 
 	buf = vmalloc(length);
-	if (!buf)
-		return -ENOMEM;
+	if (!buf) {
+		ret = -ENOMEM;
+		goto unlock_exit;
+	}
 
 	if (copy_from_user(buf,
 			   (void __user *)(unsigned long)port_pr.buffer_address,
@@ -127,18 +137,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 
 	info->flags |= FPGA_MGR_PARTIAL_RECONFIG;
 
-	mutex_lock(&pdata->lock);
-	fme = dfl_fpga_pdata_get_private(pdata);
-	/* fme device has been unregistered. */
-	if (!fme) {
-		ret = -EINVAL;
-		goto unlock_exit;
-	}
-
 	region = dfl_fme_region_find(fme, port_pr.port_id);
 	if (!region) {
 		ret = -EINVAL;
-		goto unlock_exit;
+		goto free_exit;
 	}
 
 	fpga_image_info_free(region->info);
@@ -159,10 +161,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
 		fpga_bridges_put(&region->bridge_list);
 
 	put_device(&region->dev);
-unlock_exit:
-	mutex_unlock(&pdata->lock);
 free_exit:
 	vfree(buf);
+unlock_exit:
+	mutex_unlock(&pdata->lock);
 	return ret;
 }
 
@@ -388,6 +390,17 @@ static int pr_mgmt_init(struct platform_device *pdev,
 	mutex_lock(&pdata->lock);
 	priv = dfl_fpga_pdata_get_private(pdata);
 
+	/*
+	 * Initialize PR data width.
+	 * Only revision 2 supports 512bit datawidth for better performance,
+	 * other revisions use default 32bit datawidth. This is used for
+	 * buffer alignment.
+	 */
+	if (dfl_feature_revision(feature->ioaddr) == 2)
+		priv->pr_datawidth = 64;
+	else
+		priv->pr_datawidth = 4;
+
 	/* Initialize the region and bridge sub device list */
 	INIT_LIST_HEAD(&priv->region_list);
 	INIT_LIST_HEAD(&priv->bridge_list);
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index 5394a21..de20755 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -21,12 +21,14 @@
 /**
  * struct dfl_fme - dfl fme private data
  *
+ * @pr_datawidth: data width for partial reconfiguration.
  * @mgr: FME's FPGA manager platform device.
  * @region_list: linked list of FME's FPGA regions.
  * @bridge_list: linked list of FME's FPGA bridges.
  * @pdata: fme platform device's pdata.
  */
 struct dfl_fme {
+	int pr_datawidth;
 	struct platform_device *mgr;
 	struct list_head region_list;
 	struct list_head bridge_list;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index a8b869e..8851c6c 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -331,6 +331,11 @@ static inline bool dfl_feature_is_port(void __iomem *base)
 		(FIELD_GET(DFH_ID, v) == DFH_ID_FIU_PORT);
 }
 
+static inline u8 dfl_feature_revision(void __iomem *base)
+{
+	return (u8)FIELD_GET(DFH_REVISION, readq(base + DFH));
+}
+
 /**
  * struct dfl_fpga_enum_info - DFL FPGA enumeration information
  *
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (3 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-16 17:36   ` Alan Tull
  2019-04-29  8:55 ` [PATCH v2 06/18] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support Wu Hao
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

This patch adds virtualization support description for DFL based
FPGA devices (based on PCIe SRIOV), and introductions to new
interfaces added by new dfl private feature drivers.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: update description for thermal/power management user interfaces.
---
 Documentation/fpga/dfl.txt | 115 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)

diff --git a/Documentation/fpga/dfl.txt b/Documentation/fpga/dfl.txt
index 6df4621..36610e0 100644
--- a/Documentation/fpga/dfl.txt
+++ b/Documentation/fpga/dfl.txt
@@ -84,6 +84,8 @@ The following functions are exposed through ioctls:
  Get driver API version (DFL_FPGA_GET_API_VERSION)
  Check for extensions (DFL_FPGA_CHECK_EXTENSION)
  Program bitstream (DFL_FPGA_FME_PORT_PR)
+ Assign port to PF (DFL_FPGA_FME_PORT_ASSIGN)
+ Release port from PF (DFL_FPGA_FME_PORT_RELEASE)
 
 More functions are exposed through sysfs
 (/sys/class/fpga_region/regionX/dfl-fme.n/):
@@ -99,6 +101,24 @@ More functions are exposed through sysfs
      one FPGA device may have more than one port, this sysfs interface indicates
      how many ports the FPGA device has.
 
+ Power management (dfl_fme_power hwmon)
+     power management hwmon sysfs interfaces allow user to read power management
+     information (power consumption, thresholds, threshold status, limits, etc.)
+     and and configure power thresholds for different throttling levels.
+
+ Thermal management (dfl_fme_thermal hwmon)
+     thermal management hwmon sysfs interfaces allow user to read thermal
+     management information (current temperature, thresholds, threshold status,
+     etc.).
+
+ Global error reporting management (errors/)
+     error reporting sysfs interfaces allow user to read errors detected by the
+     hardware, and clear the logged errors.
+
+ Performance counters (perf/)
+     performance counters sysfs interfaces allow user to use different counters
+     to get performance data.
+
 
 FIU - PORT
 ==========
@@ -139,6 +159,10 @@ More functions are exposed through sysfs:
  Read Accelerator GUID (afu_id)
      afu_id indicates which PR bitstream is programmed to this AFU.
 
+ Error reporting (errors/)
+     error reporting sysfs interfaces allow user to read port/afu errors
+     detected by the hardware, and clear the logged errors.
+
 
 DFL Framework Overview
 ======================
@@ -212,6 +236,97 @@ the compat_id exposed by the target FPGA region. This check is usually done by
 userspace before calling the reconfiguration IOCTL.
 
 
+FPGA virtualization - PCIe SRIOV
+================================
+This section describes the virtualization support on DFL based FPGA device to
+enable accessing an accelerator from applications running in a virtual machine
+(VM). This section only describes the PCIe based FPGA device with SRIOV support.
+
+Features supported by the particular FPGA device are exposed through Device
+Feature Lists, as illustrated below:
+
+  +-------------------------------+  +-------------+
+  |              PF               |  |     VF      |
+  +-------------------------------+  +-------------+
+      ^            ^         ^              ^
+      |            |         |              |
++-----|------------|---------|--------------|-------+
+|     |            |         |              |       |
+|  +-----+     +-------+ +-------+      +-------+   |
+|  | FME |     | Port0 | | Port1 |      | Port2 |   |
+|  +-----+     +-------+ +-------+      +-------+   |
+|                  ^         ^              ^       |
+|                  |         |              |       |
+|              +-------+ +------+       +-------+   |
+|              |  AFU  | |  AFU |       |  AFU  |   |
+|              +-------+ +------+       +-------+   |
+|                                                   |
+|            DFL based FPGA PCIe Device             |
++---------------------------------------------------+
+
+FME is always accessed through the physical function (PF).
+
+Ports (and related AFUs) are accessed via PF by default, but could be exposed
+through virtual function (VF) devices via PCIe SRIOV. Each VF only contains
+1 Port and 1 AFU for isolation. Users could assign individual VFs (accelerators)
+created via PCIe SRIOV interface, to virtual machines.
+
+The driver organization in virtualization case is illustrated below:
+
+  +-------++------++------+             |
+  | FME   || FME  || FME  |             |
+  | FPGA  || FPGA || FPGA |             |
+  |Manager||Bridge||Region|             |
+  +-------++------++------+             |
+  +-----------------------+  +--------+ |             +--------+
+  |          FME          |  |  AFU   | |             |  AFU   |
+  |         Module        |  | Module | |             | Module |
+  +-----------------------+  +--------+ |             +--------+
+        +-----------------------+       |       +-----------------------+
+        | FPGA Container Device |       |       | FPGA Container Device |
+        |  (FPGA Base Region)   |       |       |  (FPGA Base Region)   |
+        +-----------------------+       |       +-----------------------+
+          +------------------+          |         +------------------+
+          | FPGA PCIE Module |          | Virtual | FPGA PCIE Module |
+          +------------------+   Host   | Machine +------------------+
+ -------------------------------------- | ------------------------------
+           +---------------+            |          +---------------+
+           | PCI PF Device |            |          | PCI VF Device |
+           +---------------+            |          +---------------+
+
+FPGA PCIe device driver is always loaded first once a FPGA PCIe PF or VF device
+is detected. It:
+
+	a) finish enumeration on both FPGA PCIe PF and VF device using common
+	   interfaces from DFL framework.
+	b) supports SRIOV.
+
+The FME device driver plays a management role in this driver architecture, it
+provides ioctls to release Port from PF and assign Port to PF. After release
+a port from PF, then it's safe to expose this port through a VF via PCIe SRIOV
+sysfs interface.
+
+To enable accessing an accelerator from applications running in a VM, the
+respective AFU's port needs to be assigned to a VF using the following steps:
+
+	a) The PF owns all AFU ports by default. Any port that needs to be
+	   reassigned to a VF must first be released through the
+	   DFL_FPGA_FME_PORT_RELEASE ioctl on the FME device.
+
+	b) Once N ports are released from PF, then user can use command below
+	   to enable SRIOV and VFs. Each VF owns only one Port with AFU.
+
+	   echo N > $PCI_DEVICE_PATH/sriov_numvfs
+
+	c) Pass through the VFs to VMs
+
+	d) The AFU under VF is accessible from applications in VM (using the
+	   same driver inside the VF).
+
+Note that an FME can't be assigned to a VF, thus PR and other management
+functions are only available via the PF.
+
+
 Device enumeration
 ==================
 This section introduces how applications enumerate the fpga device from
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 06/18] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (4 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-07 17:33   ` Moritz Fischer
  2019-04-29  8:55 ` [PATCH v2 07/18] fpga: dfl: pci: enable SRIOV support Wu Hao
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Zhang Yi Z, Xu Yilun

In order to support virtualization usage via PCIe SRIOV, this patch
adds two ioctls under FPGA Management Engine (FME) to release and
assign back the port device. In order to safely turn Port from PF
into VF and enable PCIe SRIOV, it requires user to invoke this
PORT_RELEASE ioctl to release port firstly to remove userspace
interfaces, and then configure the PF/VF access register in FME.
After disable SRIOV, it requires user to invoke this PORT_ASSIGN
ioctl to attach the port back to PF.

 Ioctl interfaces:
 * DFL_FPGA_FME_PORT_RELEASE
   Release platform device of given port, it deletes port platform
   device to remove related userspace interfaces on PF, then
   configures PF/VF access mode to VF.

 * DFL_FPGA_FME_PORT_ASSIGN
   Assign platform device of given port back to PF, it configures
   PF/VF access mode to PF, then adds port platform device back to
   re-enable related userspace interfaces on PF.

Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
---
 drivers/fpga/dfl-fme-main.c   |  54 +++++++++++++++++++++
 drivers/fpga/dfl.c            | 107 +++++++++++++++++++++++++++++++++++++-----
 drivers/fpga/dfl.h            |  10 ++++
 include/uapi/linux/fpga-dfl.h |  32 +++++++++++++
 4 files changed, 191 insertions(+), 12 deletions(-)

diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 076d74f..8b2a337 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -16,6 +16,7 @@
 
 #include <linux/kernel.h>
 #include <linux/module.h>
+#include <linux/uaccess.h>
 #include <linux/fpga-dfl.h>
 
 #include "dfl.h"
@@ -105,9 +106,62 @@ static void fme_hdr_uinit(struct platform_device *pdev,
 	sysfs_remove_files(&pdev->dev.kobj, fme_hdr_attrs);
 }
 
+static long fme_hdr_ioctl_release_port(struct dfl_feature_platform_data *pdata,
+				       void __user *arg)
+{
+	struct dfl_fpga_cdev *cdev = pdata->dfl_cdev;
+	struct dfl_fpga_fme_port_release release;
+	unsigned long minsz;
+
+	minsz = offsetofend(struct dfl_fpga_fme_port_release, port_id);
+
+	if (copy_from_user(&release, arg, minsz))
+		return -EFAULT;
+
+	if (release.argsz < minsz || release.flags)
+		return -EINVAL;
+
+	return dfl_fpga_cdev_config_port(cdev, release.port_id, true);
+}
+
+static long fme_hdr_ioctl_assign_port(struct dfl_feature_platform_data *pdata,
+				      void __user *arg)
+{
+	struct dfl_fpga_cdev *cdev = pdata->dfl_cdev;
+	struct dfl_fpga_fme_port_assign assign;
+	unsigned long minsz;
+
+	minsz = offsetofend(struct dfl_fpga_fme_port_assign, port_id);
+
+	if (copy_from_user(&assign, arg, minsz))
+		return -EFAULT;
+
+	if (assign.argsz < minsz || assign.flags)
+		return -EINVAL;
+
+	return dfl_fpga_cdev_config_port(cdev, assign.port_id, false);
+}
+
+static long fme_hdr_ioctl(struct platform_device *pdev,
+			  struct dfl_feature *feature,
+			  unsigned int cmd, unsigned long arg)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
+
+	switch (cmd) {
+	case DFL_FPGA_FME_PORT_RELEASE:
+		return fme_hdr_ioctl_release_port(pdata, (void __user *)arg);
+	case DFL_FPGA_FME_PORT_ASSIGN:
+		return fme_hdr_ioctl_assign_port(pdata, (void __user *)arg);
+	}
+
+	return -ENODEV;
+}
+
 static const struct dfl_feature_ops fme_hdr_ops = {
 	.init = fme_hdr_init,
 	.uinit = fme_hdr_uinit,
+	.ioctl = fme_hdr_ioctl,
 };
 
 static struct dfl_feature_driver fme_feature_drvs[] = {
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index 2c09e50..a6b6d38 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -224,16 +224,20 @@ void dfl_fpga_port_ops_del(struct dfl_fpga_port_ops *ops)
  */
 int dfl_fpga_check_port_id(struct platform_device *pdev, void *pport_id)
 {
-	struct dfl_fpga_port_ops *port_ops = dfl_fpga_port_ops_get(pdev);
-	int port_id;
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
+	struct dfl_fpga_port_ops *port_ops;
+
+	if (pdata->id != FEATURE_DEV_ID_UNUSED)
+		return pdata->id == *(int *)pport_id;
 
+	port_ops = dfl_fpga_port_ops_get(pdev);
 	if (!port_ops || !port_ops->get_id)
 		return 0;
 
-	port_id = port_ops->get_id(pdev);
+	pdata->id = port_ops->get_id(pdev);
 	dfl_fpga_port_ops_put(port_ops);
 
-	return port_id == *(int *)pport_id;
+	return pdata->id == *(int *)pport_id;
 }
 EXPORT_SYMBOL_GPL(dfl_fpga_check_port_id);
 
@@ -462,6 +466,7 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
 	pdata->dev = fdev;
 	pdata->num = binfo->feature_num;
 	pdata->dfl_cdev = binfo->cdev;
+	pdata->id = FEATURE_DEV_ID_UNUSED;
 	mutex_init(&pdata->lock);
 
 	/*
@@ -959,25 +964,27 @@ void dfl_fpga_feature_devs_remove(struct dfl_fpga_cdev *cdev)
 {
 	struct dfl_feature_platform_data *pdata, *ptmp;
 
-	remove_feature_devs(cdev);
-
 	mutex_lock(&cdev->lock);
-	if (cdev->fme_dev) {
-		/* the fme should be unregistered. */
-		WARN_ON(device_is_registered(cdev->fme_dev));
+	if (cdev->fme_dev)
 		put_device(cdev->fme_dev);
-	}
 
 	list_for_each_entry_safe(pdata, ptmp, &cdev->port_dev_list, node) {
 		struct platform_device *port_dev = pdata->dev;
 
-		/* the port should be unregistered. */
-		WARN_ON(device_is_registered(&port_dev->dev));
+		/* remove released ports */
+		if (!device_is_registered(&port_dev->dev)) {
+			dfl_id_free(feature_dev_id_type(port_dev),
+				    port_dev->id);
+			platform_device_put(port_dev);
+		}
+
 		list_del(&pdata->node);
 		put_device(&port_dev->dev);
 	}
 	mutex_unlock(&cdev->lock);
 
+	remove_feature_devs(cdev);
+
 	fpga_region_unregister(cdev->region);
 	devm_kfree(cdev->parent, cdev);
 }
@@ -1015,6 +1022,82 @@ struct platform_device *
 }
 EXPORT_SYMBOL_GPL(__dfl_fpga_cdev_find_port);
 
+static int attach_port_dev(struct dfl_fpga_cdev *cdev, u32 port_id)
+{
+	struct platform_device *port_pdev;
+	int ret = -ENODEV;
+
+	mutex_lock(&cdev->lock);
+	port_pdev = __dfl_fpga_cdev_find_port(cdev, &port_id,
+					      dfl_fpga_check_port_id);
+	if (!port_pdev)
+		goto unlock_exit;
+
+	if (device_is_registered(&port_pdev->dev)) {
+		ret = -EBUSY;
+		goto put_dev_exit;
+	}
+
+	ret = platform_device_add(port_pdev);
+	if (ret)
+		goto put_dev_exit;
+
+	dfl_feature_dev_use_end(dev_get_platdata(&port_pdev->dev));
+	cdev->released_port_num--;
+put_dev_exit:
+	put_device(&port_pdev->dev);
+unlock_exit:
+	mutex_unlock(&cdev->lock);
+	return ret;
+}
+
+static int detach_port_dev(struct dfl_fpga_cdev *cdev, u32 port_id)
+{
+	struct platform_device *port_pdev;
+	int ret = -ENODEV;
+
+	mutex_lock(&cdev->lock);
+	port_pdev = __dfl_fpga_cdev_find_port(cdev, &port_id,
+					      dfl_fpga_check_port_id);
+	if (!port_pdev)
+		goto unlock_exit;
+
+	if (!device_is_registered(&port_pdev->dev)) {
+		ret = -EBUSY;
+		goto put_dev_exit;
+	}
+
+	ret = dfl_feature_dev_use_begin(dev_get_platdata(&port_pdev->dev));
+	if (ret)
+		goto put_dev_exit;
+
+	platform_device_del(port_pdev);
+	cdev->released_port_num++;
+put_dev_exit:
+	put_device(&port_pdev->dev);
+unlock_exit:
+	mutex_unlock(&cdev->lock);
+	return ret;
+}
+
+/**
+ * dfl_fpga_cdev_config_port - configure a port feature dev
+ * @cdev: parent container device.
+ * @port_id: id of the port feature device.
+ * @release: release port or assign port back.
+ *
+ * This function allows user to release port platform device or assign it back.
+ * e.g. to safely turn one port from PF into VF for PCI device SRIOV support,
+ * release port platform device is one necessary step.
+ */
+int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
+			      u32 port_id, bool release)
+{
+	return release ? detach_port_dev(cdev, port_id) :
+			 attach_port_dev(cdev, port_id);
+}
+EXPORT_SYMBOL_GPL(dfl_fpga_cdev_config_port);
+
 static int __init dfl_fpga_init(void)
 {
 	int ret;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 8851c6c..63f39ab 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -183,6 +183,8 @@ struct dfl_feature {
 
 #define DEV_STATUS_IN_USE	0
 
+#define FEATURE_DEV_ID_UNUSED	(-1)
+
 /**
  * struct dfl_feature_platform_data - platform data for feature devices
  *
@@ -191,6 +193,7 @@ struct dfl_feature {
  * @cdev: cdev of feature dev.
  * @dev: ptr to platform device linked with this platform data.
  * @dfl_cdev: ptr to container device.
+ * @id: id used for this feature device.
  * @disable_count: count for port disable.
  * @num: number for sub features.
  * @dev_status: dev status (e.g. DEV_STATUS_IN_USE).
@@ -203,6 +206,7 @@ struct dfl_feature_platform_data {
 	struct cdev cdev;
 	struct platform_device *dev;
 	struct dfl_fpga_cdev *dfl_cdev;
+	int id;
 	unsigned int disable_count;
 	unsigned long dev_status;
 	void *private;
@@ -378,6 +382,7 @@ int dfl_fpga_enum_info_add_dfl(struct dfl_fpga_enum_info *info,
  * @fme_dev: FME feature device under this container device.
  * @lock: mutex lock to protect the port device list.
  * @port_dev_list: list of all port feature devices under this container device.
+ * @released_port_num: released port number under this container device.
  */
 struct dfl_fpga_cdev {
 	struct device *parent;
@@ -385,6 +390,7 @@ struct dfl_fpga_cdev {
 	struct device *fme_dev;
 	struct mutex lock;
 	struct list_head port_dev_list;
+	int released_port_num;
 };
 
 struct dfl_fpga_cdev *
@@ -412,4 +418,8 @@ struct platform_device *
 
 	return pdev;
 }
+
+int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
+			      u32 port_id, bool release);
+
 #endif /* __FPGA_DFL_H */
diff --git a/include/uapi/linux/fpga-dfl.h b/include/uapi/linux/fpga-dfl.h
index 2e324e5..e9a00e0 100644
--- a/include/uapi/linux/fpga-dfl.h
+++ b/include/uapi/linux/fpga-dfl.h
@@ -176,4 +176,36 @@ struct dfl_fpga_fme_port_pr {
 
 #define DFL_FPGA_FME_PORT_PR	_IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 0)
 
+/**
+ * DFL_FPGA_FME_PORT_RELEASE - _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 1,
+ *					struct dfl_fpga_fme_port_release)
+ *
+ * Driver releases the port per Port ID provided by caller.
+ * Return: 0 on success, -errno on failure.
+ */
+struct dfl_fpga_fme_port_release {
+	/* Input */
+	__u32 argsz;		/* Structure length */
+	__u32 flags;		/* Zero for now */
+	__u32 port_id;
+};
+
+#define DFL_FPGA_FME_PORT_RELEASE	_IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 1)
+
+/**
+ * DFL_FPGA_FME_PORT_ASSIGN - _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 2,
+ *					struct dfl_fpga_fme_port_assign)
+ *
+ * Driver assigns the port back per Port ID provided by caller.
+ * Return: 0 on success, -errno on failure.
+ */
+struct dfl_fpga_fme_port_assign {
+	/* Input */
+	__u32 argsz;		/* Structure length */
+	__u32 flags;		/* Zero for now */
+	__u32 port_id;
+};
+
+#define DFL_FPGA_FME_PORT_ASSIGN	_IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 2)
+
 #endif /* _UAPI_LINUX_FPGA_DFL_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 07/18] fpga: dfl: pci: enable SRIOV support.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (5 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 06/18] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-07 17:35   ` Moritz Fischer
  2019-04-29  8:55 ` [PATCH v2 08/18] fpga: dfl: afu: add AFU state related sysfs interfaces Wu Hao
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Zhang Yi Z, Xu Yilun

This patch enables the standard sriov support. It allows user to
enable SRIOV (and VFs), then user could pass through accelerators
(VFs) into virtual machine or use VFs directly in host.

Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
---
 drivers/fpga/dfl-pci.c | 40 ++++++++++++++++++++++++++++++++++++++++
 drivers/fpga/dfl.c     | 41 +++++++++++++++++++++++++++++++++++++++++
 drivers/fpga/dfl.h     |  1 +
 3 files changed, 82 insertions(+)

diff --git a/drivers/fpga/dfl-pci.c b/drivers/fpga/dfl-pci.c
index 66b5720..2fa571b 100644
--- a/drivers/fpga/dfl-pci.c
+++ b/drivers/fpga/dfl-pci.c
@@ -223,8 +223,46 @@ int cci_pci_probe(struct pci_dev *pcidev, const struct pci_device_id *pcidevid)
 	return ret;
 }
 
+static int cci_pci_sriov_configure(struct pci_dev *pcidev, int num_vfs)
+{
+	struct cci_drvdata *drvdata = pci_get_drvdata(pcidev);
+	struct dfl_fpga_cdev *cdev = drvdata->cdev;
+	int ret = 0;
+
+	mutex_lock(&cdev->lock);
+
+	if (!num_vfs) {
+		/*
+		 * disable SRIOV and then put released ports back to default
+		 * PF access mode.
+		 */
+		pci_disable_sriov(pcidev);
+
+		__dfl_fpga_cdev_config_port_vf(cdev, false);
+
+	} else if (cdev->released_port_num == num_vfs) {
+		/*
+		 * only enable SRIOV if cdev has matched released ports, put
+		 * released ports into VF access mode firstly.
+		 */
+		__dfl_fpga_cdev_config_port_vf(cdev, true);
+
+		ret = pci_enable_sriov(pcidev, num_vfs);
+		if (ret)
+			__dfl_fpga_cdev_config_port_vf(cdev, false);
+	} else {
+		ret = -EINVAL;
+	}
+
+	mutex_unlock(&cdev->lock);
+	return ret;
+}
+
 static void cci_pci_remove(struct pci_dev *pcidev)
 {
+	if (dev_is_pf(&pcidev->dev))
+		cci_pci_sriov_configure(pcidev, 0);
+
 	cci_remove_feature_devs(pcidev);
 	pci_disable_pcie_error_reporting(pcidev);
 }
@@ -234,6 +272,7 @@ static void cci_pci_remove(struct pci_dev *pcidev)
 	.id_table = cci_pcie_id_tbl,
 	.probe = cci_pci_probe,
 	.remove = cci_pci_remove,
+	.sriov_configure = cci_pci_sriov_configure,
 };
 
 module_pci_driver(cci_pci_driver);
@@ -241,3 +280,4 @@ static void cci_pci_remove(struct pci_dev *pcidev)
 MODULE_DESCRIPTION("FPGA DFL PCIe Device Driver");
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
+MODULE_VERSION(DRV_VERSION);
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index a6b6d38..c5aa287 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -1098,6 +1098,47 @@ int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
 }
 EXPORT_SYMBOL_GPL(dfl_fpga_cdev_config_port);
 
+static void config_port_vf(struct device *fme_dev, int port_id, bool is_vf)
+{
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(fme_dev, FME_FEATURE_ID_HEADER);
+
+	v = readq(base + FME_HDR_PORT_OFST(port_id));
+
+	v &= ~FME_PORT_OFST_ACC_CTRL;
+	v |= FIELD_PREP(FME_PORT_OFST_ACC_CTRL,
+			is_vf ? FME_PORT_OFST_ACC_VF : FME_PORT_OFST_ACC_PF);
+
+	writeq(v, base + FME_HDR_PORT_OFST(port_id));
+}
+
+/**
+ * __dfl_fpga_cdev_config_port_vf - configure port to VF access mode
+ *
+ * @cdev: parent container device.
+ * @if_vf: true for VF access mode, and false for PF access mode
+ *
+ * Return: 0 on success, negative error code otherwise.
+ *
+ * This function is needed in sriov configuration routine. It could be used to
+ * configures the released ports access mode to VF or PF.
+ * The caller needs to hold lock for protection.
+ */
+void __dfl_fpga_cdev_config_port_vf(struct dfl_fpga_cdev *cdev, bool is_vf)
+{
+	struct dfl_feature_platform_data *pdata;
+
+	list_for_each_entry(pdata, &cdev->port_dev_list, node) {
+		if (device_is_registered(&pdata->dev->dev))
+			continue;
+
+		config_port_vf(cdev->fme_dev, pdata->id, is_vf);
+	}
+}
+EXPORT_SYMBOL_GPL(__dfl_fpga_cdev_config_port_vf);
+
 static int __init dfl_fpga_init(void)
 {
 	int ret;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 63f39ab..1350e8e 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -421,5 +421,6 @@ struct platform_device *
 
 int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
 			      u32 port_id, bool release);
+void __dfl_fpga_cdev_config_port_vf(struct dfl_fpga_cdev *cdev, bool is_vf);
 
 #endif /* __FPGA_DFL_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 08/18] fpga: dfl: afu: add AFU state related sysfs interfaces
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (6 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 07/18] fpga: dfl: pci: enable SRIOV support Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 09/18] fpga: dfl: afu: add userclock " Wu Hao
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Ananda Ravuri, Xu Yilun

This patch introduces more sysfs interfaces for Accelerated
Function Unit (AFU). These interfaces allow users to read
current AFU Power State (APx), read / clear AFU Power (APx)
events which are sticky to identify transient APx state,
and manage AFU's LTR (latency tolerance reporting).

Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
---
 Documentation/ABI/testing/sysfs-platform-dfl-port |  30 +++++
 drivers/fpga/dfl-afu-main.c                       | 144 ++++++++++++++++++++++
 drivers/fpga/dfl.h                                |  11 ++
 3 files changed, 185 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-port b/Documentation/ABI/testing/sysfs-platform-dfl-port
index 6a92dda..d7122a4 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-port
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-port
@@ -14,3 +14,33 @@ Description:	Read-only. User can program different PR bitstreams to FPGA
 		Accelerator Function Unit (AFU) for different functions. It
 		returns uuid which could be used to identify which PR bitstream
 		is programmed in this AFU.
+
+What:		/sys/bus/platform/devices/dfl-port.0/power_state
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It reports the APx (AFU Power) state, different APx
+		means different throttling level. When reading this file, it
+		returns "0" - Normal / "1" - AP1 / "2" - AP2 / "6" - AP6.
+
+What:		/sys/bus/platform/devices/dfl-port.0/ap1_event
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-write. Read or set 1 to clear AP1 (AFU Power State 1)
+		event. It's used to indicate transient AP1 state.
+
+What:		/sys/bus/platform/devices/dfl-port.0/ap2_event
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-write. Read or set 1 to clear AP2 (AFU Power State 2)
+		event. It's used to indicate transient AP2 state.
+
+What:		/sys/bus/platform/devices/dfl-port.0/ltr
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-write. Read and set AFU latency tolerance reporting value.
+		Set ltr to 1 if the AFU can tolerate latency >= 40us or set it
+		to 0 if it is latency sensitive.
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 02baa6a..2ffec06 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -21,6 +21,8 @@
 
 #include "dfl-afu.h"
 
+#define DRV_VERSION	"0.8"
+
 /**
  * port_enable - enable a port
  * @pdev: port platform device.
@@ -141,8 +143,149 @@ static int port_get_id(struct platform_device *pdev)
 }
 static DEVICE_ATTR_RO(id);
 
+static ssize_t
+ltr_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + PORT_HDR_CTRL);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%x\n",
+			 (u8)FIELD_GET(PORT_CTRL_LATENCY, v));
+}
+
+static ssize_t
+ltr_store(struct device *dev, struct device_attribute *attr,
+	  const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u8 ltr;
+	u64 v;
+
+	if (kstrtou8(buf, 0, &ltr) || ltr > 1)
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + PORT_HDR_CTRL);
+	v &= ~PORT_CTRL_LATENCY;
+	v |= FIELD_PREP(PORT_CTRL_LATENCY, ltr);
+	writeq(v, base + PORT_HDR_CTRL);
+	mutex_unlock(&pdata->lock);
+
+	return count;
+}
+static DEVICE_ATTR_RW(ltr);
+
+static ssize_t
+ap1_event_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + PORT_HDR_STS);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%x\n",
+			 (u8)FIELD_GET(PORT_STS_AP1_EVT, v));
+}
+
+static ssize_t
+ap1_event_store(struct device *dev, struct device_attribute *attr,
+		const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u8 ap1_event;
+
+	if (kstrtou8(buf, 0, &ap1_event) || ap1_event != 1)
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	writeq(PORT_STS_AP1_EVT, base + PORT_HDR_STS);
+	mutex_unlock(&pdata->lock);
+
+	return count;
+}
+static DEVICE_ATTR_RW(ap1_event);
+
+static ssize_t
+ap2_event_show(struct device *dev, struct device_attribute *attr,
+	       char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + PORT_HDR_STS);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "%x\n",
+			 (u8)FIELD_GET(PORT_STS_AP2_EVT, v));
+}
+
+static ssize_t
+ap2_event_store(struct device *dev, struct device_attribute *attr,
+		const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u8 ap2_event;
+
+	if (kstrtou8(buf, 0, &ap2_event) || ap2_event != 1)
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	writeq(PORT_STS_AP2_EVT, base + PORT_HDR_STS);
+	mutex_unlock(&pdata->lock);
+
+	return count;
+}
+static DEVICE_ATTR_RW(ap2_event);
+
+static ssize_t
+power_state_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + PORT_HDR_STS);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%x\n",
+			 (u8)FIELD_GET(PORT_STS_PWR_STATE, v));
+}
+static DEVICE_ATTR_RO(power_state);
+
 static const struct attribute *port_hdr_attrs[] = {
 	&dev_attr_id.attr,
+	&dev_attr_ltr.attr,
+	&dev_attr_ap1_event.attr,
+	&dev_attr_ap2_event.attr,
+	&dev_attr_power_state.attr,
 	NULL,
 };
 
@@ -634,3 +777,4 @@ static void __exit afu_exit(void)
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS("platform:dfl-port");
+MODULE_VERSION(DRV_VERSION);
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 1350e8e..1525098 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -119,6 +119,7 @@
 #define PORT_HDR_NEXT_AFU	NEXT_AFU
 #define PORT_HDR_CAP		0x30
 #define PORT_HDR_CTRL		0x38
+#define PORT_HDR_STS		0x40
 
 /* Port Capability Register Bitfield */
 #define PORT_CAP_PORT_NUM	GENMASK_ULL(1, 0)	/* ID of this port */
@@ -130,6 +131,16 @@
 /* Latency tolerance reporting. '1' >= 40us, '0' < 40us.*/
 #define PORT_CTRL_LATENCY	BIT_ULL(2)
 #define PORT_CTRL_SFTRST_ACK	BIT_ULL(4)		/* HW ack for reset */
+
+/* Port Status Register Bitfield */
+#define PORT_STS_AP2_EVT	BIT_ULL(13)		/* AP2 event detected */
+#define PORT_STS_AP1_EVT	BIT_ULL(12)		/* AP1 event detected */
+#define PORT_STS_PWR_STATE	GENMASK_ULL(11, 8)	/* AFU power states */
+#define PORT_STS_PWR_STATE_NORM 0
+#define PORT_STS_PWR_STATE_AP1	1			/* 50% throttling */
+#define PORT_STS_PWR_STATE_AP2	2			/* 90% throttling */
+#define PORT_STS_PWR_STATE_AP6	6			/* 100% throttling */
+
 /**
  * struct dfl_fpga_port_ops - port ops
  *
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 09/18] fpga: dfl: afu: add userclock sysfs interfaces.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (7 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 08/18] fpga: dfl: afu: add AFU state related sysfs interfaces Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 10/18] fpga: dfl: add id_table for dfl private feature driver Wu Hao
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Ananda Ravuri, Russ Weight, Xu Yilun

This patch introduces userclock sysfs interfaces for AFU, user
could use these interfaces for clock setting to AFU.

Please note that, this is only working for port header feature
with revision 0, for later revisions, userclock setting is moved
to a separated private feature, so one revision sysfs interface
is exposed to userspace application for this purpose too.

Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
---
 Documentation/ABI/testing/sysfs-platform-dfl-port |  35 +++++++
 drivers/fpga/dfl-afu-main.c                       | 114 +++++++++++++++++++++-
 drivers/fpga/dfl.h                                |   4 +
 3 files changed, 152 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-port b/Documentation/ABI/testing/sysfs-platform-dfl-port
index d7122a4..71f4892 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-port
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-port
@@ -44,3 +44,38 @@ Contact:	Wu Hao <hao.wu@intel.com>
 Description:	Read-write. Read and set AFU latency tolerance reporting value.
 		Set ltr to 1 if the AFU can tolerate latency >= 40us or set it
 		to 0 if it is latency sensitive.
+
+What:		/sys/bus/platform/devices/dfl-port.0/revision
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the revision of port header
+		feature.
+
+What:		/sys/bus/platform/devices/dfl-port.0/userclk_freqcmd
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Write-only. User writes command to this interface to set
+		userclock to AFU.
+
+What:		/sys/bus/platform/devices/dfl-port.0/userclk_freqsts
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the status of issued command
+		to userclck_freqcmd.
+
+What:		/sys/bus/platform/devices/dfl-port.0/userclk_freqcntrcmd
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Write-only. User writes command to this interface to set
+		userclock counter.
+
+What:		/sys/bus/platform/devices/dfl-port.0/userclk_freqcntrsts
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the status of issued command
+		to userclck_freqcntrcmd.
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 2ffec06..82fd80a 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -144,6 +144,17 @@ static int port_get_id(struct platform_device *pdev)
 static DEVICE_ATTR_RO(id);
 
 static ssize_t
+revision_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	return scnprintf(buf, PAGE_SIZE, "%x\n", dfl_feature_revision(base));
+}
+static DEVICE_ATTR_RO(revision);
+
+static ssize_t
 ltr_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
@@ -282,6 +293,7 @@ static int port_get_id(struct platform_device *pdev)
 
 static const struct attribute *port_hdr_attrs[] = {
 	&dev_attr_id.attr,
+	&dev_attr_revision.attr,
 	&dev_attr_ltr.attr,
 	&dev_attr_ap1_event.attr,
 	&dev_attr_ap2_event.attr,
@@ -289,14 +301,113 @@ static int port_get_id(struct platform_device *pdev)
 	NULL,
 };
 
+static ssize_t
+userclk_freqcmd_store(struct device *dev, struct device_attribute *attr,
+		      const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	u64 userclk_freq_cmd;
+	void __iomem *base;
+
+	if (kstrtou64(buf, 0, &userclk_freq_cmd))
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	writeq(userclk_freq_cmd, base + PORT_HDR_USRCLK_CMD0);
+	mutex_unlock(&pdata->lock);
+
+	return count;
+}
+static DEVICE_ATTR_WO(userclk_freqcmd);
+
+static ssize_t
+userclk_freqcntrcmd_store(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	u64 userclk_freqcntr_cmd;
+	void __iomem *base;
+
+	if (kstrtou64(buf, 0, &userclk_freqcntr_cmd))
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	mutex_lock(&pdata->lock);
+	writeq(userclk_freqcntr_cmd, base + PORT_HDR_USRCLK_CMD1);
+	mutex_unlock(&pdata->lock);
+
+	return count;
+}
+static DEVICE_ATTR_WO(userclk_freqcntrcmd);
+
+static ssize_t
+userclk_freqsts_show(struct device *dev, struct device_attribute *attr,
+		     char *buf)
+{
+	u64 userclk_freqsts;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	userclk_freqsts = readq(base + PORT_HDR_USRCLK_STS0);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)userclk_freqsts);
+}
+static DEVICE_ATTR_RO(userclk_freqsts);
+
+static ssize_t
+userclk_freqcntrsts_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	u64 userclk_freqcntrsts;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	userclk_freqcntrsts = readq(base + PORT_HDR_USRCLK_STS1);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)userclk_freqcntrsts);
+}
+static DEVICE_ATTR_RO(userclk_freqcntrsts);
+
+static const struct attribute *port_hdr_userclk_attrs[] = {
+	&dev_attr_userclk_freqcmd.attr,
+	&dev_attr_userclk_freqcntrcmd.attr,
+	&dev_attr_userclk_freqsts.attr,
+	&dev_attr_userclk_freqcntrsts.attr,
+	NULL,
+};
+
 static int port_hdr_init(struct platform_device *pdev,
 			 struct dfl_feature *feature)
 {
+	int ret;
+
 	dev_dbg(&pdev->dev, "PORT HDR Init.\n");
 
 	port_reset(pdev);
 
-	return sysfs_create_files(&pdev->dev.kobj, port_hdr_attrs);
+	ret = sysfs_create_files(&pdev->dev.kobj, port_hdr_attrs);
+	if (ret)
+		return ret;
+
+	/*
+	 * if revision > 0, the userclock will be moved from port hdr register
+	 * region to a separated private feature.
+	 */
+	if (dfl_feature_revision(feature->ioaddr) > 0)
+		return 0;
+
+	ret = sysfs_create_files(&pdev->dev.kobj, port_hdr_userclk_attrs);
+	if (ret)
+		sysfs_remove_files(&pdev->dev.kobj, port_hdr_attrs);
+
+	return ret;
 }
 
 static void port_hdr_uinit(struct platform_device *pdev,
@@ -304,6 +415,7 @@ static void port_hdr_uinit(struct platform_device *pdev,
 {
 	dev_dbg(&pdev->dev, "PORT HDR UInit.\n");
 
+	sysfs_remove_files(&pdev->dev.kobj, port_hdr_userclk_attrs);
 	sysfs_remove_files(&pdev->dev.kobj, port_hdr_attrs);
 }
 
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 1525098..3c5dc3a 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -120,6 +120,10 @@
 #define PORT_HDR_CAP		0x30
 #define PORT_HDR_CTRL		0x38
 #define PORT_HDR_STS		0x40
+#define PORT_HDR_USRCLK_CMD0	0x50
+#define PORT_HDR_USRCLK_CMD1	0x58
+#define PORT_HDR_USRCLK_STS0	0x60
+#define PORT_HDR_USRCLK_STS1	0x68
 
 /* Port Capability Register Bitfield */
 #define PORT_CAP_PORT_NUM	GENMASK_ULL(1, 0)	/* ID of this port */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 10/18] fpga: dfl: add id_table for dfl private feature driver
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (8 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 09/18] fpga: dfl: afu: add userclock " Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 11/18] fpga: dfl: afu: export __port_enable/disable function Wu Hao
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

This patch adds id_table for each dfl private feature driver,
it allows to reuse same private feature driver to match and support
multiple dfl private features.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
---
 drivers/fpga/dfl-afu-main.c | 14 ++++++++++++--
 drivers/fpga/dfl-fme-main.c | 11 ++++++++---
 drivers/fpga/dfl-fme-pr.c   |  7 ++++++-
 drivers/fpga/dfl-fme.h      |  3 ++-
 drivers/fpga/dfl.c          | 21 +++++++++++++++++++--
 drivers/fpga/dfl.h          | 21 +++++++++++++++------
 6 files changed, 62 insertions(+), 15 deletions(-)

diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 82fd80a..2916876 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -440,6 +440,11 @@ static void port_hdr_uinit(struct platform_device *pdev,
 	return ret;
 }
 
+static const struct dfl_feature_id port_hdr_id_table[] = {
+	{.id = PORT_FEATURE_ID_HEADER,},
+	{0,}
+};
+
 static const struct dfl_feature_ops port_hdr_ops = {
 	.init = port_hdr_init,
 	.uinit = port_hdr_uinit,
@@ -500,6 +505,11 @@ static void port_afu_uinit(struct platform_device *pdev,
 	sysfs_remove_files(&pdev->dev.kobj, port_afu_attrs);
 }
 
+static const struct dfl_feature_id port_afu_id_table[] = {
+	{.id = PORT_FEATURE_ID_AFU,},
+	{0,}
+};
+
 static const struct dfl_feature_ops port_afu_ops = {
 	.init = port_afu_init,
 	.uinit = port_afu_uinit,
@@ -507,11 +517,11 @@ static void port_afu_uinit(struct platform_device *pdev,
 
 static struct dfl_feature_driver port_feature_drvs[] = {
 	{
-		.id = PORT_FEATURE_ID_HEADER,
+		.id_table = port_hdr_id_table,
 		.ops = &port_hdr_ops,
 	},
 	{
-		.id = PORT_FEATURE_ID_AFU,
+		.id_table = port_afu_id_table,
 		.ops = &port_afu_ops,
 	},
 	{
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 8b2a337..38c6342 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -158,6 +158,11 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
 	return -ENODEV;
 }
 
+static const struct dfl_feature_id fme_hdr_id_table[] = {
+	{.id = FME_FEATURE_ID_HEADER,},
+	{0,}
+};
+
 static const struct dfl_feature_ops fme_hdr_ops = {
 	.init = fme_hdr_init,
 	.uinit = fme_hdr_uinit,
@@ -166,12 +171,12 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
 
 static struct dfl_feature_driver fme_feature_drvs[] = {
 	{
-		.id = FME_FEATURE_ID_HEADER,
+		.id_table = fme_hdr_id_table,
 		.ops = &fme_hdr_ops,
 	},
 	{
-		.id = FME_FEATURE_ID_PR_MGMT,
-		.ops = &pr_mgmt_ops,
+		.id_table = fme_pr_mgmt_id_table,
+		.ops = &fme_pr_mgmt_ops,
 	},
 	{
 		.ops = NULL,
diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
index cd94ba8..52f1745 100644
--- a/drivers/fpga/dfl-fme-pr.c
+++ b/drivers/fpga/dfl-fme-pr.c
@@ -483,7 +483,12 @@ static long fme_pr_ioctl(struct platform_device *pdev,
 	return ret;
 }
 
-const struct dfl_feature_ops pr_mgmt_ops = {
+const struct dfl_feature_id fme_pr_mgmt_id_table[] = {
+	{.id = FME_FEATURE_ID_PR_MGMT,},
+	{0}
+};
+
+const struct dfl_feature_ops fme_pr_mgmt_ops = {
 	.init = pr_mgmt_init,
 	.uinit = pr_mgmt_uinit,
 	.ioctl = fme_pr_ioctl,
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index de20755..7a021c4 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -35,6 +35,7 @@ struct dfl_fme {
 	struct dfl_feature_platform_data *pdata;
 };
 
-extern const struct dfl_feature_ops pr_mgmt_ops;
+extern const struct dfl_feature_ops fme_pr_mgmt_ops;
+extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
 
 #endif /* __DFL_FME_H */
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index c5aa287..65f91ef 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -14,6 +14,8 @@
 
 #include "dfl.h"
 
+#define DRV_VERSION	"0.8"
+
 static DEFINE_MUTEX(dfl_id_mutex);
 
 /*
@@ -274,6 +276,21 @@ static int dfl_feature_instance_init(struct platform_device *pdev,
 	return ret;
 }
 
+static bool dfl_feature_drv_match(struct dfl_feature *feature,
+				  struct dfl_feature_driver *driver)
+{
+	const struct dfl_feature_id *ids = driver->id_table;
+
+	if (ids) {
+		while (ids->id) {
+			if (ids->id == feature->id)
+				return true;
+			ids++;
+		}
+	}
+	return false;
+}
+
 /**
  * dfl_fpga_dev_feature_init - init for sub features of dfl feature device
  * @pdev: feature device.
@@ -294,8 +311,7 @@ int dfl_fpga_dev_feature_init(struct platform_device *pdev,
 
 	while (drv->ops) {
 		dfl_fpga_dev_for_each_feature(pdata, feature) {
-			/* match feature and drv using id */
-			if (feature->id == drv->id) {
+			if (dfl_feature_drv_match(feature, drv)) {
 				ret = dfl_feature_instance_init(pdev, pdata,
 								feature, drv);
 				if (ret)
@@ -1164,3 +1180,4 @@ static void __exit dfl_fpga_exit(void)
 MODULE_DESCRIPTION("FPGA Device Feature List (DFL) Support");
 MODULE_AUTHOR("Intel Corporation");
 MODULE_LICENSE("GPL v2");
+MODULE_VERSION(DRV_VERSION);
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 3c5dc3a..fbc57f0 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -30,8 +30,8 @@
 /* plus one for fme device */
 #define MAX_DFL_FEATURE_DEV_NUM    (MAX_DFL_FPGA_PORT_NUM + 1)
 
-/* Reserved 0x0 for Header Group Register and 0xff for AFU */
-#define FEATURE_ID_FIU_HEADER		0x0
+/* Reserved 0xfe for Header Group Register and 0xff for AFU */
+#define FEATURE_ID_FIU_HEADER		0xfe
 #define FEATURE_ID_AFU			0xff
 
 #define FME_FEATURE_ID_HEADER		FEATURE_ID_FIU_HEADER
@@ -169,13 +169,22 @@ struct dfl_fpga_port_ops {
 int dfl_fpga_check_port_id(struct platform_device *pdev, void *pport_id);
 
 /**
- * struct dfl_feature_driver - sub feature's driver
+ * struct dfl_feature_id - dfl private feature id
  *
- * @id: sub feature id.
- * @ops: ops of this sub feature.
+ * @id: unique dfl private feature id.
  */
-struct dfl_feature_driver {
+struct dfl_feature_id {
 	u64 id;
+};
+
+/**
+ * struct dfl_feature_driver - dfl private feature driver
+ *
+ * @id_table: id_table for dfl private features supported by this driver.
+ * @ops: ops of this dfl private feature driver.
+ */
+struct dfl_feature_driver {
+	const struct dfl_feature_id *id_table;
 	const struct dfl_feature_ops *ops;
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 11/18] fpga: dfl: afu: export __port_enable/disable function.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (9 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 10/18] fpga: dfl: add id_table for dfl private feature driver Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 12/18] fpga: dfl: afu: add error reporting support Wu Hao
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

As these two functions are used by other private features. e.g.
in error reporting private feature, it requires to check port status
and reset port for error clearing.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
---
 drivers/fpga/dfl-afu-main.c | 25 ++++++++++++++-----------
 drivers/fpga/dfl-afu.h      |  3 +++
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 2916876..e727d9b 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -24,14 +24,16 @@
 #define DRV_VERSION	"0.8"
 
 /**
- * port_enable - enable a port
+ * __port_enable - enable a port
  * @pdev: port platform device.
  *
  * Enable Port by clear the port soft reset bit, which is set by default.
  * The AFU is unable to respond to any MMIO access while in reset.
- * port_enable function should only be used after port_disable function.
+ * __port_enable function should only be used after __port_disable function.
+ *
+ * The caller needs to hold lock for protection.
  */
-static void port_enable(struct platform_device *pdev)
+void __port_enable(struct platform_device *pdev)
 {
 	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
 	void __iomem *base;
@@ -54,13 +56,14 @@ static void port_enable(struct platform_device *pdev)
 #define RST_POLL_TIMEOUT 1000 /* us */
 
 /**
- * port_disable - disable a port
+ * __port_disable - disable a port
  * @pdev: port platform device.
  *
- * Disable Port by setting the port soft reset bit, it puts the port into
- * reset.
+ * Disable Port by setting the port soft reset bit, it puts the port into reset.
+ *
+ * The caller needs to hold lock for protection.
  */
-static int port_disable(struct platform_device *pdev)
+int __port_disable(struct platform_device *pdev)
 {
 	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
 	void __iomem *base;
@@ -106,9 +109,9 @@ static int __port_reset(struct platform_device *pdev)
 {
 	int ret;
 
-	ret = port_disable(pdev);
+	ret = __port_disable(pdev);
 	if (!ret)
-		port_enable(pdev);
+		__port_enable(pdev);
 
 	return ret;
 }
@@ -810,9 +813,9 @@ static int port_enable_set(struct platform_device *pdev, bool enable)
 
 	mutex_lock(&pdata->lock);
 	if (enable)
-		port_enable(pdev);
+		__port_enable(pdev);
 	else
-		ret = port_disable(pdev);
+		ret = __port_disable(pdev);
 	mutex_unlock(&pdata->lock);
 
 	return ret;
diff --git a/drivers/fpga/dfl-afu.h b/drivers/fpga/dfl-afu.h
index 0c7630a..35e60c5 100644
--- a/drivers/fpga/dfl-afu.h
+++ b/drivers/fpga/dfl-afu.h
@@ -79,6 +79,9 @@ struct dfl_afu {
 	struct dfl_feature_platform_data *pdata;
 };
 
+void __port_enable(struct platform_device *pdev);
+int __port_disable(struct platform_device *pdev);
+
 void afu_mmio_region_init(struct dfl_feature_platform_data *pdata);
 int afu_mmio_region_add(struct dfl_feature_platform_data *pdata,
 			u32 region_index, u64 region_size, u64 phys, u32 flags);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 12/18] fpga: dfl: afu: add error reporting support.
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (10 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 11/18] fpga: dfl: afu: export __port_enable/disable function Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-09 14:41   ` Alan Tull
  2019-04-29  8:55 ` [PATCH v2 13/18] fpga: dfl: afu: add STP (SignalTap) support Wu Hao
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

Error reporting is one important private feature, it reports error
detected on port and accelerated function unit (AFU). It introduces
several sysfs interfaces to allow userspace to check and clear
errors detected by hardware.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: add more error code description for error clear sysfs in doc.
    return -EINVAL instead of -EBUSY when input error code doesn't
    match in error clear sysfs.
---
 Documentation/ABI/testing/sysfs-platform-dfl-port |  39 ++++
 drivers/fpga/Makefile                             |   1 +
 drivers/fpga/dfl-afu-error.c                      | 225 ++++++++++++++++++++++
 drivers/fpga/dfl-afu-main.c                       |   4 +
 drivers/fpga/dfl-afu.h                            |   4 +
 5 files changed, 273 insertions(+)
 create mode 100644 drivers/fpga/dfl-afu-error.c

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-port b/Documentation/ABI/testing/sysfs-platform-dfl-port
index 71f4892..8e3eed5 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-port
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-port
@@ -79,3 +79,42 @@ KernelVersion:	5.2
 Contact:	Wu Hao <hao.wu@intel.com>
 Description:	Read-only. Read this file to get the status of issued command
 		to userclck_freqcntrcmd.
+
+What:		/sys/bus/platform/devices/dfl-port.0/errors/revision
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the revision of this error
+		reporting private feature.
+
+What:		/sys/bus/platform/devices/dfl-port.0/errors/errors
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get errors detected on port and
+		Accelerated Function Unit (AFU).
+
+What:		/sys/bus/platform/devices/dfl-port.0/errors/first_error
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the first error detected by
+		hardware.
+
+What:		/sys/bus/platform/devices/dfl-port.0/errors/first_malformed_req
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the first malformed request
+		captured by hardware.
+
+What:		/sys/bus/platform/devices/dfl-port.0/errors/clear
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Write-only. Write error code to this file to clear errors.
+		Write fails with -EINVAL if input parsing fails or input error
+		code doesn't match.
+		Write fails with -EBUSY or -ETIMEDOUT if error can't be cleared
+		as hardware is in low power state (-EBUSY) or not responding
+		(-ETIMEDOUT).
diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
index c0dd4c8..f1f0af7 100644
--- a/drivers/fpga/Makefile
+++ b/drivers/fpga/Makefile
@@ -40,6 +40,7 @@ obj-$(CONFIG_FPGA_DFL_AFU)		+= dfl-afu.o
 
 dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o
 dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
+dfl-afu-objs += dfl-afu-error.o
 
 # Drivers for FPGAs which implement DFL
 obj-$(CONFIG_FPGA_DFL_PCI)		+= dfl-pci.o
diff --git a/drivers/fpga/dfl-afu-error.c b/drivers/fpga/dfl-afu-error.c
new file mode 100644
index 0000000..ed2be1d
--- /dev/null
+++ b/drivers/fpga/dfl-afu-error.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for FPGA Accelerated Function Unit (AFU) Error Reporting
+ *
+ * Copyright 2019 Intel Corporation, Inc.
+ *
+ * Authors:
+ *   Wu Hao <hao.wu@linux.intel.com>
+ *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *   Joseph Grecco <joe.grecco@intel.com>
+ *   Enno Luebbers <enno.luebbers@intel.com>
+ *   Tim Whisonant <tim.whisonant@intel.com>
+ *   Ananda Ravuri <ananda.ravuri@intel.com>
+ *   Mitchel Henry <henry.mitchel@intel.com>
+ */
+
+#include <linux/uaccess.h>
+
+#include "dfl-afu.h"
+
+#define PORT_ERROR_MASK		0x8
+#define PORT_ERROR		0x10
+#define PORT_FIRST_ERROR	0x18
+#define PORT_MALFORMED_REQ0	0x20
+#define PORT_MALFORMED_REQ1	0x28
+
+#define ERROR_MASK		GENMASK_ULL(63, 0)
+
+/* mask or unmask port errors by the error mask register. */
+static void __port_err_mask(struct device *dev, bool mask)
+{
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+	writeq(mask ? ERROR_MASK : 0, base + PORT_ERROR_MASK);
+}
+
+/* clear port errors. */
+static int __port_err_clear(struct device *dev, u64 err)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	void __iomem *base_err, *base_hdr;
+	int ret;
+	u64 v;
+
+	base_err = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+	base_hdr = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_HEADER);
+
+	/*
+	 * clear Port Errors
+	 *
+	 * - Check for AP6 State
+	 * - Halt Port by keeping Port in reset
+	 * - Set PORT Error mask to all 1 to mask errors
+	 * - Clear all errors
+	 * - Set Port mask to all 0 to enable errors
+	 * - All errors start capturing new errors
+	 * - Enable Port by pulling the port out of reset
+	 */
+
+	/* if device is still in AP6 power state, can not clear any error. */
+	v = readq(base_hdr + PORT_HDR_STS);
+	if (FIELD_GET(PORT_STS_PWR_STATE, v) == PORT_STS_PWR_STATE_AP6) {
+		dev_err(dev, "Could not clear errors, device in AP6 state.\n");
+		return -EBUSY;
+	}
+
+	/* Halt Port by keeping Port in reset */
+	ret = __port_disable(pdev);
+	if (ret)
+		return ret;
+
+	/* Mask all errors */
+	__port_err_mask(dev, true);
+
+	/* Clear errors if err input matches with current port errors.*/
+	v = readq(base_err + PORT_ERROR);
+
+	if (v == err) {
+		writeq(v, base_err + PORT_ERROR);
+
+		v = readq(base_err + PORT_FIRST_ERROR);
+		writeq(v, base_err + PORT_FIRST_ERROR);
+	} else {
+		ret = -EINVAL;
+	}
+
+	/* Clear mask */
+	__port_err_mask(dev, false);
+
+	/* Enable the Port by clear the reset */
+	__port_enable(pdev);
+
+	return ret;
+}
+
+static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
+			     char *buf)
+{
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n", dfl_feature_revision(base));
+}
+static DEVICE_ATTR_RO(revision);
+
+static ssize_t errors_show(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 error;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+	mutex_lock(&pdata->lock);
+	error = readq(base + PORT_ERROR);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)error);
+}
+static DEVICE_ATTR_RO(errors);
+
+static ssize_t first_error_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 error;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+	mutex_lock(&pdata->lock);
+	error = readq(base + PORT_FIRST_ERROR);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)error);
+}
+static DEVICE_ATTR_RO(first_error);
+
+static ssize_t first_malformed_req_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	void __iomem *base;
+	u64 req0, req1;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
+
+	mutex_lock(&pdata->lock);
+	req0 = readq(base + PORT_MALFORMED_REQ0);
+	req1 = readq(base + PORT_MALFORMED_REQ1);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%016llx%016llx\n",
+			 (unsigned long long)req1, (unsigned long long)req0);
+}
+static DEVICE_ATTR_RO(first_malformed_req);
+
+static ssize_t clear_store(struct device *dev, struct device_attribute *attr,
+			   const char *buff, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
+	u64 value;
+	int ret;
+
+	if (kstrtou64(buff, 0, &value))
+		return -EINVAL;
+
+	mutex_lock(&pdata->lock);
+	ret = __port_err_clear(dev, value);
+	mutex_unlock(&pdata->lock);
+
+	return ret ? ret : count;
+}
+static DEVICE_ATTR_WO(clear);
+
+static struct attribute *port_err_attrs[] = {
+	&dev_attr_revision.attr,
+	&dev_attr_errors.attr,
+	&dev_attr_first_error.attr,
+	&dev_attr_first_malformed_req.attr,
+	&dev_attr_clear.attr,
+	NULL,
+};
+
+static struct attribute_group port_err_attr_group = {
+	.attrs = port_err_attrs,
+	.name = "errors",
+};
+
+static int port_err_init(struct platform_device *pdev,
+			 struct dfl_feature *feature)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
+
+	dev_dbg(&pdev->dev, "PORT ERR Init.\n");
+
+	mutex_lock(&pdata->lock);
+	__port_err_mask(&pdev->dev, false);
+	mutex_unlock(&pdata->lock);
+
+	return sysfs_create_group(&pdev->dev.kobj, &port_err_attr_group);
+}
+
+static void port_err_uinit(struct platform_device *pdev,
+			   struct dfl_feature *feature)
+{
+	dev_dbg(&pdev->dev, "PORT ERR UInit.\n");
+
+	sysfs_remove_group(&pdev->dev.kobj, &port_err_attr_group);
+}
+
+const struct dfl_feature_id port_err_id_table[] = {
+	{.id = PORT_FEATURE_ID_ERROR,},
+	{0,}
+};
+
+const struct dfl_feature_ops port_err_ops = {
+	.init = port_err_init,
+	.uinit = port_err_uinit,
+};
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index e727d9b..754729e 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -528,6 +528,10 @@ static void port_afu_uinit(struct platform_device *pdev,
 		.ops = &port_afu_ops,
 	},
 	{
+		.id_table = port_err_id_table,
+		.ops = &port_err_ops,
+	},
+	{
 		.ops = NULL,
 	}
 };
diff --git a/drivers/fpga/dfl-afu.h b/drivers/fpga/dfl-afu.h
index 35e60c5..c3182a2 100644
--- a/drivers/fpga/dfl-afu.h
+++ b/drivers/fpga/dfl-afu.h
@@ -100,4 +100,8 @@ int afu_dma_map_region(struct dfl_feature_platform_data *pdata,
 struct dfl_afu_dma_region *
 afu_dma_region_find(struct dfl_feature_platform_data *pdata,
 		    u64 iova, u64 size);
+
+extern const struct dfl_feature_ops port_err_ops;
+extern const struct dfl_feature_id port_err_id_table[];
+
 #endif /* __DFL_AFU_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 13/18] fpga: dfl: afu: add STP (SignalTap) support
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (11 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 12/18] fpga: dfl: afu: add error reporting support Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 14/18] fpga: dfl: fme: add capability sysfs interfaces Wu Hao
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel; +Cc: linux-api, Wu Hao, Xu Yilun

STP (SignalTap) is one of the private features under the port for
debugging. This patch adds private feature driver support for it
to allow userspace applications to mmap related mmio region and
provide STP service.

Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Alan Tull <atull@kernel.org>
---
 drivers/fpga/dfl-afu-main.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 754729e..14970a4 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -518,6 +518,36 @@ static void port_afu_uinit(struct platform_device *pdev,
 	.uinit = port_afu_uinit,
 };
 
+static int port_stp_init(struct platform_device *pdev,
+			 struct dfl_feature *feature)
+{
+	struct resource *res = &pdev->resource[feature->resource_index];
+
+	dev_dbg(&pdev->dev, "PORT STP Init.\n");
+
+	return afu_mmio_region_add(dev_get_platdata(&pdev->dev),
+				   DFL_PORT_REGION_INDEX_STP,
+				   resource_size(res), res->start,
+				   DFL_PORT_REGION_MMAP | DFL_PORT_REGION_READ |
+				   DFL_PORT_REGION_WRITE);
+}
+
+static void port_stp_uinit(struct platform_device *pdev,
+			   struct dfl_feature *feature)
+{
+	dev_dbg(&pdev->dev, "PORT STP UInit.\n");
+}
+
+static const struct dfl_feature_id port_stp_id_table[] = {
+	{.id = PORT_FEATURE_ID_STP,},
+	{0,}
+};
+
+static const struct dfl_feature_ops port_stp_ops = {
+	.init = port_stp_init,
+	.uinit = port_stp_uinit,
+};
+
 static struct dfl_feature_driver port_feature_drvs[] = {
 	{
 		.id_table = port_hdr_id_table,
@@ -532,6 +562,10 @@ static void port_afu_uinit(struct platform_device *pdev,
 		.ops = &port_err_ops,
 	},
 	{
+		.id_table = port_stp_id_table,
+		.ops = &port_stp_ops,
+	},
+	{
 		.ops = NULL,
 	}
 };
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 14/18] fpga: dfl: fme: add capability sysfs interfaces
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (12 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 13/18] fpga: dfl: afu: add STP (SignalTap) support Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-04-29  8:55 ` [PATCH v2 15/18] fpga: dfl: fme: add thermal management support Wu Hao
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Luwei Kang, Xu Yilun

This patch adds 3 read-only sysfs interfaces for FPGA Management Engine
(FME) block for capabilities including cache_size, fabric_version and
socket_id.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Alan Tull <atull@kernel.org>
---
 Documentation/ABI/testing/sysfs-platform-dfl-fme | 23 ++++++++++++
 drivers/fpga/dfl-fme-main.c                      | 48 ++++++++++++++++++++++++
 2 files changed, 71 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index 8fa4feb..d1aa375 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -21,3 +21,26 @@ Contact:	Wu Hao <hao.wu@intel.com>
 Description:	Read-only. It returns Bitstream (static FPGA region) meta
 		data, which includes the synthesis date, seed and other
 		information of this static FPGA region.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/cache_size
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns cache size of this FPGA device.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/fabric_version
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns fabric version of this FPGA device.
+		Userspace applications need this information to select
+		best data channels per different fabric design.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/socket_id
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns socket_id to indicate which socket
+		this FPGA belongs to, only valid for integrated solution.
+		User only needs this information, in case standard numa node
+		can't provide correct information.
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 38c6342..8339ee8 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -75,10 +75,58 @@ static ssize_t bitstream_metadata_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(bitstream_metadata);
 
+static ssize_t cache_size_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+
+	v = readq(base + FME_HDR_CAP);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(FME_CAP_CACHE_SIZE, v));
+}
+static DEVICE_ATTR_RO(cache_size);
+
+static ssize_t fabric_version_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+
+	v = readq(base + FME_HDR_CAP);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(FME_CAP_FABRIC_VERID, v));
+}
+static DEVICE_ATTR_RO(fabric_version);
+
+static ssize_t socket_id_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+
+	v = readq(base + FME_HDR_CAP);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(FME_CAP_SOCKET_ID, v));
+}
+static DEVICE_ATTR_RO(socket_id);
+
 static const struct attribute *fme_hdr_attrs[] = {
 	&dev_attr_ports_num.attr,
 	&dev_attr_bitstream_id.attr,
 	&dev_attr_bitstream_metadata.attr,
+	&dev_attr_cache_size.attr,
+	&dev_attr_fabric_version.attr,
+	&dev_attr_socket_id.attr,
 	NULL,
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 15/18] fpga: dfl: fme: add thermal management support
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (13 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 14/18] fpga: dfl: fme: add capability sysfs interfaces Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-07 18:20   ` Alan Tull
  2019-05-07 18:30   ` Moritz Fischer
  2019-04-29  8:55 ` [PATCH v2 16/18] fpga: dfl: fme: add power " Wu Hao
                   ` (2 subsequent siblings)
  17 siblings, 2 replies; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Luwei Kang, Russ Weight, Xu Yilun

This patch adds support to thermal management private feature for DFL
FPGA Management Engine (FME). This private feature driver registers
a hwmon for thermal/temperature monitoring (hwmon temp1_input).
If hardware automatic throttling is supported by this hardware, then
driver also exposes sysfs interfaces under hwmon for thresholds
(temp1_alarm/ crit/ emergency), threshold status (temp1_alarm_status/
temp1_crit_status) and throttling policy (temp1_alarm_policy).

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: create a dfl_fme_thermal hwmon to expose thermal information.
    move all sysfs interfaces under hwmon
	tempareture       --> hwmon temp1_input
	threshold1        --> hwmon temp1_alarm
	threshold2        --> hwmon temp1_crit
	trip_threshold    --> hwmon temp1_emergency
	threshold1_status --> hwmon temp1_alarm_status
	threshold2_status --> hwmon temp1_crit_status
	threshold1_policy --> hwmon temp1_alarm_policy
---
 Documentation/ABI/testing/sysfs-platform-dfl-fme |  64 +++++++
 drivers/fpga/Kconfig                             |   2 +-
 drivers/fpga/dfl-fme-main.c                      | 212 +++++++++++++++++++++++
 3 files changed, 277 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index d1aa375..dfbd315 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -44,3 +44,67 @@ Description:	Read-only. It returns socket_id to indicate which socket
 		this FPGA belongs to, only valid for integrated solution.
 		User only needs this information, in case standard numa node
 		can't provide correct information.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/name
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read this file to get the name of hwmon device, it
+		supports values:
+		    'dfl_fme_thermal' - thermal hwmon device name
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns FPGA device temperature in millidegrees
+		Celsius.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns hardware threshold1 temperature in
+		millidegrees Celsius. If temperature rises at or above this
+		threshold, hardware starts 50% or 90% throttling (see
+		'temp1_alarm_policy').
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns hardware threshold2 temperature in
+		millidegrees Celsius. If temperature rises at or above this
+		threshold, hardware starts 100% throttling.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_emergency
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns hardware trip threshold temperature in
+		millidegrees Celsius. If temperature rises at or above this
+		threshold, a fatal event will be triggered to board management
+		controller (BMC) to shutdown FPGA.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_status
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns 1 if temperature is currently at or above
+		hardware threshold1 (see 'temp1_alarm'), otherwise 0.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit_status
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns 1 if temperature is currently at or above
+		hardware threshold2 (see 'temp1_crit'), otherwise 0.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_policy
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read this file to get the policy of hardware threshold1
+		(see 'temp1_alarm'). It only supports two values (policies):
+		    0 - AP2 state (90% throttling)
+		    1 - AP1 state (50% throttling)
diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
index c20445b..a6d7588 100644
--- a/drivers/fpga/Kconfig
+++ b/drivers/fpga/Kconfig
@@ -154,7 +154,7 @@ config FPGA_DFL
 
 config FPGA_DFL_FME
 	tristate "FPGA DFL FME Driver"
-	depends on FPGA_DFL
+	depends on FPGA_DFL && HWMON
 	help
 	  The FPGA Management Engine (FME) is a feature device implemented
 	  under Device Feature List (DFL) framework. Select this option to
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 8339ee8..b9a68b8 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -14,6 +14,8 @@
  *   Henry Mitchel <henry.mitchel@intel.com>
  */
 
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/uaccess.h>
@@ -217,6 +219,212 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
 	.ioctl = fme_hdr_ioctl,
 };
 
+#define FME_THERM_THRESHOLD	0x8
+#define TEMP_THRESHOLD1		GENMASK_ULL(6, 0)
+#define TEMP_THRESHOLD1_EN	BIT_ULL(7)
+#define TEMP_THRESHOLD2		GENMASK_ULL(14, 8)
+#define TEMP_THRESHOLD2_EN	BIT_ULL(15)
+#define TRIP_THRESHOLD		GENMASK_ULL(30, 24)
+#define TEMP_THRESHOLD1_STATUS	BIT_ULL(32)		/* threshold1 reached */
+#define TEMP_THRESHOLD2_STATUS	BIT_ULL(33)		/* threshold2 reached */
+/* threshold1 policy: 0 - AP2 (90% throttle) / 1 - AP1 (50% throttle) */
+#define TEMP_THRESHOLD1_POLICY	BIT_ULL(44)
+
+#define FME_THERM_RDSENSOR_FMT1	0x10
+#define FPGA_TEMPERATURE	GENMASK_ULL(6, 0)
+
+#define FME_THERM_CAP		0x20
+#define THERM_NO_THROTTLE	BIT_ULL(0)
+
+#define MD_PRE_DEG
+
+static bool fme_thermal_throttle_support(void __iomem *base)
+{
+	u64 v = readq(base + FME_THERM_CAP);
+
+	return FIELD_GET(THERM_NO_THROTTLE, v) ? false : true;
+}
+
+static umode_t thermal_hwmon_attrs_visible(const void *drvdata,
+					   enum hwmon_sensor_types type,
+					   u32 attr, int channel)
+{
+	const struct dfl_feature *feature = drvdata;
+
+	/* temperature is always supported, and check hardware cap for others */
+	if (attr == hwmon_temp_input)
+		return 0444;
+
+	return fme_thermal_throttle_support(feature->ioaddr) ? 0444 : 0;
+}
+
+static int thermal_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
+			      u32 attr, int channel, long *val)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		v = readq(feature->ioaddr + FME_THERM_RDSENSOR_FMT1);
+		*val = (long)(FIELD_GET(FPGA_TEMPERATURE, v) * 1000);
+		break;
+	case hwmon_temp_alarm:
+		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
+		*val = (long)(FIELD_GET(TEMP_THRESHOLD1, v) * 1000);
+		break;
+	case hwmon_temp_crit:
+		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
+		*val = (long)(FIELD_GET(TEMP_THRESHOLD2, v) * 1000);
+		break;
+	case hwmon_temp_emergency:
+		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
+		*val = (long)(FIELD_GET(TRIP_THRESHOLD, v) * 1000);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static const struct hwmon_ops thermal_hwmon_ops = {
+	.is_visible = thermal_hwmon_attrs_visible,
+	.read = thermal_hwmon_read,
+};
+
+static const u32 thermal_hwmon_temp_config[] = {
+	HWMON_T_INPUT | HWMON_T_ALARM | HWMON_T_CRIT | HWMON_T_EMERGENCY,
+	0
+};
+
+static const struct hwmon_channel_info hwmon_temp_info = {
+	.type = hwmon_temp,
+	.config = thermal_hwmon_temp_config,
+};
+
+static const struct hwmon_channel_info *thermal_hwmon_info[] = {
+	&hwmon_temp_info,
+	NULL
+};
+
+static const struct hwmon_chip_info thermal_hwmon_chip_info = {
+	.ops = &thermal_hwmon_ops,
+	.info = thermal_hwmon_info,
+};
+
+static ssize_t temp1_alarm_status_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(TEMP_THRESHOLD1_STATUS, v));
+}
+
+static ssize_t temp1_crit_status_show(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(TEMP_THRESHOLD2_STATUS, v));
+}
+
+static ssize_t temp1_alarm_policy_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(TEMP_THRESHOLD1_POLICY, v));
+}
+
+static DEVICE_ATTR_RO(temp1_alarm_status);
+static DEVICE_ATTR_RO(temp1_crit_status);
+static DEVICE_ATTR_RO(temp1_alarm_policy);
+
+static struct attribute *thermal_extra_attrs[] = {
+	&dev_attr_temp1_alarm_status.attr,
+	&dev_attr_temp1_crit_status.attr,
+	&dev_attr_temp1_alarm_policy.attr,
+	NULL,
+};
+
+static umode_t thermal_extra_attrs_visible(struct kobject *kobj,
+					   struct attribute *attr, int index)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+
+	return fme_thermal_throttle_support(feature->ioaddr) ? attr->mode : 0;
+}
+
+static const struct attribute_group thermal_extra_group = {
+	.attrs		= thermal_extra_attrs,
+	.is_visible	= thermal_extra_attrs_visible,
+};
+__ATTRIBUTE_GROUPS(thermal_extra);
+
+static int fme_thermal_mgmt_init(struct platform_device *pdev,
+				 struct dfl_feature *feature)
+{
+	struct device *hwmon;
+
+	dev_dbg(&pdev->dev, "FME Thermal Management Init.\n");
+
+	/*
+	 * create hwmon to allow userspace monitoring temperature and other
+	 * threshold information.
+	 *
+	 * temp1_alarm     -> hardware threshold 1 -> 50% or 90% throttling
+	 * temp1_crit      -> hardware threshold 2 -> 100% throttling
+	 * temp1_emergency -> hardware trip_threshold to shutdown FPGA
+	 *
+	 * create device specific sysfs interfaces, e.g. read temp1_alarm_policy
+	 * to understand the actual hardware throttling action (50% vs 90%).
+	 *
+	 * If hardware doesn't support automatic throttling per thresholds,
+	 * then all above sysfs interfaces are not visible except temp1_input
+	 * for temperature.
+	 */
+	hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
+						     "dfl_fme_thermal", feature,
+						     &thermal_hwmon_chip_info,
+						     thermal_extra_groups);
+	if (IS_ERR(hwmon)) {
+		dev_err(&pdev->dev, "Fail to register thermal hwmon\n");
+		return PTR_ERR(hwmon);
+	}
+
+	return 0;
+}
+
+static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
+				   struct dfl_feature *feature)
+{
+	dev_dbg(&pdev->dev, "FME Thermal Management UInit.\n");
+}
+
+static const struct dfl_feature_id fme_thermal_mgmt_id_table[] = {
+	{.id = FME_FEATURE_ID_THERMAL_MGMT,},
+	{0,}
+};
+
+static const struct dfl_feature_ops fme_thermal_mgmt_ops = {
+	.init = fme_thermal_mgmt_init,
+	.uinit = fme_thermal_mgmt_uinit,
+};
+
 static struct dfl_feature_driver fme_feature_drvs[] = {
 	{
 		.id_table = fme_hdr_id_table,
@@ -227,6 +435,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
 		.ops = &fme_pr_mgmt_ops,
 	},
 	{
+		.id_table = fme_thermal_mgmt_id_table,
+		.ops = &fme_thermal_mgmt_ops,
+	},
+	{
 		.ops = NULL,
 	},
 };
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 16/18] fpga: dfl: fme: add power management support
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (14 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 15/18] fpga: dfl: fme: add thermal management support Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-07 18:23   ` Alan Tull
  2019-04-29  8:55 ` [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support Wu Hao
  2019-04-29  8:55 ` [PATCH v2 18/18] fpga: dfl: fme: add performance " Wu Hao
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Luwei Kang, Xu Yilun

This patch adds support for power management private feature under
FPGA Management Engine (FME). This private feature driver registers
a hwmon for power (power1_input), thresholds information, e.g.
(power1_cap / crit) and also read-only sysfs interfaces for other
power management information. For configuration, user could write
threshold values via above power1_cap / crit sysfs interface
under hwmon too.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: create a dfl_fme_power hwmon to expose power sysfs interfaces.
    move all sysfs interfaces under hwmon
        consumed          --> hwmon power1_input
        threshold1        --> hwmon power1_cap
        threshold2        --> hwmon power1_crit
        threshold1_status --> hwmon power1_cap_status
        threshold2_status --> hwmon power1_crit_status
        xeon_limit        --> hwmon power1_xeon_limit
        fpga_limit        --> hwmon power1_fpga_limit
        ltr               --> hwmon power1_ltr
---
 Documentation/ABI/testing/sysfs-platform-dfl-fme |  67 ++++++
 drivers/fpga/dfl-fme-main.c                      | 247 +++++++++++++++++++++++
 2 files changed, 314 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index dfbd315..e2ba92d 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -52,6 +52,7 @@ Contact:	Wu Hao <hao.wu@intel.com>
 Description:	Read-Only. Read this file to get the name of hwmon device, it
 		supports values:
 		    'dfl_fme_thermal' - thermal hwmon device name
+		    'dfl_fme_power'   - power hwmon device name
 
 What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
 Date:		April 2019
@@ -108,3 +109,69 @@ Description:	Read-Only. Read this file to get the policy of hardware threshold1
 		(see 'temp1_alarm'). It only supports two values (policies):
 		    0 - AP2 state (90% throttling)
 		    1 - AP1 state (50% throttling)
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_input
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns current FPGA power consumption in uW.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_cap
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file to get current hardware power
+		threshold1 in uW. If power consumption rises at or above
+		this threshold, hardware starts 50% throttling.
+		Write this file to set current hardware power threshold1 in uW.
+		As hardware only accepts values in Watts, so input value will
+		be round down per Watts (< 1 watts part will be discarded).
+		Write fails with -EINVAL if input parsing fails or input isn't
+		in the valid range (0 - 127000000 uW).
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file to get current hardware power
+		threshold2 in uW. If power consumption rises at or above
+		this threshold, hardware starts 90% throttling.
+		Write this file to set current hardware power threshold2 in uW.
+		As hardware only accepts values in Watts, so input value will
+		be round down per Watts (< 1 watts part will be discarded).
+		Write fails with -EINVAL if input parsing fails or input isn't
+		in the valid range (0 - 127000000 uW).
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_cap_status
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns 1 if power consumption is currently at or
+		above hardware threshold1 (see 'power1_cap'), otherwise 0.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit_status
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns 1 if power consumption is currently at or
+		above hardware threshold2 (see 'power1_crit'), otherwise 0.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_xeon_limit
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns power limit for XEON in uW.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_fpga_limit
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. It returns power limit for FPGA in uW.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_ltr
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get current Latency Tolerance
+		Reporting (ltr) value. This ltr impacts the CPU low power
+		state in integrated solution.
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index b9a68b8..7005316 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -425,6 +425,249 @@ static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
 	.uinit = fme_thermal_mgmt_uinit,
 };
 
+#define FME_PWR_STATUS		0x8
+#define FME_LATENCY_TOLERANCE	BIT_ULL(18)
+#define PWR_CONSUMED		GENMASK_ULL(17, 0)
+
+#define FME_PWR_THRESHOLD	0x10
+#define PWR_THRESHOLD1		GENMASK_ULL(6, 0)	/* in Watts */
+#define PWR_THRESHOLD2		GENMASK_ULL(14, 8)	/* in Watts */
+#define PWR_THRESHOLD_MAX	0x7f			/* in Watts */
+#define PWR_THRESHOLD1_STATUS	BIT_ULL(16)
+#define PWR_THRESHOLD2_STATUS	BIT_ULL(17)
+
+#define FME_PWR_XEON_LIMIT	0x18
+#define XEON_PWR_LIMIT		GENMASK_ULL(14, 0)	/* in 0.1 Watts */
+#define XEON_PWR_EN		BIT_ULL(15)
+#define FME_PWR_FPGA_LIMIT	0x20
+#define FPGA_PWR_LIMIT		GENMASK_ULL(14, 0)	/* in 0.1 Watts */
+#define FPGA_PWR_EN		BIT_ULL(15)
+
+#define PWR_THRESHOLD_MAX_IN_UW (PWR_THRESHOLD_MAX * 1000000)
+
+static int power_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
+			    u32 attr, int channel, long *val)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	switch (attr) {
+	case hwmon_power_input:
+		v = readq(feature->ioaddr + FME_PWR_STATUS);
+		*val = (long)(FIELD_GET(PWR_CONSUMED, v) * 1000000);
+		break;
+	case hwmon_power_cap:
+		v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
+		*val = (long)(FIELD_GET(PWR_THRESHOLD1, v) * 1000000);
+		break;
+	case hwmon_power_crit:
+		v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
+		*val = (long)(FIELD_GET(PWR_THRESHOLD2, v) * 1000000);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int power_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
+			     u32 attr, int channel, long val)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	int ret = 0;
+	u64 v;
+
+	if (val < 0 || val > PWR_THRESHOLD_MAX_IN_UW)
+		return -EINVAL;
+
+	val = val / 1000000;
+
+	mutex_lock(&pdata->lock);
+
+	switch (attr) {
+	case hwmon_power_cap:
+		v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
+		v &= ~PWR_THRESHOLD1;
+		v |= FIELD_PREP(PWR_THRESHOLD1, val);
+		writeq(v, feature->ioaddr + FME_PWR_THRESHOLD);
+		break;
+	case hwmon_power_crit:
+		v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
+		v &= ~PWR_THRESHOLD2;
+		v |= FIELD_PREP(PWR_THRESHOLD2, val);
+		writeq(v, feature->ioaddr + FME_PWR_THRESHOLD);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+
+	mutex_unlock(&pdata->lock);
+
+	return ret;
+}
+
+static umode_t power_hwmon_attrs_visible(const void *drvdata,
+					 enum hwmon_sensor_types type,
+					 u32 attr, int channel)
+{
+	switch (attr) {
+	case hwmon_power_input:
+		return 0444;
+	case hwmon_power_cap:
+	case hwmon_power_crit:
+		return 0644;
+	}
+
+	return 0;
+}
+
+static const u32 power_hwmon_config[] = {
+	HWMON_P_INPUT | HWMON_P_CAP | HWMON_P_CRIT,
+	0
+};
+
+static const struct hwmon_channel_info hwmon_pwr_info = {
+	.type = hwmon_power,
+	.config = power_hwmon_config,
+};
+
+static const struct hwmon_channel_info *power_hwmon_info[] = {
+	&hwmon_pwr_info,
+	NULL
+};
+
+static const struct hwmon_ops power_hwmon_ops = {
+	.is_visible = power_hwmon_attrs_visible,
+	.read = power_hwmon_read,
+	.write = power_hwmon_write,
+};
+
+static const struct hwmon_chip_info power_hwmon_chip_info = {
+	.ops = &power_hwmon_ops,
+	.info = power_hwmon_info,
+};
+
+static ssize_t power1_cap_status_show(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(PWR_THRESHOLD1_STATUS, v));
+}
+
+static ssize_t power1_crit_status_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(PWR_THRESHOLD2_STATUS, v));
+}
+
+static ssize_t power1_xeon_limit_show(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u16 xeon_limit = 0;
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_PWR_XEON_LIMIT);
+
+	if (FIELD_GET(XEON_PWR_EN, v))
+		xeon_limit = FIELD_GET(XEON_PWR_LIMIT, v);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n", xeon_limit * 100000);
+}
+
+static ssize_t power1_fpga_limit_show(struct device *dev,
+				      struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u16 fpga_limit = 0;
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_PWR_FPGA_LIMIT);
+
+	if (FIELD_GET(FPGA_PWR_EN, v))
+		fpga_limit = FIELD_GET(FPGA_PWR_LIMIT, v);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n", fpga_limit * 100000);
+}
+
+static ssize_t power1_ltr_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct dfl_feature *feature = dev_get_drvdata(dev);
+	u64 v;
+
+	v = readq(feature->ioaddr + FME_PWR_STATUS);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(FME_LATENCY_TOLERANCE, v));
+}
+
+static DEVICE_ATTR_RO(power1_cap_status);
+static DEVICE_ATTR_RO(power1_crit_status);
+static DEVICE_ATTR_RO(power1_xeon_limit);
+static DEVICE_ATTR_RO(power1_fpga_limit);
+static DEVICE_ATTR_RO(power1_ltr);
+
+static struct attribute *power_extra_attrs[] = {
+	&dev_attr_power1_cap_status.attr,
+	&dev_attr_power1_crit_status.attr,
+	&dev_attr_power1_xeon_limit.attr,
+	&dev_attr_power1_fpga_limit.attr,
+	&dev_attr_power1_ltr.attr,
+	NULL
+};
+
+ATTRIBUTE_GROUPS(power_extra);
+
+static int fme_power_mgmt_init(struct platform_device *pdev,
+			       struct dfl_feature *feature)
+{
+	struct device *hwmon;
+
+	dev_dbg(&pdev->dev, "FME Power Management Init.\n");
+
+	hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
+						     "dfl_fme_power", feature,
+						     &power_hwmon_chip_info,
+						     power_extra_groups);
+	if (IS_ERR(hwmon)) {
+		dev_err(&pdev->dev, "Fail to register power hwmon\n");
+		return PTR_ERR(hwmon);
+	}
+
+	return 0;
+}
+
+static void fme_power_mgmt_uinit(struct platform_device *pdev,
+				 struct dfl_feature *feature)
+{
+	dev_dbg(&pdev->dev, "FME Power Management UInit.\n");
+}
+
+static const struct dfl_feature_id fme_power_mgmt_id_table[] = {
+	{.id = FME_FEATURE_ID_POWER_MGMT,},
+	{0,}
+};
+
+static const struct dfl_feature_ops fme_power_mgmt_ops = {
+	.init = fme_power_mgmt_init,
+	.uinit = fme_power_mgmt_uinit,
+};
+
 static struct dfl_feature_driver fme_feature_drvs[] = {
 	{
 		.id_table = fme_hdr_id_table,
@@ -439,6 +682,10 @@ static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
 		.ops = &fme_thermal_mgmt_ops,
 	},
 	{
+		.id_table = fme_power_mgmt_id_table,
+		.ops = &fme_power_mgmt_ops,
+	},
+	{
 		.ops = NULL,
 	},
 };
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (15 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 16/18] fpga: dfl: fme: add power " Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-09 16:27   ` Alan Tull
  2019-04-29  8:55 ` [PATCH v2 18/18] fpga: dfl: fme: add performance " Wu Hao
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Luwei Kang, Ananda Ravuri, Xu Yilun

This patch adds support for global error reporting for FPGA
Management Engine (FME), it introduces sysfs interfaces to
report different error detected by the hardware, and allow
user to clear errors or inject error for testing purpose.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: fix issues found in sysfs doc.
    fix returned error code issues for writable sysfs interfaces.
    (use -EINVAL if input doesn't match error code)
    reorder the sysfs groups in code.
---
 Documentation/ABI/testing/sysfs-platform-dfl-fme |  75 +++++
 drivers/fpga/Makefile                            |   2 +-
 drivers/fpga/dfl-fme-error.c                     | 385 +++++++++++++++++++++++
 drivers/fpga/dfl-fme-main.c                      |   4 +
 drivers/fpga/dfl-fme.h                           |   2 +
 drivers/fpga/dfl.h                               |   2 +
 6 files changed, 469 insertions(+), 1 deletion(-)
 create mode 100644 drivers/fpga/dfl-fme-error.c

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index e2ba92d..503984b 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -175,3 +175,78 @@ Contact:	Wu Hao <hao.wu@intel.com>
 Description:	Read-only. Read this file to get current Latency Tolerance
 		Reporting (ltr) value. This ltr impacts the CPU low power
 		state in integrated solution.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/revision
+Date:		April 2019
+KernelVersion:	5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the revision of this global
+		error reporting private feature.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/pcie0_errors
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for errors detected on pcie0 link.
+		Write this file to clear errors logged in pcie0_errors. Write
+		fails with -EINVAL if input parsing fails or input error code
+		doesn't match.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/pcie1_errors
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for errors detected on pcie1 link.
+		Write this file to clear errors logged in pcie1_errors. Write
+		fails with -EINVAL if input parsing fails or input error code
+		doesn't match.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/nonfatal_errors
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns non-fatal errors detected.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/catfatal_errors
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. It returns catastrophic and fatal errors detected.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/inject_error
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file to check errors injected. Write this
+		file to inject errors for testing purpose. Write fails with
+		-EINVAL if input parsing fails or input inject error code isn't
+		supported.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/errors
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get errors detected by hardware.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/first_error
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the first error detected by
+		hardware.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/next_error
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-only. Read this file to get the second error detected by
+		hardware.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/errors/fme-errors/clear
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Write-only. Write error code to this file to clear all errors
+		logged in errors, first_error and next_error. Write fails with
+		-EINVAL if input parsing fails or input error code doesn't
+		match.
diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
index f1f0af7..1a9fa3d 100644
--- a/drivers/fpga/Makefile
+++ b/drivers/fpga/Makefile
@@ -38,7 +38,7 @@ obj-$(CONFIG_FPGA_DFL_FME_BRIDGE)	+= dfl-fme-br.o
 obj-$(CONFIG_FPGA_DFL_FME_REGION)	+= dfl-fme-region.o
 obj-$(CONFIG_FPGA_DFL_AFU)		+= dfl-afu.o
 
-dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o
+dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o dfl-fme-error.o
 dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
 dfl-afu-objs += dfl-afu-error.o
 
diff --git a/drivers/fpga/dfl-fme-error.c b/drivers/fpga/dfl-fme-error.c
new file mode 100644
index 0000000..772b53e
--- /dev/null
+++ b/drivers/fpga/dfl-fme-error.c
@@ -0,0 +1,385 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for FPGA Management Engine Error Management
+ *
+ * Copyright 2019 Intel Corporation, Inc.
+ *
+ * Authors:
+ *   Kang Luwei <luwei.kang@intel.com>
+ *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *   Wu Hao <hao.wu@intel.com>
+ *   Joseph Grecco <joe.grecco@intel.com>
+ *   Enno Luebbers <enno.luebbers@intel.com>
+ *   Tim Whisonant <tim.whisonant@intel.com>
+ *   Ananda Ravuri <ananda.ravuri@intel.com>
+ *   Mitchel, Henry <henry.mitchel@intel.com>
+ */
+
+#include <linux/uaccess.h>
+
+#include "dfl.h"
+#include "dfl-fme.h"
+
+#define FME_ERROR_MASK		0x8
+#define FME_ERROR		0x10
+#define MBP_ERROR		BIT_ULL(6)
+#define PCIE0_ERROR_MASK	0x18
+#define PCIE0_ERROR		0x20
+#define PCIE1_ERROR_MASK	0x28
+#define PCIE1_ERROR		0x30
+#define FME_FIRST_ERROR		0x38
+#define FME_NEXT_ERROR		0x40
+#define RAS_NONFAT_ERROR_MASK	0x48
+#define RAS_NONFAT_ERROR	0x50
+#define RAS_CATFAT_ERROR_MASK	0x58
+#define RAS_CATFAT_ERROR	0x60
+#define RAS_ERROR_INJECT	0x68
+#define INJECT_ERROR_MASK	GENMASK_ULL(2, 0)
+
+static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
+			     char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n", dfl_feature_revision(base));
+}
+static DEVICE_ATTR_RO(revision);
+
+static ssize_t pcie0_errors_show(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + PCIE0_ERROR));
+}
+
+static ssize_t pcie0_errors_store(struct device *dev,
+				  struct device_attribute *attr,
+				  const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+	int ret = 0;
+	u64 v, val;
+
+	if (kstrtou64(buf, 0, &val))
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	mutex_lock(&pdata->lock);
+	writeq(GENMASK_ULL(63, 0), base + PCIE0_ERROR_MASK);
+
+	v = readq(base + PCIE0_ERROR);
+	if (val == v)
+		writeq(v, base + PCIE0_ERROR);
+	else
+		ret = -EINVAL;
+
+	writeq(0ULL, base + PCIE0_ERROR_MASK);
+	mutex_unlock(&pdata->lock);
+	return ret ? ret : count;
+}
+static DEVICE_ATTR_RW(pcie0_errors);
+
+static ssize_t pcie1_errors_show(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + PCIE1_ERROR));
+}
+
+static ssize_t pcie1_errors_store(struct device *dev,
+				  struct device_attribute *attr,
+				  const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+	int ret = 0;
+	u64 v, val;
+
+	if (kstrtou64(buf, 0, &val))
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	mutex_lock(&pdata->lock);
+	writeq(GENMASK_ULL(63, 0), base + PCIE1_ERROR_MASK);
+
+	v = readq(base + PCIE1_ERROR);
+	if (val == v)
+		writeq(v, base + PCIE1_ERROR);
+	else
+		ret = -EINVAL;
+
+	writeq(0ULL, base + PCIE1_ERROR_MASK);
+	mutex_unlock(&pdata->lock);
+	return ret ? ret : count;
+}
+static DEVICE_ATTR_RW(pcie1_errors);
+
+static ssize_t nonfatal_errors_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + RAS_NONFAT_ERROR));
+}
+static DEVICE_ATTR_RO(nonfatal_errors);
+
+static ssize_t catfatal_errors_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + RAS_CATFAT_ERROR));
+}
+static DEVICE_ATTR_RO(catfatal_errors);
+
+static ssize_t inject_error_show(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+	u64 v;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	v = readq(base + RAS_ERROR_INJECT);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)FIELD_GET(INJECT_ERROR_MASK, v));
+}
+
+static ssize_t inject_error_store(struct device *dev,
+				  struct device_attribute *attr,
+				  const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+	u8 inject_error;
+	u64 v;
+
+	if (kstrtou8(buf, 0, &inject_error))
+		return -EINVAL;
+
+	if (inject_error & ~INJECT_ERROR_MASK)
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + RAS_ERROR_INJECT);
+	v &= ~INJECT_ERROR_MASK;
+	v |= FIELD_PREP(INJECT_ERROR_MASK, inject_error);
+	writeq(v, base + RAS_ERROR_INJECT);
+	mutex_unlock(&pdata->lock);
+
+	return count;
+}
+static DEVICE_ATTR_RW(inject_error);
+
+static struct attribute *errors_attrs[] = {
+	&dev_attr_revision.attr,
+	&dev_attr_pcie0_errors.attr,
+	&dev_attr_pcie1_errors.attr,
+	&dev_attr_nonfatal_errors.attr,
+	&dev_attr_catfatal_errors.attr,
+	&dev_attr_inject_error.attr,
+	NULL,
+};
+
+static struct attribute_group errors_attr_group = {
+	.attrs	= errors_attrs,
+};
+
+static ssize_t errors_show(struct device *dev,
+			   struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + FME_ERROR));
+}
+static DEVICE_ATTR_RO(errors);
+
+static ssize_t first_error_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + FME_FIRST_ERROR));
+}
+static DEVICE_ATTR_RO(first_error);
+
+static ssize_t next_error_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + FME_NEXT_ERROR));
+}
+static DEVICE_ATTR_RO(next_error);
+
+static ssize_t clear_store(struct device *dev, struct device_attribute *attr,
+			   const char *buf, size_t count)
+{
+	struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
+	struct device *err_dev = dev->parent;
+	void __iomem *base;
+	u64 v, val;
+	int ret = 0;
+
+	if (kstrtou64(buf, 0, &val))
+		return -EINVAL;
+
+	base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
+
+	mutex_lock(&pdata->lock);
+	writeq(GENMASK_ULL(63, 0), base + FME_ERROR_MASK);
+
+	v = readq(base + FME_ERROR);
+	if (val == v) {
+		writeq(v, base + FME_ERROR);
+		v = readq(base + FME_FIRST_ERROR);
+		writeq(v, base + FME_FIRST_ERROR);
+		v = readq(base + FME_NEXT_ERROR);
+		writeq(v, base + FME_NEXT_ERROR);
+	} else {
+		ret = -EINVAL;
+	}
+
+	/* Workaround: disable MBP_ERROR if feature revision is 0 */
+	writeq(dfl_feature_revision(base) ? 0ULL : MBP_ERROR,
+	       base + FME_ERROR_MASK);
+	mutex_unlock(&pdata->lock);
+	return ret ? ret : count;
+}
+static DEVICE_ATTR_WO(clear);
+
+static struct attribute *fme_errors_attrs[] = {
+	&dev_attr_errors.attr,
+	&dev_attr_first_error.attr,
+	&dev_attr_next_error.attr,
+	&dev_attr_clear.attr,
+	NULL,
+};
+
+static struct attribute_group fme_errors_attr_group = {
+	.attrs	= fme_errors_attrs,
+	.name	= "fme-errors",
+};
+
+static const struct attribute_group *error_groups[] = {
+	&fme_errors_attr_group,
+	&errors_attr_group,
+	NULL
+};
+
+static void fme_error_enable(struct dfl_feature *feature)
+{
+	void __iomem *base = feature->ioaddr;
+
+	/* Workaround: disable MBP_ERROR if revision is 0 */
+	writeq(dfl_feature_revision(feature->ioaddr) ? 0ULL : MBP_ERROR,
+	       base + FME_ERROR_MASK);
+	writeq(0ULL, base + PCIE0_ERROR_MASK);
+	writeq(0ULL, base + PCIE1_ERROR_MASK);
+	writeq(0ULL, base + RAS_NONFAT_ERROR_MASK);
+	writeq(0ULL, base + RAS_CATFAT_ERROR_MASK);
+}
+
+static void err_dev_release(struct device *dev)
+{
+	kfree(dev);
+}
+
+static int fme_global_err_init(struct platform_device *pdev,
+			       struct dfl_feature *feature)
+{
+	struct device *dev;
+	int ret = 0;
+
+	dev_dbg(&pdev->dev, "FME Global Error Reporting Init.\n");
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+
+	dev->parent = &pdev->dev;
+	dev->release = err_dev_release;
+	dev_set_name(dev, "errors");
+
+	fme_error_enable(feature);
+
+	ret = device_register(dev);
+	if (ret) {
+		put_device(dev);
+		return ret;
+	}
+
+	ret = sysfs_create_groups(&dev->kobj, error_groups);
+	if (ret) {
+		device_unregister(dev);
+		return ret;
+	}
+
+	feature->priv = dev;
+
+	return ret;
+}
+
+static void fme_global_err_uinit(struct platform_device *pdev,
+				 struct dfl_feature *feature)
+{
+	struct device *dev = feature->priv;
+
+	dev_dbg(&pdev->dev, "FME Global Error Reporting UInit.\n");
+
+	sysfs_remove_groups(&dev->kobj, error_groups);
+	device_unregister(dev);
+}
+
+const struct dfl_feature_id fme_global_err_id_table[] = {
+	{.id = FME_FEATURE_ID_GLOBAL_ERR,},
+	{0,}
+};
+
+const struct dfl_feature_ops fme_global_err_ops = {
+	.init = fme_global_err_init,
+	.uinit = fme_global_err_uinit,
+};
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 7005316..1986b32 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -686,6 +686,10 @@ static void fme_power_mgmt_uinit(struct platform_device *pdev,
 		.ops = &fme_power_mgmt_ops,
 	},
 	{
+		.id_table = fme_global_err_id_table,
+		.ops = &fme_global_err_ops,
+	},
+	{
 		.ops = NULL,
 	},
 };
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index 7a021c4..5fbe3f5 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -37,5 +37,7 @@ struct dfl_fme {
 
 extern const struct dfl_feature_ops fme_pr_mgmt_ops;
 extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
+extern const struct dfl_feature_ops fme_global_err_ops;
+extern const struct dfl_feature_id fme_global_err_id_table[];
 
 #endif /* __DFL_FME_H */
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index fbc57f0..6c32080 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -197,12 +197,14 @@ struct dfl_feature_driver {
  *		    feature dev (platform device)'s reources.
  * @ioaddr: mapped mmio resource address.
  * @ops: ops of this sub feature.
+ * @priv: priv data of this feature.
  */
 struct dfl_feature {
 	u64 id;
 	int resource_index;
 	void __iomem *ioaddr;
 	const struct dfl_feature_ops *ops;
+	void *priv;
 };
 
 #define DEV_STATUS_IN_USE	0
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 18/18] fpga: dfl: fme: add performance reporting support
  2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
                   ` (16 preceding siblings ...)
  2019-04-29  8:55 ` [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support Wu Hao
@ 2019-04-29  8:55 ` Wu Hao
  2019-05-16 17:28   ` Alan Tull
  17 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-04-29  8:55 UTC (permalink / raw)
  To: atull, mdf, linux-fpga, linux-kernel
  Cc: linux-api, Wu Hao, Luwei Kang, Xu Yilun

This patch adds support for performance reporting private feature
for FPGA Management Engine (FME). Actually it supports 4 categories
performance counters, 'clock', 'cache', 'iommu' and 'fabric', user
could read the performance counter via exposed sysfs interfaces.
Please refer to sysfs doc for more details.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
---
v2: improve sysfs doc
---
 Documentation/ABI/testing/sysfs-platform-dfl-fme |  93 +++
 drivers/fpga/Makefile                            |   1 +
 drivers/fpga/dfl-fme-main.c                      |   4 +
 drivers/fpga/dfl-fme-perf.c                      | 950 +++++++++++++++++++++++
 drivers/fpga/dfl-fme.h                           |   2 +
 drivers/fpga/dfl.c                               |   1 +
 drivers/fpga/dfl.h                               |   2 +
 7 files changed, 1053 insertions(+)
 create mode 100644 drivers/fpga/dfl-fme-perf.c

diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
index 503984b..a7f7eb6 100644
--- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
+++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
@@ -250,3 +250,96 @@ Description:	Write-only. Write error code to this file to clear all errors
 		logged in errors, first_error and next_error. Write fails with
 		-EINVAL if input parsing fails or input error code doesn't
 		match.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/clock
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read for Accelerator Function Unit (AFU) clock
+		counter.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/cache/freeze
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for the current status of 'cache'
+		category performance counters, and Write '1' or '0' to freeze
+		or unfreeze 'cache' performance counters.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/cache/<counter>
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read 'cache' category performance counters:
+		read_hit, read_miss, write_hit, write_miss, hold_request,
+		data_write_port_contention, tag_write_port_contention,
+		tx_req_stall, rx_req_stall and rx_eviction.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/iommu/freeze
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for the current status of 'iommu'
+		category performance counters, and Write '1' or '0' to freeze
+		or unfreeze 'iommu' performance counters.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/iommu/<sip_counter>
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read 'iommu' category 'sip' sub category
+		performance counters: iotlb_4k_hit, iotlb_2m_hit,
+		iotlb_1g_hit, slpwc_l3_hit, slpwc_l4_hit, rcc_hit,
+		rcc_miss, iotlb_4k_miss, iotlb_2m_miss, iotlb_1g_miss,
+		slpwc_l3_miss and slpwc_l4_miss.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/iommu/afu0/<counter>
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read 'iommu' category 'afuX' sub category
+		performance counters: read_transaction, write_transaction,
+		devtlb_read_hit, devtlb_write_hit, devtlb_4k_fill,
+		devtlb_2m_fill and devtlb_1g_fill.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/fabric/freeze
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for the current status of 'fabric'
+		category performance counters, and Write '1' or '0' to freeze
+		or unfreeze 'fabric' performance counters.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/fabric/<counter>
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read 'fabric' category performance counters:
+		pcie0_read, pcie0_write, pcie1_read, pcie1_write,
+		upi_read, upi_write and mmio_read.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/fabric/enable
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for current status of device level
+		fabric counters. Write "1" to enable device level fabric
+		counters. Once device level fabric counters are enabled, port
+		level fabric counters will be disabled automatically.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/fabric/port0/<counter>
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Only. Read 'fabric' category "portX" sub category
+		performance counters: pcie0_read, pcie0_write, pcie1_read,
+		pcie1_write, upi_read, upi_write and mmio_read.
+
+What:		/sys/bus/platform/devices/dfl-fme.0/perf/fabric/port0/enable
+Date:		April 2019
+KernelVersion:  5.2
+Contact:	Wu Hao <hao.wu@intel.com>
+Description:	Read-Write. Read this file for current status of port level
+		fabric counters. Write "1" to enable port level fabric counters.
+		Once port level fabric counters are enabled, device level fabric
+		counters will be disabled automatically.
diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
index 1a9fa3d..7df3971 100644
--- a/drivers/fpga/Makefile
+++ b/drivers/fpga/Makefile
@@ -39,6 +39,7 @@ obj-$(CONFIG_FPGA_DFL_FME_REGION)	+= dfl-fme-region.o
 obj-$(CONFIG_FPGA_DFL_AFU)		+= dfl-afu.o
 
 dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o dfl-fme-error.o
+dfl-fme-objs += dfl-fme-perf.o
 dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
 dfl-afu-objs += dfl-afu-error.o
 
diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
index 1986b32..221f4ec 100644
--- a/drivers/fpga/dfl-fme-main.c
+++ b/drivers/fpga/dfl-fme-main.c
@@ -690,6 +690,10 @@ static void fme_power_mgmt_uinit(struct platform_device *pdev,
 		.ops = &fme_global_err_ops,
 	},
 	{
+		.id_table = fme_perf_id_table,
+		.ops = &fme_perf_ops,
+	},
+	{
 		.ops = NULL,
 	},
 };
diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
new file mode 100644
index 0000000..035bb68
--- /dev/null
+++ b/drivers/fpga/dfl-fme-perf.c
@@ -0,0 +1,950 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for FPGA Management Engine (FME) Global Performance Reporting
+ *
+ * Copyright 2019 Intel Corporation, Inc.
+ *
+ * Authors:
+ *   Kang Luwei <luwei.kang@intel.com>
+ *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
+ *   Wu Hao <hao.wu@intel.com>
+ *   Joseph Grecco <joe.grecco@intel.com>
+ *   Enno Luebbers <enno.luebbers@intel.com>
+ *   Tim Whisonant <tim.whisonant@intel.com>
+ *   Ananda Ravuri <ananda.ravuri@intel.com>
+ *   Mitchel, Henry <henry.mitchel@intel.com>
+ */
+
+#include "dfl.h"
+#include "dfl-fme.h"
+
+/*
+ * Performance Counter Registers for Cache.
+ *
+ * Cache Events are listed below as CACHE_EVNT_*.
+ */
+#define CACHE_CTRL			0x8
+#define CACHE_RESET_CNTR		BIT_ULL(0)
+#define CACHE_FREEZE_CNTR		BIT_ULL(8)
+#define CACHE_CTRL_EVNT			GENMASK_ULL(19, 16)
+#define CACHE_EVNT_RD_HIT		0x0
+#define CACHE_EVNT_WR_HIT		0x1
+#define CACHE_EVNT_RD_MISS		0x2
+#define CACHE_EVNT_WR_MISS		0x3
+#define CACHE_EVNT_RSVD			0x4
+#define CACHE_EVNT_HOLD_REQ		0x5
+#define CACHE_EVNT_DATA_WR_PORT_CONTEN	0x6
+#define CACHE_EVNT_TAG_WR_PORT_CONTEN	0x7
+#define CACHE_EVNT_TX_REQ_STALL		0x8
+#define CACHE_EVNT_RX_REQ_STALL		0x9
+#define CACHE_EVNT_EVICTIONS		0xa
+#define CACHE_EVNT_MAX			CACHE_EVNT_EVICTIONS
+#define CACHE_CHANNEL_SEL		BIT_ULL(20)
+#define CACHE_CHANNEL_RD		0
+#define CACHE_CHANNEL_WR		1
+#define CACHE_CHANNEL_MAX		2
+#define CACHE_CNTR0			0x10
+#define CACHE_CNTR1			0x18
+#define CACHE_CNTR_EVNT_CNTR		GENMASK_ULL(47, 0)
+#define CACHE_CNTR_EVNT			GENMASK_ULL(63, 60)
+
+/*
+ * Performance Counter Registers for Fabric.
+ *
+ * Fabric Events are listed below as FAB_EVNT_*
+ */
+#define FAB_CTRL			0x20
+#define FAB_RESET_CNTR			BIT_ULL(0)
+#define FAB_FREEZE_CNTR			BIT_ULL(8)
+#define FAB_CTRL_EVNT			GENMASK_ULL(19, 16)
+#define FAB_EVNT_PCIE0_RD		0x0
+#define FAB_EVNT_PCIE0_WR		0x1
+#define FAB_EVNT_PCIE1_RD		0x2
+#define FAB_EVNT_PCIE1_WR		0x3
+#define FAB_EVNT_UPI_RD			0x4
+#define FAB_EVNT_UPI_WR			0x5
+#define FAB_EVNT_MMIO_RD		0x6
+#define FAB_EVNT_MMIO_WR		0x7
+#define FAB_EVNT_MAX			FAB_EVNT_MMIO_WR
+#define FAB_PORT_ID			GENMASK_ULL(21, 20)
+#define FAB_PORT_FILTER			BIT_ULL(23)
+#define FAB_PORT_FILTER_DISABLE		0
+#define FAB_PORT_FILTER_ENABLE		1
+#define FAB_CNTR			0x28
+#define FAB_CNTR_EVNT_CNTR		GENMASK_ULL(59, 0)
+#define FAB_CNTR_EVNT			GENMASK_ULL(63, 60)
+
+/*
+ * Performance Counter Registers for Clock.
+ *
+ * Clock Counter can't be reset or frozen by SW.
+ */
+#define CLK_CNTR			0x30
+
+/*
+ * Performance Counter Registers for IOMMU / VT-D.
+ *
+ * VT-D Events are listed below as VTD_EVNT_* and VTD_SIP_EVNT_*
+ */
+#define VTD_CTRL			0x38
+#define VTD_RESET_CNTR			BIT_ULL(0)
+#define VTD_FREEZE_CNTR			BIT_ULL(8)
+#define VTD_CTRL_EVNT			GENMASK_ULL(19, 16)
+#define VTD_EVNT_AFU_MEM_RD_TRANS	0x0
+#define VTD_EVNT_AFU_MEM_WR_TRANS	0x1
+#define VTD_EVNT_AFU_DEVTLB_RD_HIT	0x2
+#define VTD_EVNT_AFU_DEVTLB_WR_HIT	0x3
+#define VTD_EVNT_DEVTLB_4K_FILL		0x4
+#define VTD_EVNT_DEVTLB_2M_FILL		0x5
+#define VTD_EVNT_DEVTLB_1G_FILL		0x6
+#define VTD_EVNT_MAX			VTD_EVNT_DEVTLB_1G_FILL
+#define VTD_CNTR			0x40
+#define VTD_CNTR_EVNT			GENMASK_ULL(63, 60)
+#define VTD_CNTR_EVNT_CNTR		GENMASK_ULL(47, 0)
+#define VTD_SIP_CTRL			0x48
+#define VTD_SIP_RESET_CNTR		BIT_ULL(0)
+#define VTD_SIP_FREEZE_CNTR		BIT_ULL(8)
+#define VTD_SIP_CTRL_EVNT		GENMASK_ULL(19, 16)
+#define VTD_SIP_EVNT_IOTLB_4K_HIT	0x0
+#define VTD_SIP_EVNT_IOTLB_2M_HIT	0x1
+#define VTD_SIP_EVNT_IOTLB_1G_HIT	0x2
+#define VTD_SIP_EVNT_SLPWC_L3_HIT	0x3
+#define VTD_SIP_EVNT_SLPWC_L4_HIT	0x4
+#define VTD_SIP_EVNT_RCC_HIT		0x5
+#define VTD_SIP_EVNT_IOTLB_4K_MISS	0x6
+#define VTD_SIP_EVNT_IOTLB_2M_MISS	0x7
+#define VTD_SIP_EVNT_IOTLB_1G_MISS	0x8
+#define VTD_SIP_EVNT_SLPWC_L3_MISS	0x9
+#define VTD_SIP_EVNT_SLPWC_L4_MISS	0xa
+#define VTD_SIP_EVNT_RCC_MISS		0xb
+#define VTD_SIP_EVNT_MAX		VTD_SIP_EVNT_RCC_MISS
+#define VTD_SIP_CNTR			0X50
+#define VTD_SIP_CNTR_EVNT		GENMASK_ULL(63, 60)
+#define VTD_SIP_CNTR_EVNT_CNTR		GENMASK_ULL(47, 0)
+
+#define PERF_OBJ_ROOT_ID		(~0)
+
+#define PERF_TIMEOUT			30
+
+/**
+ * struct perf_object - object of performance counter
+ *
+ * @id: instance id. PERF_OBJ_ROOT_ID indicates it is a parent object which
+ *      counts performance counters for all instances.
+ * @attr_groups: the sysfs files are associated with this object.
+ * @feature: pointer to related private feature.
+ * @node: used to link itself to parent's children list.
+ * @children: used to link its children objects together.
+ * @kobj: generic kobject interface.
+ *
+ * 'node' and 'children' are used to construct parent-children hierarchy.
+ */
+struct perf_object {
+	int id;
+	const struct attribute_group **attr_groups;
+	struct dfl_feature *feature;
+
+	struct list_head node;
+	struct list_head children;
+	struct kobject kobj;
+};
+
+/**
+ * struct perf_obj_attribute - attribute of perf object
+ *
+ * @attr: attribute of this perf object.
+ * @show: show callback for sysfs attribute.
+ * @store: store callback for sysfs attribute.
+ */
+struct perf_obj_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct perf_object *pobj, char *buf);
+	ssize_t (*store)(struct perf_object *pobj,
+			 const char *buf, size_t n);
+};
+
+#define to_perf_obj_attr(_attr)					\
+		container_of(_attr, struct perf_obj_attribute, attr)
+#define to_perf_obj(_kobj)					\
+		container_of(_kobj, struct perf_object, kobj)
+
+#define PERF_OBJ_ATTR(_name, _filename, _mode, _show, _store)	\
+struct perf_obj_attribute perf_obj_attr_##_name =		\
+	__ATTR(_filename, _mode, _show, _store)
+
+#define PERF_OBJ_ATTR_RW(_name)					\
+	struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_RW(_name)
+#define PERF_OBJ_ATTR_RO(_name)					\
+	struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_RO(_name)
+#define PERF_OBJ_ATTR_WO(_name)					\
+	struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_WO(_name)
+
+static ssize_t perf_obj_attr_show(struct kobject *kobj,
+				  struct attribute *__attr, char *buf)
+{
+	struct perf_obj_attribute *attr = to_perf_obj_attr(__attr);
+	struct perf_object *pobj = to_perf_obj(kobj);
+	ssize_t ret = -EIO;
+
+	if (attr->show)
+		ret = attr->show(pobj, buf);
+	return ret;
+}
+
+static ssize_t perf_obj_attr_store(struct kobject *kobj,
+				   struct attribute *__attr,
+				   const char *buf, size_t n)
+{
+	struct perf_obj_attribute *attr = to_perf_obj_attr(__attr);
+	struct perf_object *pobj = to_perf_obj(kobj);
+	ssize_t ret = -EIO;
+
+	if (attr->store)
+		ret = attr->store(pobj, buf, n);
+	return ret;
+}
+
+static const struct sysfs_ops perf_obj_sysfs_ops = {
+	.show = perf_obj_attr_show,
+	.store = perf_obj_attr_store,
+};
+
+static void perf_obj_release(struct kobject *kobj)
+{
+	kfree(to_perf_obj(kobj));
+}
+
+static struct kobj_type perf_obj_ktype = {
+	.sysfs_ops = &perf_obj_sysfs_ops,
+	.release = perf_obj_release,
+};
+
+static struct perf_object *
+create_perf_obj(struct dfl_feature *feature, struct kobject *parent, int id,
+		const struct attribute_group **groups, const char *name)
+{
+	struct perf_object *pobj;
+	int ret;
+
+	pobj = kzalloc(sizeof(*pobj), GFP_KERNEL);
+	if (!pobj)
+		return ERR_PTR(-ENOMEM);
+
+	pobj->id = id;
+	pobj->feature = feature;
+	pobj->attr_groups = groups;
+	INIT_LIST_HEAD(&pobj->node);
+	INIT_LIST_HEAD(&pobj->children);
+
+	if (id != PERF_OBJ_ROOT_ID)
+		ret = kobject_init_and_add(&pobj->kobj, &perf_obj_ktype,
+					   parent, "%s%d", name, id);
+	else
+		ret = kobject_init_and_add(&pobj->kobj, &perf_obj_ktype,
+					   parent, "%s", name);
+	if (ret)
+		goto put_exit;
+
+	if (pobj->attr_groups) {
+		ret = sysfs_create_groups(&pobj->kobj, pobj->attr_groups);
+		if (ret)
+			goto del_exit;
+	}
+
+	return pobj;
+
+del_exit:
+	kobject_del(&pobj->kobj);
+put_exit:
+	kobject_put(&pobj->kobj);
+	return ERR_PTR(ret);
+}
+
+/*
+ * Counter Sysfs Interface for Clock.
+ */
+static ssize_t clock_show(struct perf_object *pobj, char *buf)
+{
+	void __iomem *base = pobj->feature->ioaddr;
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
+			 (unsigned long long)readq(base + CLK_CNTR));
+}
+static PERF_OBJ_ATTR_RO(clock);
+
+static struct attribute *clock_attrs[] = {
+	&perf_obj_attr_clock.attr,
+	NULL,
+};
+
+static struct attribute_group clock_attr_group = {
+	.attrs = clock_attrs,
+};
+
+static const struct attribute_group *perf_dev_attr_groups[] = {
+	&clock_attr_group,
+	NULL,
+};
+
+static void destroy_perf_obj(struct perf_object *pobj)
+{
+	struct perf_object *obj, *obj_tmp;
+
+	list_for_each_entry_safe(obj, obj_tmp, &pobj->children, node)
+		destroy_perf_obj(obj);
+
+	list_del(&pobj->node);
+	if (pobj->attr_groups)
+		sysfs_remove_groups(&pobj->kobj, pobj->attr_groups);
+	kobject_put(&pobj->kobj);
+}
+
+static struct perf_object *create_perf_dev(struct dfl_feature *feature)
+{
+	struct platform_device *pdev = feature->pdev;
+
+	return create_perf_obj(feature, &pdev->dev.kobj, PERF_OBJ_ROOT_ID,
+			       perf_dev_attr_groups, "perf");
+}
+
+/*
+ * Counter Sysfs Interfaces for Cache.
+ */
+static ssize_t cache_freeze_show(struct perf_object *pobj, char *buf)
+{
+	void __iomem *base = pobj->feature->ioaddr;
+	u64 v;
+
+	v = readq(base + CACHE_CTRL);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(CACHE_FREEZE_CNTR, v));
+}
+
+static ssize_t cache_freeze_store(struct perf_object *pobj,
+				  const char *buf, size_t n)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	bool state;
+	u64 v;
+
+	if (strtobool(buf, &state))
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + CACHE_CTRL);
+	v &= ~CACHE_FREEZE_CNTR;
+	v |= FIELD_PREP(CACHE_FREEZE_CNTR, state ? 1 : 0);
+	writeq(v, base + CACHE_CTRL);
+	mutex_unlock(&pdata->lock);
+
+	return n;
+}
+static PERF_OBJ_ATTR(cache_freeze, freeze, 0644,
+		     cache_freeze_show, cache_freeze_store);
+
+static ssize_t read_cache_counter(struct perf_object *pobj, char *buf,
+				  u8 channel, u8 event)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	u64 v, count;
+
+	if (event > CACHE_EVNT_MAX || channel > CACHE_CHANNEL_MAX)
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	/* set channel access type and cache event code. */
+	v = readq(base + CACHE_CTRL);
+	v &= ~(CACHE_CHANNEL_SEL | CACHE_CTRL_EVNT);
+	v |= FIELD_PREP(CACHE_CHANNEL_SEL, channel);
+	v |= FIELD_PREP(CACHE_CTRL_EVNT, event);
+	writeq(v, base + CACHE_CTRL);
+
+	if (readq_poll_timeout(base + CACHE_CNTR0, v,
+			       FIELD_GET(CACHE_CNTR_EVNT, v) == event,
+			       1, PERF_TIMEOUT)) {
+		dev_err(&feature->pdev->dev, "timeout, unmatched cache event type in counter registers.\n");
+		mutex_unlock(&pdata->lock);
+		return -ETIMEDOUT;
+	}
+
+	v = readq(base + CACHE_CNTR0);
+	count = FIELD_GET(CACHE_CNTR_EVNT_CNTR, v);
+	v = readq(base + CACHE_CNTR1);
+	count += FIELD_GET(CACHE_CNTR_EVNT_CNTR, v);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
+}
+
+#define CACHE_SHOW(name, type, event)					\
+static ssize_t name##_show(struct perf_object *pobj, char *buf)		\
+{									\
+	return read_cache_counter(pobj, buf, type, event);		\
+}									\
+static PERF_OBJ_ATTR_RO(name)
+
+CACHE_SHOW(read_hit, CACHE_CHANNEL_RD, CACHE_EVNT_RD_HIT);
+CACHE_SHOW(read_miss, CACHE_CHANNEL_RD, CACHE_EVNT_RD_MISS);
+CACHE_SHOW(write_hit, CACHE_CHANNEL_WR, CACHE_EVNT_WR_HIT);
+CACHE_SHOW(write_miss, CACHE_CHANNEL_WR, CACHE_EVNT_WR_MISS);
+CACHE_SHOW(hold_request, CACHE_CHANNEL_RD, CACHE_EVNT_HOLD_REQ);
+CACHE_SHOW(tx_req_stall, CACHE_CHANNEL_RD, CACHE_EVNT_TX_REQ_STALL);
+CACHE_SHOW(rx_req_stall, CACHE_CHANNEL_RD, CACHE_EVNT_RX_REQ_STALL);
+CACHE_SHOW(rx_eviction, CACHE_CHANNEL_RD, CACHE_EVNT_EVICTIONS);
+CACHE_SHOW(data_write_port_contention, CACHE_CHANNEL_WR,
+	   CACHE_EVNT_DATA_WR_PORT_CONTEN);
+CACHE_SHOW(tag_write_port_contention, CACHE_CHANNEL_WR,
+	   CACHE_EVNT_TAG_WR_PORT_CONTEN);
+
+static struct attribute *cache_attrs[] = {
+	&perf_obj_attr_read_hit.attr,
+	&perf_obj_attr_read_miss.attr,
+	&perf_obj_attr_write_hit.attr,
+	&perf_obj_attr_write_miss.attr,
+	&perf_obj_attr_hold_request.attr,
+	&perf_obj_attr_data_write_port_contention.attr,
+	&perf_obj_attr_tag_write_port_contention.attr,
+	&perf_obj_attr_tx_req_stall.attr,
+	&perf_obj_attr_rx_req_stall.attr,
+	&perf_obj_attr_rx_eviction.attr,
+	&perf_obj_attr_cache_freeze.attr,
+	NULL,
+};
+
+static struct attribute_group cache_attr_group = {
+	.attrs = cache_attrs,
+};
+
+static const struct attribute_group *cache_attr_groups[] = {
+	&cache_attr_group,
+	NULL,
+};
+
+static int create_perf_cache_obj(struct perf_object *perf_dev)
+{
+	struct perf_object *pobj;
+
+	pobj = create_perf_obj(perf_dev->feature, &perf_dev->kobj,
+			       PERF_OBJ_ROOT_ID, cache_attr_groups, "cache");
+	if (IS_ERR(pobj))
+		return PTR_ERR(pobj);
+
+	list_add(&pobj->node, &perf_dev->children);
+
+	return 0;
+}
+
+/*
+ * Counter Sysfs Interfaces for VT-D / IOMMU.
+ */
+static ssize_t vtd_freeze_show(struct perf_object *pobj, char *buf)
+{
+	void __iomem *base = pobj->feature->ioaddr;
+	u64 v;
+
+	v = readq(base + VTD_CTRL);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(VTD_FREEZE_CNTR, v));
+}
+
+static ssize_t vtd_freeze_store(struct perf_object *pobj,
+				const char *buf, size_t n)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	bool state;
+	u64 v;
+
+	if (strtobool(buf, &state))
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + VTD_CTRL);
+	v &= ~VTD_FREEZE_CNTR;
+	v |= FIELD_PREP(VTD_FREEZE_CNTR, state ? 1 : 0);
+	writeq(v, base + VTD_CTRL);
+	mutex_unlock(&pdata->lock);
+
+	return n;
+}
+static PERF_OBJ_ATTR(vtd_freeze, freeze, 0644,
+		     vtd_freeze_show, vtd_freeze_store);
+
+static struct attribute *iommu_top_attrs[] = {
+	&perf_obj_attr_vtd_freeze.attr,
+	NULL,
+};
+
+static struct attribute_group iommu_top_attr_group = {
+	.attrs = iommu_top_attrs,
+};
+
+static ssize_t read_iommu_sip_counter(struct perf_object *pobj,
+				      u8 event, char *buf)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	u64 v, count;
+
+	if (event > VTD_SIP_EVNT_MAX)
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + VTD_SIP_CTRL);
+	v &= ~VTD_SIP_CTRL_EVNT;
+	v |= FIELD_PREP(VTD_SIP_CTRL_EVNT, event);
+	writeq(v, base + VTD_SIP_CTRL);
+
+	if (readq_poll_timeout(base + VTD_SIP_CNTR, v,
+			       FIELD_GET(VTD_SIP_CNTR_EVNT, v) == event,
+			       1, PERF_TIMEOUT)) {
+		dev_err(&feature->pdev->dev, "timeout, unmatched VTd SIP event type in counter registers\n");
+		mutex_unlock(&pdata->lock);
+		return -ETIMEDOUT;
+	}
+
+	v = readq(base + VTD_SIP_CNTR);
+	count = FIELD_GET(VTD_SIP_CNTR_EVNT_CNTR, v);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
+}
+
+#define VTD_SIP_SHOW(name, event)					\
+static ssize_t name##_show(struct perf_object *pobj, char *buf)		\
+{									\
+	return read_iommu_sip_counter(pobj, event, buf);		\
+}									\
+static PERF_OBJ_ATTR_RO(name)
+
+VTD_SIP_SHOW(iotlb_4k_hit, VTD_SIP_EVNT_IOTLB_4K_HIT);
+VTD_SIP_SHOW(iotlb_2m_hit, VTD_SIP_EVNT_IOTLB_2M_HIT);
+VTD_SIP_SHOW(iotlb_1g_hit, VTD_SIP_EVNT_IOTLB_1G_HIT);
+VTD_SIP_SHOW(slpwc_l3_hit, VTD_SIP_EVNT_SLPWC_L3_HIT);
+VTD_SIP_SHOW(slpwc_l4_hit, VTD_SIP_EVNT_SLPWC_L4_HIT);
+VTD_SIP_SHOW(rcc_hit, VTD_SIP_EVNT_RCC_HIT);
+VTD_SIP_SHOW(iotlb_4k_miss, VTD_SIP_EVNT_IOTLB_4K_MISS);
+VTD_SIP_SHOW(iotlb_2m_miss, VTD_SIP_EVNT_IOTLB_2M_MISS);
+VTD_SIP_SHOW(iotlb_1g_miss, VTD_SIP_EVNT_IOTLB_1G_MISS);
+VTD_SIP_SHOW(slpwc_l3_miss, VTD_SIP_EVNT_SLPWC_L3_MISS);
+VTD_SIP_SHOW(slpwc_l4_miss, VTD_SIP_EVNT_SLPWC_L4_MISS);
+VTD_SIP_SHOW(rcc_miss, VTD_SIP_EVNT_RCC_MISS);
+
+static struct attribute *iommu_sip_attrs[] = {
+	&perf_obj_attr_iotlb_4k_hit.attr,
+	&perf_obj_attr_iotlb_2m_hit.attr,
+	&perf_obj_attr_iotlb_1g_hit.attr,
+	&perf_obj_attr_slpwc_l3_hit.attr,
+	&perf_obj_attr_slpwc_l4_hit.attr,
+	&perf_obj_attr_rcc_hit.attr,
+	&perf_obj_attr_iotlb_4k_miss.attr,
+	&perf_obj_attr_iotlb_2m_miss.attr,
+	&perf_obj_attr_iotlb_1g_miss.attr,
+	&perf_obj_attr_slpwc_l3_miss.attr,
+	&perf_obj_attr_slpwc_l4_miss.attr,
+	&perf_obj_attr_rcc_miss.attr,
+	NULL,
+};
+
+static struct attribute_group iommu_sip_attr_group = {
+	.attrs = iommu_sip_attrs,
+};
+
+static const struct attribute_group *iommu_top_attr_groups[] = {
+	&iommu_top_attr_group,
+	&iommu_sip_attr_group,
+	NULL,
+};
+
+static ssize_t read_iommu_counter(struct perf_object *pobj, u8 event, char *buf)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	u64 v, count;
+
+	if (event > VTD_EVNT_MAX)
+		return -EINVAL;
+
+	event += pobj->id;
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + VTD_CTRL);
+	v &= ~VTD_CTRL_EVNT;
+	v |= FIELD_PREP(VTD_CTRL_EVNT, event);
+	writeq(v, base + VTD_CTRL);
+
+	if (readq_poll_timeout(base + VTD_CNTR, v,
+			       FIELD_GET(VTD_CNTR_EVNT, v) == event, 1,
+			       PERF_TIMEOUT)) {
+		dev_err(&feature->pdev->dev, "timeout, unmatched VTd event type in counter registers\n");
+		mutex_unlock(&pdata->lock);
+		return -ETIMEDOUT;
+	}
+
+	v = readq(base + VTD_CNTR);
+	count = FIELD_GET(VTD_CNTR_EVNT_CNTR, v);
+	mutex_unlock(&pdata->lock);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
+}
+
+#define VTD_SHOW(name, base_event)					\
+static ssize_t name##_show(struct perf_object *pobj, char *buf)		\
+{									\
+	return read_iommu_counter(pobj, base_event, buf);		\
+}									\
+static PERF_OBJ_ATTR_RO(name)
+
+VTD_SHOW(read_transaction, VTD_EVNT_AFU_MEM_RD_TRANS);
+VTD_SHOW(write_transaction, VTD_EVNT_AFU_MEM_WR_TRANS);
+VTD_SHOW(devtlb_read_hit, VTD_EVNT_AFU_DEVTLB_RD_HIT);
+VTD_SHOW(devtlb_write_hit, VTD_EVNT_AFU_DEVTLB_WR_HIT);
+VTD_SHOW(devtlb_4k_fill, VTD_EVNT_DEVTLB_4K_FILL);
+VTD_SHOW(devtlb_2m_fill, VTD_EVNT_DEVTLB_2M_FILL);
+VTD_SHOW(devtlb_1g_fill, VTD_EVNT_DEVTLB_1G_FILL);
+
+static struct attribute *iommu_attrs[] = {
+	&perf_obj_attr_read_transaction.attr,
+	&perf_obj_attr_write_transaction.attr,
+	&perf_obj_attr_devtlb_read_hit.attr,
+	&perf_obj_attr_devtlb_write_hit.attr,
+	&perf_obj_attr_devtlb_4k_fill.attr,
+	&perf_obj_attr_devtlb_2m_fill.attr,
+	&perf_obj_attr_devtlb_1g_fill.attr,
+	NULL,
+};
+
+static struct attribute_group iommu_attr_group = {
+	.attrs = iommu_attrs,
+};
+
+static const struct attribute_group *iommu_attr_groups[] = {
+	&iommu_attr_group,
+	NULL,
+};
+
+#define PERF_MAX_PORT_NUM	1
+
+static int create_perf_iommu_obj(struct perf_object *perf_dev)
+{
+	struct dfl_feature *feature = perf_dev->feature;
+	struct device *dev = &feature->pdev->dev;
+	struct perf_object *pobj, *obj;
+	void __iomem *base;
+	u64 v;
+	int i;
+
+	/* check if iommu is not supported on this device. */
+	base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
+	v = readq(base + FME_HDR_CAP);
+	if (!FIELD_GET(FME_CAP_IOMMU_AVL, v))
+		return 0;
+
+	pobj = create_perf_obj(feature, &perf_dev->kobj, PERF_OBJ_ROOT_ID,
+			       iommu_top_attr_groups, "iommu");
+	if (IS_ERR(pobj))
+		return PTR_ERR(pobj);
+
+	list_add(&pobj->node, &perf_dev->children);
+
+	for (i = 0; i < PERF_MAX_PORT_NUM; i++) {
+		obj = create_perf_obj(feature, &pobj->kobj, i,
+				      iommu_attr_groups, "afu");
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		list_add(&obj->node, &pobj->children);
+	}
+
+	return 0;
+}
+
+/*
+ * Counter Sysfs Interfaces for Fabric
+ */
+static bool fabric_pobj_is_enabled(struct perf_object *pobj)
+{
+	struct dfl_feature *feature = pobj->feature;
+	void __iomem *base = feature->ioaddr;
+	u64 v;
+
+	v = readq(base + FAB_CTRL);
+
+	if (FIELD_GET(FAB_PORT_FILTER, v) == FAB_PORT_FILTER_DISABLE)
+		return pobj->id == PERF_OBJ_ROOT_ID;
+
+	return pobj->id == FIELD_GET(FAB_PORT_ID, v);
+}
+
+static ssize_t read_fabric_counter(struct perf_object *pobj,
+				   u8 event, char *buf)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	u64 v, count = 0;
+
+	if (event > FAB_EVNT_MAX)
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	/* if it is disabled, force the counter to return zero. */
+	if (!fabric_pobj_is_enabled(pobj))
+		goto exit;
+
+	v = readq(base + FAB_CTRL);
+	v &= ~FAB_CTRL_EVNT;
+	v |= FIELD_PREP(FAB_CTRL_EVNT, event);
+	writeq(v, base + FAB_CTRL);
+
+	if (readq_poll_timeout(base + FAB_CNTR, v,
+			       FIELD_GET(FAB_CNTR_EVNT, v) == event,
+			       1, PERF_TIMEOUT)) {
+		dev_err(&feature->pdev->dev, "timeout, unmatched fab event type in counter registers.\n");
+		mutex_unlock(&pdata->lock);
+		return -ETIMEDOUT;
+	}
+
+	v = readq(base + FAB_CNTR);
+	count = FIELD_GET(FAB_CNTR_EVNT_CNTR, v);
+exit:
+	mutex_unlock(&pdata->lock);
+	return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
+}
+
+#define FAB_SHOW(name, event)						\
+static ssize_t name##_show(struct perf_object *pobj, char *buf)		\
+{									\
+	return read_fabric_counter(pobj, event, buf);			\
+}									\
+static PERF_OBJ_ATTR_RO(name)
+
+FAB_SHOW(pcie0_read, FAB_EVNT_PCIE0_RD);
+FAB_SHOW(pcie0_write, FAB_EVNT_PCIE0_WR);
+FAB_SHOW(pcie1_read, FAB_EVNT_PCIE1_RD);
+FAB_SHOW(pcie1_write, FAB_EVNT_PCIE1_WR);
+FAB_SHOW(upi_read, FAB_EVNT_UPI_RD);
+FAB_SHOW(upi_write, FAB_EVNT_UPI_WR);
+FAB_SHOW(mmio_read, FAB_EVNT_MMIO_RD);
+FAB_SHOW(mmio_write, FAB_EVNT_MMIO_WR);
+
+static ssize_t fab_enable_show(struct perf_object *pobj, char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)!!fabric_pobj_is_enabled(pobj));
+}
+
+/*
+ * If enable one port or all port event counter in fabric, other
+ * fabric event counter originally enabled will be disable automatically.
+ */
+static ssize_t fab_enable_store(struct perf_object *pobj,
+				const char *buf, size_t n)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	bool state;
+	u64 v;
+
+	if (strtobool(buf, &state) || !state)
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	/* if it is already enabled. */
+	if (fabric_pobj_is_enabled(pobj))
+		return n;
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + FAB_CTRL);
+	v &= ~(FAB_PORT_FILTER | FAB_PORT_ID);
+
+	if (pobj->id == PERF_OBJ_ROOT_ID) {
+		v |= FIELD_PREP(FAB_PORT_FILTER, FAB_PORT_FILTER_DISABLE);
+	} else {
+		v |= FIELD_PREP(FAB_PORT_FILTER, FAB_PORT_FILTER_ENABLE);
+		v |= FIELD_PREP(FAB_PORT_ID, pobj->id);
+	}
+	writeq(v, base + FAB_CTRL);
+	mutex_unlock(&pdata->lock);
+
+	return n;
+}
+static PERF_OBJ_ATTR(fab_enable, enable, 0644,
+		     fab_enable_show, fab_enable_store);
+
+static struct attribute *fabric_attrs[] = {
+	&perf_obj_attr_pcie0_read.attr,
+	&perf_obj_attr_pcie0_write.attr,
+	&perf_obj_attr_pcie1_read.attr,
+	&perf_obj_attr_pcie1_write.attr,
+	&perf_obj_attr_upi_read.attr,
+	&perf_obj_attr_upi_write.attr,
+	&perf_obj_attr_mmio_read.attr,
+	&perf_obj_attr_mmio_write.attr,
+	&perf_obj_attr_fab_enable.attr,
+	NULL,
+};
+
+static struct attribute_group fabric_attr_group = {
+	.attrs = fabric_attrs,
+};
+
+static const struct attribute_group *fabric_attr_groups[] = {
+	&fabric_attr_group,
+	NULL,
+};
+
+static ssize_t fab_freeze_show(struct perf_object *pobj, char *buf)
+{
+	void __iomem *base = pobj->feature->ioaddr;
+	u64 v;
+
+	v = readq(base + FAB_CTRL);
+
+	return scnprintf(buf, PAGE_SIZE, "%u\n",
+			 (unsigned int)FIELD_GET(FAB_FREEZE_CNTR, v));
+}
+
+static ssize_t fab_freeze_store(struct perf_object *pobj,
+				const char *buf, size_t n)
+{
+	struct dfl_feature *feature = pobj->feature;
+	struct dfl_feature_platform_data *pdata;
+	void __iomem *base = feature->ioaddr;
+	bool state;
+	u64 v;
+
+	if (strtobool(buf, &state))
+		return -EINVAL;
+
+	pdata = dev_get_platdata(&feature->pdev->dev);
+
+	mutex_lock(&pdata->lock);
+	v = readq(base + FAB_CTRL);
+	v &= ~FAB_FREEZE_CNTR;
+	v |= FIELD_PREP(FAB_FREEZE_CNTR, state ? 1 : 0);
+	writeq(v, base + FAB_CTRL);
+	mutex_unlock(&pdata->lock);
+
+	return n;
+}
+static PERF_OBJ_ATTR(fab_freeze, freeze, 0644,
+		     fab_freeze_show, fab_freeze_store);
+
+static struct attribute *fabric_top_attrs[] = {
+	&perf_obj_attr_fab_freeze.attr,
+	NULL,
+};
+
+static struct attribute_group fabric_top_attr_group = {
+	.attrs = fabric_top_attrs,
+};
+
+static const struct attribute_group *fabric_top_attr_groups[] = {
+	&fabric_attr_group,
+	&fabric_top_attr_group,
+	NULL,
+};
+
+static int create_perf_fabric_obj(struct perf_object *perf_dev)
+{
+	struct perf_object *pobj, *obj;
+	int i;
+
+	pobj = create_perf_obj(perf_dev->feature, &perf_dev->kobj,
+			       PERF_OBJ_ROOT_ID, fabric_top_attr_groups,
+			       "fabric");
+	if (IS_ERR(pobj))
+		return PTR_ERR(pobj);
+
+	list_add(&pobj->node, &perf_dev->children);
+
+	for (i = 0; i < PERF_MAX_PORT_NUM; i++) {
+		obj = create_perf_obj(perf_dev->feature, &pobj->kobj, i,
+				      fabric_attr_groups, "port");
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		list_add(&obj->node, &pobj->children);
+	}
+
+	return 0;
+}
+
+static int fme_perf_init(struct platform_device *pdev,
+			 struct dfl_feature *feature)
+{
+	struct perf_object *perf_dev;
+	int ret;
+
+	perf_dev = create_perf_dev(feature);
+	if (IS_ERR(perf_dev))
+		return PTR_ERR(perf_dev);
+
+	ret = create_perf_fabric_obj(perf_dev);
+	if (ret)
+		goto done;
+
+	if (feature->id == FME_FEATURE_ID_GLOBAL_IPERF) {
+		/*
+		 * Cache and IOMMU(VT-D) performance counters are not supported
+		 * on discreted solutions e.g. Intel Programmable Acceleration
+		 * Card based on PCIe.
+		 */
+		ret = create_perf_cache_obj(perf_dev);
+		if (ret)
+			goto done;
+
+		ret = create_perf_iommu_obj(perf_dev);
+		if (ret)
+			goto done;
+	}
+
+	feature->priv = perf_dev;
+	return 0;
+
+done:
+	destroy_perf_obj(perf_dev);
+	return ret;
+}
+
+static void fme_perf_uinit(struct platform_device *pdev,
+			   struct dfl_feature *feature)
+{
+	struct perf_object *perf_dev = feature->priv;
+
+	destroy_perf_obj(perf_dev);
+}
+
+const struct dfl_feature_id fme_perf_id_table[] = {
+	{.id = FME_FEATURE_ID_GLOBAL_IPERF,},
+	{.id = FME_FEATURE_ID_GLOBAL_DPERF,},
+	{0,}
+};
+
+const struct dfl_feature_ops fme_perf_ops = {
+	.init = fme_perf_init,
+	.uinit = fme_perf_uinit,
+};
diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
index 5fbe3f5..dc71048 100644
--- a/drivers/fpga/dfl-fme.h
+++ b/drivers/fpga/dfl-fme.h
@@ -39,5 +39,7 @@ struct dfl_fme {
 extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
 extern const struct dfl_feature_ops fme_global_err_ops;
 extern const struct dfl_feature_id fme_global_err_id_table[];
+extern const struct dfl_feature_ops fme_perf_ops;
+extern const struct dfl_feature_id fme_perf_id_table[];
 
 #endif /* __DFL_FME_H */
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index 65f91ef..637692a 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -507,6 +507,7 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
 		struct dfl_feature *feature = &pdata->features[index];
 
 		/* save resource information for each feature */
+		feature->pdev = fdev;
 		feature->id = finfo->fid;
 		feature->resource_index = index;
 		feature->ioaddr = finfo->ioaddr;
diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
index 6c32080..bf23436 100644
--- a/drivers/fpga/dfl.h
+++ b/drivers/fpga/dfl.h
@@ -191,6 +191,7 @@ struct dfl_feature_driver {
 /**
  * struct dfl_feature - sub feature of the feature devices
  *
+ * @pdev: parent platform device.
  * @id: sub feature id.
  * @resource_index: each sub feature has one mmio resource for its registers.
  *		    this index is used to find its mmio resource from the
@@ -200,6 +201,7 @@ struct dfl_feature_driver {
  * @priv: priv data of this feature.
  */
 struct dfl_feature {
+	struct platform_device *pdev;
 	u64 id;
 	int resource_index;
 	void __iomem *ioaddr;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR
  2019-04-29  8:55 ` [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR Wu Hao
@ 2019-05-07 17:26   ` Moritz Fischer
  2019-05-08 17:58     ` Alan Tull
  0 siblings, 1 reply; 42+ messages in thread
From: Moritz Fischer @ 2019-05-07 17:26 UTC (permalink / raw)
  To: Wu Hao; +Cc: atull, mdf, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Mon, Apr 29, 2019 at 04:55:35PM +0800, Wu Hao wrote:
> This patch removes copy_to_user() code in partial reconfiguration
> ioctl, as it's useless as user never needs to read the data
> structure after ioctl.
> 
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
Acked-by: Moritz Fischer <mdf@kernel.org>
> ---
> v2: clean up code split from patch 2 in v1 patchset.
> ---
>  drivers/fpga/dfl-fme-pr.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
> index d9ca955..6ec0f09 100644
> --- a/drivers/fpga/dfl-fme-pr.c
> +++ b/drivers/fpga/dfl-fme-pr.c
> @@ -159,9 +159,6 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>  	mutex_unlock(&pdata->lock);
>  free_exit:
>  	vfree(buf);
> -	if (copy_to_user((void __user *)arg, &port_pr, minsz))
> -		return -EFAULT;
> -
>  	return ret;
>  }
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 03/18] fpga: dfl: fme: align PR buffer size per PR datawidth
  2019-04-29  8:55 ` [PATCH v2 03/18] fpga: dfl: fme: align PR buffer size per PR datawidth Wu Hao
@ 2019-05-07 17:27   ` Moritz Fischer
  0 siblings, 0 replies; 42+ messages in thread
From: Moritz Fischer @ 2019-05-07 17:27 UTC (permalink / raw)
  To: Wu Hao; +Cc: atull, mdf, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Mon, Apr 29, 2019 at 04:55:36PM +0800, Wu Hao wrote:
> Current driver checks if input bitstream file size is aligned or
> not per PR data width (default 32bits). It requires one additional
> step for end user when they generate the bitstream file, padding
> extra zeros to bitstream file to align its size per PR data width,
> but they don't have to as hardware will drop extra padding bytes
> automatically.
> 
> In order to simplify the user steps, this patch aligns PR buffer
> size per PR data width in driver, to allow user to pass unaligned
> size bitstream files to driver.
> 
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/fpga/dfl-fme-pr.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
> index 6ec0f09..3c71dc3 100644
> --- a/drivers/fpga/dfl-fme-pr.c
> +++ b/drivers/fpga/dfl-fme-pr.c
> @@ -74,6 +74,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>  	struct dfl_fme *fme;
>  	unsigned long minsz;
>  	void *buf = NULL;
> +	size_t length;
>  	int ret = 0;
>  	u64 v;
>  
> @@ -85,9 +86,6 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>  	if (port_pr.argsz < minsz || port_pr.flags)
>  		return -EINVAL;
>  
> -	if (!IS_ALIGNED(port_pr.buffer_size, 4))
> -		return -EINVAL;
> -
>  	/* get fme header region */
>  	fme_hdr = dfl_get_feature_ioaddr_by_id(&pdev->dev,
>  					       FME_FEATURE_ID_HEADER);
> @@ -103,7 +101,13 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>  		       port_pr.buffer_size))
>  		return -EFAULT;
>  
> -	buf = vmalloc(port_pr.buffer_size);
> +	/*
> +	 * align PR buffer per PR bandwidth, as HW ignores the extra padding
> +	 * data automatically.
> +	 */
> +	length = ALIGN(port_pr.buffer_size, 4);
> +
> +	buf = vmalloc(length);
>  	if (!buf)
>  		return -ENOMEM;
>  
> @@ -140,7 +144,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>  	fpga_image_info_free(region->info);
>  
>  	info->buf = buf;
> -	info->count = port_pr.buffer_size;
> +	info->count = length;
>  	info->region_id = port_pr.port_id;
>  	region->info = info;
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 06/18] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support.
  2019-04-29  8:55 ` [PATCH v2 06/18] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support Wu Hao
@ 2019-05-07 17:33   ` Moritz Fischer
  0 siblings, 0 replies; 42+ messages in thread
From: Moritz Fischer @ 2019-05-07 17:33 UTC (permalink / raw)
  To: Wu Hao
  Cc: atull, mdf, linux-fpga, linux-kernel, linux-api, Zhang Yi Z, Xu Yilun

On Mon, Apr 29, 2019 at 04:55:39PM +0800, Wu Hao wrote:
> In order to support virtualization usage via PCIe SRIOV, this patch
> adds two ioctls under FPGA Management Engine (FME) to release and
> assign back the port device. In order to safely turn Port from PF
> into VF and enable PCIe SRIOV, it requires user to invoke this
> PORT_RELEASE ioctl to release port firstly to remove userspace
> interfaces, and then configure the PF/VF access register in FME.
> After disable SRIOV, it requires user to invoke this PORT_ASSIGN
> ioctl to attach the port back to PF.
> 
>  Ioctl interfaces:
>  * DFL_FPGA_FME_PORT_RELEASE
>    Release platform device of given port, it deletes port platform
>    device to remove related userspace interfaces on PF, then
>    configures PF/VF access mode to VF.
> 
>  * DFL_FPGA_FME_PORT_ASSIGN
>    Assign platform device of given port back to PF, it configures
>    PF/VF access mode to PF, then adds port platform device back to
>    re-enable related userspace interfaces on PF.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/fpga/dfl-fme-main.c   |  54 +++++++++++++++++++++
>  drivers/fpga/dfl.c            | 107 +++++++++++++++++++++++++++++++++++++-----
>  drivers/fpga/dfl.h            |  10 ++++
>  include/uapi/linux/fpga-dfl.h |  32 +++++++++++++
>  4 files changed, 191 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index 076d74f..8b2a337 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -16,6 +16,7 @@
>  
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> +#include <linux/uaccess.h>
>  #include <linux/fpga-dfl.h>
>  
>  #include "dfl.h"
> @@ -105,9 +106,62 @@ static void fme_hdr_uinit(struct platform_device *pdev,
>  	sysfs_remove_files(&pdev->dev.kobj, fme_hdr_attrs);
>  }
>  
> +static long fme_hdr_ioctl_release_port(struct dfl_feature_platform_data *pdata,
> +				       void __user *arg)
> +{
> +	struct dfl_fpga_cdev *cdev = pdata->dfl_cdev;
> +	struct dfl_fpga_fme_port_release release;
> +	unsigned long minsz;
> +
> +	minsz = offsetofend(struct dfl_fpga_fme_port_release, port_id);
> +
> +	if (copy_from_user(&release, arg, minsz))
> +		return -EFAULT;
> +
> +	if (release.argsz < minsz || release.flags)
> +		return -EINVAL;
> +
> +	return dfl_fpga_cdev_config_port(cdev, release.port_id, true);
> +}
> +
> +static long fme_hdr_ioctl_assign_port(struct dfl_feature_platform_data *pdata,
> +				      void __user *arg)
> +{
> +	struct dfl_fpga_cdev *cdev = pdata->dfl_cdev;
> +	struct dfl_fpga_fme_port_assign assign;
> +	unsigned long minsz;
> +
> +	minsz = offsetofend(struct dfl_fpga_fme_port_assign, port_id);
> +
> +	if (copy_from_user(&assign, arg, minsz))
> +		return -EFAULT;
> +
> +	if (assign.argsz < minsz || assign.flags)
> +		return -EINVAL;
> +
> +	return dfl_fpga_cdev_config_port(cdev, assign.port_id, false);
> +}
> +
> +static long fme_hdr_ioctl(struct platform_device *pdev,
> +			  struct dfl_feature *feature,
> +			  unsigned int cmd, unsigned long arg)
> +{
> +	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
> +
> +	switch (cmd) {
> +	case DFL_FPGA_FME_PORT_RELEASE:
> +		return fme_hdr_ioctl_release_port(pdata, (void __user *)arg);
> +	case DFL_FPGA_FME_PORT_ASSIGN:
> +		return fme_hdr_ioctl_assign_port(pdata, (void __user *)arg);
> +	}
> +
> +	return -ENODEV;
> +}
> +
>  static const struct dfl_feature_ops fme_hdr_ops = {
>  	.init = fme_hdr_init,
>  	.uinit = fme_hdr_uinit,
> +	.ioctl = fme_hdr_ioctl,
>  };
>  
>  static struct dfl_feature_driver fme_feature_drvs[] = {
> diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> index 2c09e50..a6b6d38 100644
> --- a/drivers/fpga/dfl.c
> +++ b/drivers/fpga/dfl.c
> @@ -224,16 +224,20 @@ void dfl_fpga_port_ops_del(struct dfl_fpga_port_ops *ops)
>   */
>  int dfl_fpga_check_port_id(struct platform_device *pdev, void *pport_id)
>  {
> -	struct dfl_fpga_port_ops *port_ops = dfl_fpga_port_ops_get(pdev);
> -	int port_id;
> +	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
> +	struct dfl_fpga_port_ops *port_ops;
> +
> +	if (pdata->id != FEATURE_DEV_ID_UNUSED)
> +		return pdata->id == *(int *)pport_id;
>  
> +	port_ops = dfl_fpga_port_ops_get(pdev);
>  	if (!port_ops || !port_ops->get_id)
>  		return 0;
>  
> -	port_id = port_ops->get_id(pdev);
> +	pdata->id = port_ops->get_id(pdev);
>  	dfl_fpga_port_ops_put(port_ops);
>  
> -	return port_id == *(int *)pport_id;
> +	return pdata->id == *(int *)pport_id;
>  }
>  EXPORT_SYMBOL_GPL(dfl_fpga_check_port_id);
>  
> @@ -462,6 +466,7 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
>  	pdata->dev = fdev;
>  	pdata->num = binfo->feature_num;
>  	pdata->dfl_cdev = binfo->cdev;
> +	pdata->id = FEATURE_DEV_ID_UNUSED;
>  	mutex_init(&pdata->lock);
>  
>  	/*
> @@ -959,25 +964,27 @@ void dfl_fpga_feature_devs_remove(struct dfl_fpga_cdev *cdev)
>  {
>  	struct dfl_feature_platform_data *pdata, *ptmp;
>  
> -	remove_feature_devs(cdev);
> -
>  	mutex_lock(&cdev->lock);
> -	if (cdev->fme_dev) {
> -		/* the fme should be unregistered. */
> -		WARN_ON(device_is_registered(cdev->fme_dev));
> +	if (cdev->fme_dev)
>  		put_device(cdev->fme_dev);
> -	}
>  
>  	list_for_each_entry_safe(pdata, ptmp, &cdev->port_dev_list, node) {
>  		struct platform_device *port_dev = pdata->dev;
>  
> -		/* the port should be unregistered. */
> -		WARN_ON(device_is_registered(&port_dev->dev));
> +		/* remove released ports */
> +		if (!device_is_registered(&port_dev->dev)) {
> +			dfl_id_free(feature_dev_id_type(port_dev),
> +				    port_dev->id);
> +			platform_device_put(port_dev);
> +		}
> +
>  		list_del(&pdata->node);
>  		put_device(&port_dev->dev);
>  	}
>  	mutex_unlock(&cdev->lock);
>  
> +	remove_feature_devs(cdev);
> +
>  	fpga_region_unregister(cdev->region);
>  	devm_kfree(cdev->parent, cdev);
>  }
> @@ -1015,6 +1022,82 @@ struct platform_device *
>  }
>  EXPORT_SYMBOL_GPL(__dfl_fpga_cdev_find_port);
>  
> +static int attach_port_dev(struct dfl_fpga_cdev *cdev, u32 port_id)
> +{
> +	struct platform_device *port_pdev;
> +	int ret = -ENODEV;
> +
> +	mutex_lock(&cdev->lock);
> +	port_pdev = __dfl_fpga_cdev_find_port(cdev, &port_id,
> +					      dfl_fpga_check_port_id);
> +	if (!port_pdev)
> +		goto unlock_exit;
> +
> +	if (device_is_registered(&port_pdev->dev)) {
> +		ret = -EBUSY;
> +		goto put_dev_exit;
> +	}
> +
> +	ret = platform_device_add(port_pdev);
> +	if (ret)
> +		goto put_dev_exit;
> +
> +	dfl_feature_dev_use_end(dev_get_platdata(&port_pdev->dev));
> +	cdev->released_port_num--;
> +put_dev_exit:
> +	put_device(&port_pdev->dev);
> +unlock_exit:
> +	mutex_unlock(&cdev->lock);
> +	return ret;
> +}
> +
> +static int detach_port_dev(struct dfl_fpga_cdev *cdev, u32 port_id)
> +{
> +	struct platform_device *port_pdev;
> +	int ret = -ENODEV;
> +
> +	mutex_lock(&cdev->lock);
> +	port_pdev = __dfl_fpga_cdev_find_port(cdev, &port_id,
> +					      dfl_fpga_check_port_id);
> +	if (!port_pdev)
> +		goto unlock_exit;
> +
> +	if (!device_is_registered(&port_pdev->dev)) {
> +		ret = -EBUSY;
> +		goto put_dev_exit;
> +	}
> +
> +	ret = dfl_feature_dev_use_begin(dev_get_platdata(&port_pdev->dev));
> +	if (ret)
> +		goto put_dev_exit;
> +
> +	platform_device_del(port_pdev);
> +	cdev->released_port_num++;
> +put_dev_exit:
> +	put_device(&port_pdev->dev);
> +unlock_exit:
> +	mutex_unlock(&cdev->lock);
> +	return ret;
> +}
> +
> +/**
> + * dfl_fpga_cdev_config_port - configure a port feature dev
> + * @cdev: parent container device.
> + * @port_id: id of the port feature device.
> + * @release: release port or assign port back.
> + *
> + * This function allows user to release port platform device or assign it back.
> + * e.g. to safely turn one port from PF into VF for PCI device SRIOV support,
> + * release port platform device is one necessary step.
> + */
> +int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
> +			      u32 port_id, bool release)
> +{
> +	return release ? detach_port_dev(cdev, port_id) :
> +			 attach_port_dev(cdev, port_id);
> +}
> +EXPORT_SYMBOL_GPL(dfl_fpga_cdev_config_port);
> +
>  static int __init dfl_fpga_init(void)
>  {
>  	int ret;
> diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
> index 8851c6c..63f39ab 100644
> --- a/drivers/fpga/dfl.h
> +++ b/drivers/fpga/dfl.h
> @@ -183,6 +183,8 @@ struct dfl_feature {
>  
>  #define DEV_STATUS_IN_USE	0
>  
> +#define FEATURE_DEV_ID_UNUSED	(-1)
> +
>  /**
>   * struct dfl_feature_platform_data - platform data for feature devices
>   *
> @@ -191,6 +193,7 @@ struct dfl_feature {
>   * @cdev: cdev of feature dev.
>   * @dev: ptr to platform device linked with this platform data.
>   * @dfl_cdev: ptr to container device.
> + * @id: id used for this feature device.
>   * @disable_count: count for port disable.
>   * @num: number for sub features.
>   * @dev_status: dev status (e.g. DEV_STATUS_IN_USE).
> @@ -203,6 +206,7 @@ struct dfl_feature_platform_data {
>  	struct cdev cdev;
>  	struct platform_device *dev;
>  	struct dfl_fpga_cdev *dfl_cdev;
> +	int id;
>  	unsigned int disable_count;
>  	unsigned long dev_status;
>  	void *private;
> @@ -378,6 +382,7 @@ int dfl_fpga_enum_info_add_dfl(struct dfl_fpga_enum_info *info,
>   * @fme_dev: FME feature device under this container device.
>   * @lock: mutex lock to protect the port device list.
>   * @port_dev_list: list of all port feature devices under this container device.
> + * @released_port_num: released port number under this container device.
>   */
>  struct dfl_fpga_cdev {
>  	struct device *parent;
> @@ -385,6 +390,7 @@ struct dfl_fpga_cdev {
>  	struct device *fme_dev;
>  	struct mutex lock;
>  	struct list_head port_dev_list;
> +	int released_port_num;
>  };
>  
>  struct dfl_fpga_cdev *
> @@ -412,4 +418,8 @@ struct platform_device *
>  
>  	return pdev;
>  }
> +
> +int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
> +			      u32 port_id, bool release);
> +
>  #endif /* __FPGA_DFL_H */
> diff --git a/include/uapi/linux/fpga-dfl.h b/include/uapi/linux/fpga-dfl.h
> index 2e324e5..e9a00e0 100644
> --- a/include/uapi/linux/fpga-dfl.h
> +++ b/include/uapi/linux/fpga-dfl.h
> @@ -176,4 +176,36 @@ struct dfl_fpga_fme_port_pr {
>  
>  #define DFL_FPGA_FME_PORT_PR	_IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 0)
>  
> +/**
> + * DFL_FPGA_FME_PORT_RELEASE - _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 1,
> + *					struct dfl_fpga_fme_port_release)
> + *
> + * Driver releases the port per Port ID provided by caller.
> + * Return: 0 on success, -errno on failure.
> + */
> +struct dfl_fpga_fme_port_release {
> +	/* Input */
> +	__u32 argsz;		/* Structure length */
> +	__u32 flags;		/* Zero for now */
> +	__u32 port_id;
> +};
> +
> +#define DFL_FPGA_FME_PORT_RELEASE	_IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 1)
> +
> +/**
> + * DFL_FPGA_FME_PORT_ASSIGN - _IOW(DFL_FPGA_MAGIC, DFL_FME_BASE + 2,
> + *					struct dfl_fpga_fme_port_assign)
> + *
> + * Driver assigns the port back per Port ID provided by caller.
> + * Return: 0 on success, -errno on failure.
> + */
> +struct dfl_fpga_fme_port_assign {
> +	/* Input */
> +	__u32 argsz;		/* Structure length */
> +	__u32 flags;		/* Zero for now */
> +	__u32 port_id;
> +};
> +
> +#define DFL_FPGA_FME_PORT_ASSIGN	_IO(DFL_FPGA_MAGIC, DFL_FME_BASE + 2)
> +
>  #endif /* _UAPI_LINUX_FPGA_DFL_H */
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 07/18] fpga: dfl: pci: enable SRIOV support.
  2019-04-29  8:55 ` [PATCH v2 07/18] fpga: dfl: pci: enable SRIOV support Wu Hao
@ 2019-05-07 17:35   ` Moritz Fischer
  0 siblings, 0 replies; 42+ messages in thread
From: Moritz Fischer @ 2019-05-07 17:35 UTC (permalink / raw)
  To: Wu Hao
  Cc: atull, mdf, linux-fpga, linux-kernel, linux-api, Zhang Yi Z, Xu Yilun

On Mon, Apr 29, 2019 at 04:55:40PM +0800, Wu Hao wrote:
> This patch enables the standard sriov support. It allows user to
> enable SRIOV (and VFs), then user could pass through accelerators
> (VFs) into virtual machine or use VFs directly in host.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> Acked-by: Alan Tull <atull@kernel.org>
Acked-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/fpga/dfl-pci.c | 40 ++++++++++++++++++++++++++++++++++++++++
>  drivers/fpga/dfl.c     | 41 +++++++++++++++++++++++++++++++++++++++++
>  drivers/fpga/dfl.h     |  1 +
>  3 files changed, 82 insertions(+)
> 
> diff --git a/drivers/fpga/dfl-pci.c b/drivers/fpga/dfl-pci.c
> index 66b5720..2fa571b 100644
> --- a/drivers/fpga/dfl-pci.c
> +++ b/drivers/fpga/dfl-pci.c
> @@ -223,8 +223,46 @@ int cci_pci_probe(struct pci_dev *pcidev, const struct pci_device_id *pcidevid)
>  	return ret;
>  }
>  
> +static int cci_pci_sriov_configure(struct pci_dev *pcidev, int num_vfs)
> +{
> +	struct cci_drvdata *drvdata = pci_get_drvdata(pcidev);
> +	struct dfl_fpga_cdev *cdev = drvdata->cdev;
> +	int ret = 0;
> +
> +	mutex_lock(&cdev->lock);
> +
> +	if (!num_vfs) {
> +		/*
> +		 * disable SRIOV and then put released ports back to default
> +		 * PF access mode.
> +		 */
> +		pci_disable_sriov(pcidev);
> +
> +		__dfl_fpga_cdev_config_port_vf(cdev, false);
> +
> +	} else if (cdev->released_port_num == num_vfs) {
> +		/*
> +		 * only enable SRIOV if cdev has matched released ports, put
> +		 * released ports into VF access mode firstly.
> +		 */
> +		__dfl_fpga_cdev_config_port_vf(cdev, true);
> +
> +		ret = pci_enable_sriov(pcidev, num_vfs);
> +		if (ret)
> +			__dfl_fpga_cdev_config_port_vf(cdev, false);
> +	} else {
> +		ret = -EINVAL;
> +	}
> +
> +	mutex_unlock(&cdev->lock);
> +	return ret;
> +}
> +
>  static void cci_pci_remove(struct pci_dev *pcidev)
>  {
> +	if (dev_is_pf(&pcidev->dev))
> +		cci_pci_sriov_configure(pcidev, 0);
> +
>  	cci_remove_feature_devs(pcidev);
>  	pci_disable_pcie_error_reporting(pcidev);
>  }
> @@ -234,6 +272,7 @@ static void cci_pci_remove(struct pci_dev *pcidev)
>  	.id_table = cci_pcie_id_tbl,
>  	.probe = cci_pci_probe,
>  	.remove = cci_pci_remove,
> +	.sriov_configure = cci_pci_sriov_configure,
>  };
>  
>  module_pci_driver(cci_pci_driver);
> @@ -241,3 +280,4 @@ static void cci_pci_remove(struct pci_dev *pcidev)
>  MODULE_DESCRIPTION("FPGA DFL PCIe Device Driver");
>  MODULE_AUTHOR("Intel Corporation");
>  MODULE_LICENSE("GPL v2");
> +MODULE_VERSION(DRV_VERSION);
> diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> index a6b6d38..c5aa287 100644
> --- a/drivers/fpga/dfl.c
> +++ b/drivers/fpga/dfl.c
> @@ -1098,6 +1098,47 @@ int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
>  }
>  EXPORT_SYMBOL_GPL(dfl_fpga_cdev_config_port);
>  
> +static void config_port_vf(struct device *fme_dev, int port_id, bool is_vf)
> +{
> +	void __iomem *base;
> +	u64 v;
> +
> +	base = dfl_get_feature_ioaddr_by_id(fme_dev, FME_FEATURE_ID_HEADER);
> +
> +	v = readq(base + FME_HDR_PORT_OFST(port_id));
> +
> +	v &= ~FME_PORT_OFST_ACC_CTRL;
> +	v |= FIELD_PREP(FME_PORT_OFST_ACC_CTRL,
> +			is_vf ? FME_PORT_OFST_ACC_VF : FME_PORT_OFST_ACC_PF);
> +
> +	writeq(v, base + FME_HDR_PORT_OFST(port_id));
> +}
> +
> +/**
> + * __dfl_fpga_cdev_config_port_vf - configure port to VF access mode
> + *
> + * @cdev: parent container device.
> + * @if_vf: true for VF access mode, and false for PF access mode
> + *
> + * Return: 0 on success, negative error code otherwise.
> + *
> + * This function is needed in sriov configuration routine. It could be used to
> + * configures the released ports access mode to VF or PF.
> + * The caller needs to hold lock for protection.
> + */
> +void __dfl_fpga_cdev_config_port_vf(struct dfl_fpga_cdev *cdev, bool is_vf)
> +{
> +	struct dfl_feature_platform_data *pdata;
> +
> +	list_for_each_entry(pdata, &cdev->port_dev_list, node) {
> +		if (device_is_registered(&pdata->dev->dev))
> +			continue;
> +
> +		config_port_vf(cdev->fme_dev, pdata->id, is_vf);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(__dfl_fpga_cdev_config_port_vf);
> +
>  static int __init dfl_fpga_init(void)
>  {
>  	int ret;
> diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
> index 63f39ab..1350e8e 100644
> --- a/drivers/fpga/dfl.h
> +++ b/drivers/fpga/dfl.h
> @@ -421,5 +421,6 @@ struct platform_device *
>  
>  int dfl_fpga_cdev_config_port(struct dfl_fpga_cdev *cdev,
>  			      u32 port_id, bool release);
> +void __dfl_fpga_cdev_config_port_vf(struct dfl_fpga_cdev *cdev, bool is_vf);
>  
>  #endif /* __FPGA_DFL_H */
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 15/18] fpga: dfl: fme: add thermal management support
  2019-04-29  8:55 ` [PATCH v2 15/18] fpga: dfl: fme: add thermal management support Wu Hao
@ 2019-05-07 18:20   ` Alan Tull
  2019-05-07 18:35     ` Guenter Roeck
  2019-05-07 18:30   ` Moritz Fischer
  1 sibling, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-07 18:20 UTC (permalink / raw)
  To: Wu Hao
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Russ Weight, Xu Yilun, Jean Delvare, Guenter Roeck,
	Linux HWMON List

On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:

+ The hwmon people

>
> This patch adds support to thermal management private feature for DFL
> FPGA Management Engine (FME). This private feature driver registers
> a hwmon for thermal/temperature monitoring (hwmon temp1_input).
> If hardware automatic throttling is supported by this hardware, then
> driver also exposes sysfs interfaces under hwmon for thresholds
> (temp1_alarm/ crit/ emergency), threshold status (temp1_alarm_status/
> temp1_crit_status) and throttling policy (temp1_alarm_policy).
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> ---
> v2: create a dfl_fme_thermal hwmon to expose thermal information.
>     move all sysfs interfaces under hwmon
>         tempareture       --> hwmon temp1_input
>         threshold1        --> hwmon temp1_alarm
>         threshold2        --> hwmon temp1_crit
>         trip_threshold    --> hwmon temp1_emergency
>         threshold1_status --> hwmon temp1_alarm_status
>         threshold2_status --> hwmon temp1_crit_status
>         threshold1_policy --> hwmon temp1_alarm_policy
> ---
>  Documentation/ABI/testing/sysfs-platform-dfl-fme |  64 +++++++
>  drivers/fpga/Kconfig                             |   2 +-
>  drivers/fpga/dfl-fme-main.c                      | 212 +++++++++++++++++++++++
>  3 files changed, 277 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> index d1aa375..dfbd315 100644
> --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> @@ -44,3 +44,67 @@ Description: Read-only. It returns socket_id to indicate which socket
>                 this FPGA belongs to, only valid for integrated solution.
>                 User only needs this information, in case standard numa node
>                 can't provide correct information.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/name
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read this file to get the name of hwmon device, it
> +               supports values:
> +                   'dfl_fme_thermal' - thermal hwmon device name
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns FPGA device temperature in millidegrees
> +               Celsius.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns hardware threshold1 temperature in
> +               millidegrees Celsius. If temperature rises at or above this
> +               threshold, hardware starts 50% or 90% throttling (see
> +               'temp1_alarm_policy').
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns hardware threshold2 temperature in
> +               millidegrees Celsius. If temperature rises at or above this
> +               threshold, hardware starts 100% throttling.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_emergency
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns hardware trip threshold temperature in
> +               millidegrees Celsius. If temperature rises at or above this
> +               threshold, a fatal event will be triggered to board management
> +               controller (BMC) to shutdown FPGA.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_status
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-only. It returns 1 if temperature is currently at or above
> +               hardware threshold1 (see 'temp1_alarm'), otherwise 0.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit_status
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-only. It returns 1 if temperature is currently at or above
> +               hardware threshold2 (see 'temp1_crit'), otherwise 0.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_policy
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read this file to get the policy of hardware threshold1
> +               (see 'temp1_alarm'). It only supports two values (policies):
> +                   0 - AP2 state (90% throttling)
> +                   1 - AP1 state (50% throttling)
> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> index c20445b..a6d7588 100644
> --- a/drivers/fpga/Kconfig
> +++ b/drivers/fpga/Kconfig
> @@ -154,7 +154,7 @@ config FPGA_DFL
>
>  config FPGA_DFL_FME
>         tristate "FPGA DFL FME Driver"
> -       depends on FPGA_DFL
> +       depends on FPGA_DFL && HWMON
>         help
>           The FPGA Management Engine (FME) is a feature device implemented
>           under Device Feature List (DFL) framework. Select this option to
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index 8339ee8..b9a68b8 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -14,6 +14,8 @@
>   *   Henry Mitchel <henry.mitchel@intel.com>
>   */
>
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
>  #include <linux/uaccess.h>
> @@ -217,6 +219,212 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
>         .ioctl = fme_hdr_ioctl,
>  };
>
> +#define FME_THERM_THRESHOLD    0x8
> +#define TEMP_THRESHOLD1                GENMASK_ULL(6, 0)
> +#define TEMP_THRESHOLD1_EN     BIT_ULL(7)
> +#define TEMP_THRESHOLD2                GENMASK_ULL(14, 8)
> +#define TEMP_THRESHOLD2_EN     BIT_ULL(15)
> +#define TRIP_THRESHOLD         GENMASK_ULL(30, 24)
> +#define TEMP_THRESHOLD1_STATUS BIT_ULL(32)             /* threshold1 reached */
> +#define TEMP_THRESHOLD2_STATUS BIT_ULL(33)             /* threshold2 reached */
> +/* threshold1 policy: 0 - AP2 (90% throttle) / 1 - AP1 (50% throttle) */
> +#define TEMP_THRESHOLD1_POLICY BIT_ULL(44)
> +
> +#define FME_THERM_RDSENSOR_FMT1        0x10
> +#define FPGA_TEMPERATURE       GENMASK_ULL(6, 0)
> +
> +#define FME_THERM_CAP          0x20
> +#define THERM_NO_THROTTLE      BIT_ULL(0)
> +
> +#define MD_PRE_DEG
> +
> +static bool fme_thermal_throttle_support(void __iomem *base)
> +{
> +       u64 v = readq(base + FME_THERM_CAP);
> +
> +       return FIELD_GET(THERM_NO_THROTTLE, v) ? false : true;
> +}
> +
> +static umode_t thermal_hwmon_attrs_visible(const void *drvdata,
> +                                          enum hwmon_sensor_types type,
> +                                          u32 attr, int channel)
> +{
> +       const struct dfl_feature *feature = drvdata;
> +
> +       /* temperature is always supported, and check hardware cap for others */
> +       if (attr == hwmon_temp_input)
> +               return 0444;
> +
> +       return fme_thermal_throttle_support(feature->ioaddr) ? 0444 : 0;
> +}
> +
> +static int thermal_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> +                             u32 attr, int channel, long *val)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       switch (attr) {
> +       case hwmon_temp_input:
> +               v = readq(feature->ioaddr + FME_THERM_RDSENSOR_FMT1);
> +               *val = (long)(FIELD_GET(FPGA_TEMPERATURE, v) * 1000);
> +               break;
> +       case hwmon_temp_alarm:
> +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +               *val = (long)(FIELD_GET(TEMP_THRESHOLD1, v) * 1000);
> +               break;
> +       case hwmon_temp_crit:
> +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +               *val = (long)(FIELD_GET(TEMP_THRESHOLD2, v) * 1000);
> +               break;
> +       case hwmon_temp_emergency:
> +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +               *val = (long)(FIELD_GET(TRIP_THRESHOLD, v) * 1000);
> +               break;
> +       default:
> +               return -EOPNOTSUPP;
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct hwmon_ops thermal_hwmon_ops = {
> +       .is_visible = thermal_hwmon_attrs_visible,
> +       .read = thermal_hwmon_read,
> +};
> +
> +static const u32 thermal_hwmon_temp_config[] = {
> +       HWMON_T_INPUT | HWMON_T_ALARM | HWMON_T_CRIT | HWMON_T_EMERGENCY,
> +       0
> +};
> +
> +static const struct hwmon_channel_info hwmon_temp_info = {
> +       .type = hwmon_temp,
> +       .config = thermal_hwmon_temp_config,
> +};
> +
> +static const struct hwmon_channel_info *thermal_hwmon_info[] = {
> +       &hwmon_temp_info,
> +       NULL
> +};
> +
> +static const struct hwmon_chip_info thermal_hwmon_chip_info = {
> +       .ops = &thermal_hwmon_ops,
> +       .info = thermal_hwmon_info,
> +};
> +
> +static ssize_t temp1_alarm_status_show(struct device *dev,
> +                                      struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD1_STATUS, v));
> +}
> +
> +static ssize_t temp1_crit_status_show(struct device *dev,
> +                                     struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD2_STATUS, v));
> +}
> +
> +static ssize_t temp1_alarm_policy_show(struct device *dev,
> +                                      struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD1_POLICY, v));
> +}
> +
> +static DEVICE_ATTR_RO(temp1_alarm_status);
> +static DEVICE_ATTR_RO(temp1_crit_status);
> +static DEVICE_ATTR_RO(temp1_alarm_policy);
> +
> +static struct attribute *thermal_extra_attrs[] = {
> +       &dev_attr_temp1_alarm_status.attr,
> +       &dev_attr_temp1_crit_status.attr,
> +       &dev_attr_temp1_alarm_policy.attr,
> +       NULL,
> +};
> +
> +static umode_t thermal_extra_attrs_visible(struct kobject *kobj,
> +                                          struct attribute *attr, int index)
> +{
> +       struct device *dev = kobj_to_dev(kobj);
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +
> +       return fme_thermal_throttle_support(feature->ioaddr) ? attr->mode : 0;
> +}
> +
> +static const struct attribute_group thermal_extra_group = {
> +       .attrs          = thermal_extra_attrs,
> +       .is_visible     = thermal_extra_attrs_visible,
> +};
> +__ATTRIBUTE_GROUPS(thermal_extra);
> +
> +static int fme_thermal_mgmt_init(struct platform_device *pdev,
> +                                struct dfl_feature *feature)
> +{
> +       struct device *hwmon;
> +
> +       dev_dbg(&pdev->dev, "FME Thermal Management Init.\n");
> +
> +       /*
> +        * create hwmon to allow userspace monitoring temperature and other
> +        * threshold information.
> +        *
> +        * temp1_alarm     -> hardware threshold 1 -> 50% or 90% throttling
> +        * temp1_crit      -> hardware threshold 2 -> 100% throttling
> +        * temp1_emergency -> hardware trip_threshold to shutdown FPGA
> +        *
> +        * create device specific sysfs interfaces, e.g. read temp1_alarm_policy
> +        * to understand the actual hardware throttling action (50% vs 90%).
> +        *
> +        * If hardware doesn't support automatic throttling per thresholds,
> +        * then all above sysfs interfaces are not visible except temp1_input
> +        * for temperature.
> +        */
> +       hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> +                                                    "dfl_fme_thermal", feature,
> +                                                    &thermal_hwmon_chip_info,
> +                                                    thermal_extra_groups);
> +       if (IS_ERR(hwmon)) {
> +               dev_err(&pdev->dev, "Fail to register thermal hwmon\n");
> +               return PTR_ERR(hwmon);
> +       }
> +
> +       return 0;
> +}
> +
> +static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> +                                  struct dfl_feature *feature)
> +{
> +       dev_dbg(&pdev->dev, "FME Thermal Management UInit.\n");
> +}
> +
> +static const struct dfl_feature_id fme_thermal_mgmt_id_table[] = {
> +       {.id = FME_FEATURE_ID_THERMAL_MGMT,},
> +       {0,}
> +};
> +
> +static const struct dfl_feature_ops fme_thermal_mgmt_ops = {
> +       .init = fme_thermal_mgmt_init,
> +       .uinit = fme_thermal_mgmt_uinit,
> +};
> +
>  static struct dfl_feature_driver fme_feature_drvs[] = {
>         {
>                 .id_table = fme_hdr_id_table,
> @@ -227,6 +435,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
>                 .ops = &fme_pr_mgmt_ops,
>         },
>         {
> +               .id_table = fme_thermal_mgmt_id_table,
> +               .ops = &fme_thermal_mgmt_ops,
> +       },
> +       {
>                 .ops = NULL,
>         },
>  };
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 16/18] fpga: dfl: fme: add power management support
  2019-04-29  8:55 ` [PATCH v2 16/18] fpga: dfl: fme: add power " Wu Hao
@ 2019-05-07 18:23   ` Alan Tull
  2019-05-07 18:36     ` Guenter Roeck
  0 siblings, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-07 18:23 UTC (permalink / raw)
  To: Wu Hao
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Xu Yilun, Jean Delvare, Guenter Roeck, Linux HWMON List

On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:

+ hwmon folks

>
> This patch adds support for power management private feature under
> FPGA Management Engine (FME). This private feature driver registers
> a hwmon for power (power1_input), thresholds information, e.g.
> (power1_cap / crit) and also read-only sysfs interfaces for other
> power management information. For configuration, user could write
> threshold values via above power1_cap / crit sysfs interface
> under hwmon too.
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> ---
> v2: create a dfl_fme_power hwmon to expose power sysfs interfaces.
>     move all sysfs interfaces under hwmon
>         consumed          --> hwmon power1_input
>         threshold1        --> hwmon power1_cap
>         threshold2        --> hwmon power1_crit
>         threshold1_status --> hwmon power1_cap_status
>         threshold2_status --> hwmon power1_crit_status
>         xeon_limit        --> hwmon power1_xeon_limit
>         fpga_limit        --> hwmon power1_fpga_limit
>         ltr               --> hwmon power1_ltr
> ---
>  Documentation/ABI/testing/sysfs-platform-dfl-fme |  67 ++++++
>  drivers/fpga/dfl-fme-main.c                      | 247 +++++++++++++++++++++++
>  2 files changed, 314 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> index dfbd315..e2ba92d 100644
> --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> @@ -52,6 +52,7 @@ Contact:      Wu Hao <hao.wu@intel.com>
>  Description:   Read-Only. Read this file to get the name of hwmon device, it
>                 supports values:
>                     'dfl_fme_thermal' - thermal hwmon device name
> +                   'dfl_fme_power'   - power hwmon device name
>
>  What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
>  Date:          April 2019
> @@ -108,3 +109,69 @@ Description:       Read-Only. Read this file to get the policy of hardware threshold1
>                 (see 'temp1_alarm'). It only supports two values (policies):
>                     0 - AP2 state (90% throttling)
>                     1 - AP1 state (50% throttling)
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_input
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns current FPGA power consumption in uW.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_cap
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file to get current hardware power
> +               threshold1 in uW. If power consumption rises at or above
> +               this threshold, hardware starts 50% throttling.
> +               Write this file to set current hardware power threshold1 in uW.
> +               As hardware only accepts values in Watts, so input value will
> +               be round down per Watts (< 1 watts part will be discarded).
> +               Write fails with -EINVAL if input parsing fails or input isn't
> +               in the valid range (0 - 127000000 uW).
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file to get current hardware power
> +               threshold2 in uW. If power consumption rises at or above
> +               this threshold, hardware starts 90% throttling.
> +               Write this file to set current hardware power threshold2 in uW.
> +               As hardware only accepts values in Watts, so input value will
> +               be round down per Watts (< 1 watts part will be discarded).
> +               Write fails with -EINVAL if input parsing fails or input isn't
> +               in the valid range (0 - 127000000 uW).
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_cap_status
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-only. It returns 1 if power consumption is currently at or
> +               above hardware threshold1 (see 'power1_cap'), otherwise 0.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit_status
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-only. It returns 1 if power consumption is currently at or
> +               above hardware threshold2 (see 'power1_crit'), otherwise 0.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_xeon_limit
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns power limit for XEON in uW.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_fpga_limit
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. It returns power limit for FPGA in uW.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_ltr
> +Date:          April 2019
> +KernelVersion: 5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-only. Read this file to get current Latency Tolerance
> +               Reporting (ltr) value. This ltr impacts the CPU low power
> +               state in integrated solution.
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index b9a68b8..7005316 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -425,6 +425,249 @@ static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
>         .uinit = fme_thermal_mgmt_uinit,
>  };
>
> +#define FME_PWR_STATUS         0x8
> +#define FME_LATENCY_TOLERANCE  BIT_ULL(18)
> +#define PWR_CONSUMED           GENMASK_ULL(17, 0)
> +
> +#define FME_PWR_THRESHOLD      0x10
> +#define PWR_THRESHOLD1         GENMASK_ULL(6, 0)       /* in Watts */
> +#define PWR_THRESHOLD2         GENMASK_ULL(14, 8)      /* in Watts */
> +#define PWR_THRESHOLD_MAX      0x7f                    /* in Watts */
> +#define PWR_THRESHOLD1_STATUS  BIT_ULL(16)
> +#define PWR_THRESHOLD2_STATUS  BIT_ULL(17)
> +
> +#define FME_PWR_XEON_LIMIT     0x18
> +#define XEON_PWR_LIMIT         GENMASK_ULL(14, 0)      /* in 0.1 Watts */
> +#define XEON_PWR_EN            BIT_ULL(15)
> +#define FME_PWR_FPGA_LIMIT     0x20
> +#define FPGA_PWR_LIMIT         GENMASK_ULL(14, 0)      /* in 0.1 Watts */
> +#define FPGA_PWR_EN            BIT_ULL(15)
> +
> +#define PWR_THRESHOLD_MAX_IN_UW (PWR_THRESHOLD_MAX * 1000000)
> +
> +static int power_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> +                           u32 attr, int channel, long *val)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       switch (attr) {
> +       case hwmon_power_input:
> +               v = readq(feature->ioaddr + FME_PWR_STATUS);
> +               *val = (long)(FIELD_GET(PWR_CONSUMED, v) * 1000000);
> +               break;
> +       case hwmon_power_cap:
> +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> +               *val = (long)(FIELD_GET(PWR_THRESHOLD1, v) * 1000000);
> +               break;
> +       case hwmon_power_crit:
> +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> +               *val = (long)(FIELD_GET(PWR_THRESHOLD2, v) * 1000000);
> +               break;
> +       default:
> +               return -EOPNOTSUPP;
> +       }
> +
> +       return 0;
> +}
> +
> +static int power_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
> +                            u32 attr, int channel, long val)
> +{
> +       struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       int ret = 0;
> +       u64 v;
> +
> +       if (val < 0 || val > PWR_THRESHOLD_MAX_IN_UW)
> +               return -EINVAL;
> +
> +       val = val / 1000000;
> +
> +       mutex_lock(&pdata->lock);
> +
> +       switch (attr) {
> +       case hwmon_power_cap:
> +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> +               v &= ~PWR_THRESHOLD1;
> +               v |= FIELD_PREP(PWR_THRESHOLD1, val);
> +               writeq(v, feature->ioaddr + FME_PWR_THRESHOLD);
> +               break;
> +       case hwmon_power_crit:
> +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> +               v &= ~PWR_THRESHOLD2;
> +               v |= FIELD_PREP(PWR_THRESHOLD2, val);
> +               writeq(v, feature->ioaddr + FME_PWR_THRESHOLD);
> +               break;
> +       default:
> +               ret = -EOPNOTSUPP;
> +               break;
> +       }
> +
> +       mutex_unlock(&pdata->lock);
> +
> +       return ret;
> +}
> +
> +static umode_t power_hwmon_attrs_visible(const void *drvdata,
> +                                        enum hwmon_sensor_types type,
> +                                        u32 attr, int channel)
> +{
> +       switch (attr) {
> +       case hwmon_power_input:
> +               return 0444;
> +       case hwmon_power_cap:
> +       case hwmon_power_crit:
> +               return 0644;
> +       }
> +
> +       return 0;
> +}
> +
> +static const u32 power_hwmon_config[] = {
> +       HWMON_P_INPUT | HWMON_P_CAP | HWMON_P_CRIT,
> +       0
> +};
> +
> +static const struct hwmon_channel_info hwmon_pwr_info = {
> +       .type = hwmon_power,
> +       .config = power_hwmon_config,
> +};
> +
> +static const struct hwmon_channel_info *power_hwmon_info[] = {
> +       &hwmon_pwr_info,
> +       NULL
> +};
> +
> +static const struct hwmon_ops power_hwmon_ops = {
> +       .is_visible = power_hwmon_attrs_visible,
> +       .read = power_hwmon_read,
> +       .write = power_hwmon_write,
> +};
> +
> +static const struct hwmon_chip_info power_hwmon_chip_info = {
> +       .ops = &power_hwmon_ops,
> +       .info = power_hwmon_info,
> +};
> +
> +static ssize_t power1_cap_status_show(struct device *dev,
> +                                     struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(PWR_THRESHOLD1_STATUS, v));
> +}
> +
> +static ssize_t power1_crit_status_show(struct device *dev,
> +                                      struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(PWR_THRESHOLD2_STATUS, v));
> +}
> +
> +static ssize_t power1_xeon_limit_show(struct device *dev,
> +                                     struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u16 xeon_limit = 0;
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_PWR_XEON_LIMIT);
> +
> +       if (FIELD_GET(XEON_PWR_EN, v))
> +               xeon_limit = FIELD_GET(XEON_PWR_LIMIT, v);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n", xeon_limit * 100000);
> +}
> +
> +static ssize_t power1_fpga_limit_show(struct device *dev,
> +                                     struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u16 fpga_limit = 0;
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_PWR_FPGA_LIMIT);
> +
> +       if (FIELD_GET(FPGA_PWR_EN, v))
> +               fpga_limit = FIELD_GET(FPGA_PWR_LIMIT, v);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n", fpga_limit * 100000);
> +}
> +
> +static ssize_t power1_ltr_show(struct device *dev,
> +                              struct device_attribute *attr, char *buf)
> +{
> +       struct dfl_feature *feature = dev_get_drvdata(dev);
> +       u64 v;
> +
> +       v = readq(feature->ioaddr + FME_PWR_STATUS);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(FME_LATENCY_TOLERANCE, v));
> +}
> +
> +static DEVICE_ATTR_RO(power1_cap_status);
> +static DEVICE_ATTR_RO(power1_crit_status);
> +static DEVICE_ATTR_RO(power1_xeon_limit);
> +static DEVICE_ATTR_RO(power1_fpga_limit);
> +static DEVICE_ATTR_RO(power1_ltr);
> +
> +static struct attribute *power_extra_attrs[] = {
> +       &dev_attr_power1_cap_status.attr,
> +       &dev_attr_power1_crit_status.attr,
> +       &dev_attr_power1_xeon_limit.attr,
> +       &dev_attr_power1_fpga_limit.attr,
> +       &dev_attr_power1_ltr.attr,
> +       NULL
> +};
> +
> +ATTRIBUTE_GROUPS(power_extra);
> +
> +static int fme_power_mgmt_init(struct platform_device *pdev,
> +                              struct dfl_feature *feature)
> +{
> +       struct device *hwmon;
> +
> +       dev_dbg(&pdev->dev, "FME Power Management Init.\n");
> +
> +       hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> +                                                    "dfl_fme_power", feature,
> +                                                    &power_hwmon_chip_info,
> +                                                    power_extra_groups);
> +       if (IS_ERR(hwmon)) {
> +               dev_err(&pdev->dev, "Fail to register power hwmon\n");
> +               return PTR_ERR(hwmon);
> +       }
> +
> +       return 0;
> +}
> +
> +static void fme_power_mgmt_uinit(struct platform_device *pdev,
> +                                struct dfl_feature *feature)
> +{
> +       dev_dbg(&pdev->dev, "FME Power Management UInit.\n");
> +}
> +
> +static const struct dfl_feature_id fme_power_mgmt_id_table[] = {
> +       {.id = FME_FEATURE_ID_POWER_MGMT,},
> +       {0,}
> +};
> +
> +static const struct dfl_feature_ops fme_power_mgmt_ops = {
> +       .init = fme_power_mgmt_init,
> +       .uinit = fme_power_mgmt_uinit,
> +};
> +
>  static struct dfl_feature_driver fme_feature_drvs[] = {
>         {
>                 .id_table = fme_hdr_id_table,
> @@ -439,6 +682,10 @@ static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
>                 .ops = &fme_thermal_mgmt_ops,
>         },
>         {
> +               .id_table = fme_power_mgmt_id_table,
> +               .ops = &fme_power_mgmt_ops,
> +       },
> +       {
>                 .ops = NULL,
>         },
>  };
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 15/18] fpga: dfl: fme: add thermal management support
  2019-04-29  8:55 ` [PATCH v2 15/18] fpga: dfl: fme: add thermal management support Wu Hao
  2019-05-07 18:20   ` Alan Tull
@ 2019-05-07 18:30   ` Moritz Fischer
  2019-05-08  6:11     ` Wu Hao
  1 sibling, 1 reply; 42+ messages in thread
From: Moritz Fischer @ 2019-05-07 18:30 UTC (permalink / raw)
  To: Wu Hao
  Cc: atull, mdf, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Russ Weight, Xu Yilun, linux-hwmon, linux

Please for next round:

+CC linux-hwmon, Guenter etc ...

On Mon, Apr 29, 2019 at 04:55:48PM +0800, Wu Hao wrote:
> This patch adds support to thermal management private feature for DFL
> FPGA Management Engine (FME). This private feature driver registers
> a hwmon for thermal/temperature monitoring (hwmon temp1_input).
> If hardware automatic throttling is supported by this hardware, then
> driver also exposes sysfs interfaces under hwmon for thresholds
> (temp1_alarm/ crit/ emergency), threshold status (temp1_alarm_status/
> temp1_crit_status) and throttling policy (temp1_alarm_policy).
> 
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> ---
> v2: create a dfl_fme_thermal hwmon to expose thermal information.
>     move all sysfs interfaces under hwmon
> 	tempareture       --> hwmon temp1_input
> 	threshold1        --> hwmon temp1_alarm
> 	threshold2        --> hwmon temp1_crit
> 	trip_threshold    --> hwmon temp1_emergency
> 	threshold1_status --> hwmon temp1_alarm_status
> 	threshold2_status --> hwmon temp1_crit_status
> 	threshold1_policy --> hwmon temp1_alarm_policy
> ---
>  Documentation/ABI/testing/sysfs-platform-dfl-fme |  64 +++++++
>  drivers/fpga/Kconfig                             |   2 +-
>  drivers/fpga/dfl-fme-main.c                      | 212 +++++++++++++++++++++++
>  3 files changed, 277 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> index d1aa375..dfbd315 100644
> --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> @@ -44,3 +44,67 @@ Description:	Read-only. It returns socket_id to indicate which socket
>  		this FPGA belongs to, only valid for integrated solution.
>  		User only needs this information, in case standard numa node
>  		can't provide correct information.
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/name
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-Only. Read this file to get the name of hwmon device, it
> +		supports values:
> +		    'dfl_fme_thermal' - thermal hwmon device name
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-Only. It returns FPGA device temperature in millidegrees
> +		Celsius.
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-Only. It returns hardware threshold1 temperature in
> +		millidegrees Celsius. If temperature rises at or above this
> +		threshold, hardware starts 50% or 90% throttling (see
> +		'temp1_alarm_policy').
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-Only. It returns hardware threshold2 temperature in
> +		millidegrees Celsius. If temperature rises at or above this
> +		threshold, hardware starts 100% throttling.
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_emergency
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-Only. It returns hardware trip threshold temperature in
> +		millidegrees Celsius. If temperature rises at or above this
> +		threshold, a fatal event will be triggered to board management
> +		controller (BMC) to shutdown FPGA.
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_status
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-only. It returns 1 if temperature is currently at or above
> +		hardware threshold1 (see 'temp1_alarm'), otherwise 0.
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit_status
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-only. It returns 1 if temperature is currently at or above
> +		hardware threshold2 (see 'temp1_crit'), otherwise 0.
> +
> +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_policy
> +Date:		April 2019
> +KernelVersion:	5.2
> +Contact:	Wu Hao <hao.wu@intel.com>
> +Description:	Read-Only. Read this file to get the policy of hardware threshold1
> +		(see 'temp1_alarm'). It only supports two values (policies):
> +		    0 - AP2 state (90% throttling)
> +		    1 - AP1 state (50% throttling)
> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> index c20445b..a6d7588 100644
> --- a/drivers/fpga/Kconfig
> +++ b/drivers/fpga/Kconfig
> @@ -154,7 +154,7 @@ config FPGA_DFL
>  
>  config FPGA_DFL_FME
>  	tristate "FPGA DFL FME Driver"
> -	depends on FPGA_DFL
> +	depends on FPGA_DFL && HWMON
>  	help
>  	  The FPGA Management Engine (FME) is a feature device implemented
>  	  under Device Feature List (DFL) framework. Select this option to
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index 8339ee8..b9a68b8 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -14,6 +14,8 @@
>   *   Henry Mitchel <henry.mitchel@intel.com>
>   */
>  
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
>  #include <linux/uaccess.h>
> @@ -217,6 +219,212 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
>  	.ioctl = fme_hdr_ioctl,
>  };
>  
> +#define FME_THERM_THRESHOLD	0x8
> +#define TEMP_THRESHOLD1		GENMASK_ULL(6, 0)
> +#define TEMP_THRESHOLD1_EN	BIT_ULL(7)
> +#define TEMP_THRESHOLD2		GENMASK_ULL(14, 8)
> +#define TEMP_THRESHOLD2_EN	BIT_ULL(15)
> +#define TRIP_THRESHOLD		GENMASK_ULL(30, 24)
> +#define TEMP_THRESHOLD1_STATUS	BIT_ULL(32)		/* threshold1 reached */
> +#define TEMP_THRESHOLD2_STATUS	BIT_ULL(33)		/* threshold2 reached */
> +/* threshold1 policy: 0 - AP2 (90% throttle) / 1 - AP1 (50% throttle) */
> +#define TEMP_THRESHOLD1_POLICY	BIT_ULL(44)
> +
> +#define FME_THERM_RDSENSOR_FMT1	0x10
> +#define FPGA_TEMPERATURE	GENMASK_ULL(6, 0)
> +
> +#define FME_THERM_CAP		0x20
> +#define THERM_NO_THROTTLE	BIT_ULL(0)
> +
> +#define MD_PRE_DEG
> +
> +static bool fme_thermal_throttle_support(void __iomem *base)
> +{
> +	u64 v = readq(base + FME_THERM_CAP);
> +
> +	return FIELD_GET(THERM_NO_THROTTLE, v) ? false : true;
> +}
> +
> +static umode_t thermal_hwmon_attrs_visible(const void *drvdata,
> +					   enum hwmon_sensor_types type,
> +					   u32 attr, int channel)
> +{
> +	const struct dfl_feature *feature = drvdata;
> +
> +	/* temperature is always supported, and check hardware cap for others */
> +	if (attr == hwmon_temp_input)
> +		return 0444;
> +
> +	return fme_thermal_throttle_support(feature->ioaddr) ? 0444 : 0;
> +}
> +
> +static int thermal_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> +			      u32 attr, int channel, long *val)
> +{
> +	struct dfl_feature *feature = dev_get_drvdata(dev);
> +	u64 v;
> +
> +	switch (attr) {
> +	case hwmon_temp_input:
> +		v = readq(feature->ioaddr + FME_THERM_RDSENSOR_FMT1);
> +		*val = (long)(FIELD_GET(FPGA_TEMPERATURE, v) * 1000);
> +		break;
> +	case hwmon_temp_alarm:
> +		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +		*val = (long)(FIELD_GET(TEMP_THRESHOLD1, v) * 1000);
> +		break;
> +	case hwmon_temp_crit:
> +		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +		*val = (long)(FIELD_GET(TEMP_THRESHOLD2, v) * 1000);
> +		break;
> +	case hwmon_temp_emergency:
> +		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +		*val = (long)(FIELD_GET(TRIP_THRESHOLD, v) * 1000);
> +		break;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct hwmon_ops thermal_hwmon_ops = {
> +	.is_visible = thermal_hwmon_attrs_visible,
> +	.read = thermal_hwmon_read,
> +};
> +
> +static const u32 thermal_hwmon_temp_config[] = {
> +	HWMON_T_INPUT | HWMON_T_ALARM | HWMON_T_CRIT | HWMON_T_EMERGENCY,
> +	0
> +};
> +
> +static const struct hwmon_channel_info hwmon_temp_info = {
> +	.type = hwmon_temp,
> +	.config = thermal_hwmon_temp_config,
> +};
> +
> +static const struct hwmon_channel_info *thermal_hwmon_info[] = {
> +	&hwmon_temp_info,
> +	NULL
> +};
> +
> +static const struct hwmon_chip_info thermal_hwmon_chip_info = {
> +	.ops = &thermal_hwmon_ops,
> +	.info = thermal_hwmon_info,
> +};
> +
> +static ssize_t temp1_alarm_status_show(struct device *dev,
> +				       struct device_attribute *attr, char *buf)
> +{
> +	struct dfl_feature *feature = dev_get_drvdata(dev);
> +	u64 v;
> +
> +	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +
> +	return scnprintf(buf, PAGE_SIZE, "%u\n",
> +			 (unsigned int)FIELD_GET(TEMP_THRESHOLD1_STATUS, v));
> +}
> +
> +static ssize_t temp1_crit_status_show(struct device *dev,
> +				      struct device_attribute *attr, char *buf)
> +{
> +	struct dfl_feature *feature = dev_get_drvdata(dev);
> +	u64 v;
> +
> +	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +
> +	return scnprintf(buf, PAGE_SIZE, "%u\n",
> +			 (unsigned int)FIELD_GET(TEMP_THRESHOLD2_STATUS, v));
> +}
> +
> +static ssize_t temp1_alarm_policy_show(struct device *dev,
> +				       struct device_attribute *attr, char *buf)
> +{
> +	struct dfl_feature *feature = dev_get_drvdata(dev);
> +	u64 v;
> +
> +	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> +
> +	return scnprintf(buf, PAGE_SIZE, "%u\n",
> +			 (unsigned int)FIELD_GET(TEMP_THRESHOLD1_POLICY, v));
> +}
> +
> +static DEVICE_ATTR_RO(temp1_alarm_status);
> +static DEVICE_ATTR_RO(temp1_crit_status);
> +static DEVICE_ATTR_RO(temp1_alarm_policy);
> +
> +static struct attribute *thermal_extra_attrs[] = {
> +	&dev_attr_temp1_alarm_status.attr,
> +	&dev_attr_temp1_crit_status.attr,
> +	&dev_attr_temp1_alarm_policy.attr,
> +	NULL,
> +};
> +
> +static umode_t thermal_extra_attrs_visible(struct kobject *kobj,
> +					   struct attribute *attr, int index)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct dfl_feature *feature = dev_get_drvdata(dev);
> +
> +	return fme_thermal_throttle_support(feature->ioaddr) ? attr->mode : 0;
> +}
> +
> +static const struct attribute_group thermal_extra_group = {
> +	.attrs		= thermal_extra_attrs,
> +	.is_visible	= thermal_extra_attrs_visible,
> +};
> +__ATTRIBUTE_GROUPS(thermal_extra);
> +
> +static int fme_thermal_mgmt_init(struct platform_device *pdev,
> +				 struct dfl_feature *feature)
> +{
> +	struct device *hwmon;
> +
> +	dev_dbg(&pdev->dev, "FME Thermal Management Init.\n");
> +
> +	/*
> +	 * create hwmon to allow userspace monitoring temperature and other
> +	 * threshold information.
> +	 *
> +	 * temp1_alarm     -> hardware threshold 1 -> 50% or 90% throttling
> +	 * temp1_crit      -> hardware threshold 2 -> 100% throttling
> +	 * temp1_emergency -> hardware trip_threshold to shutdown FPGA
> +	 *
> +	 * create device specific sysfs interfaces, e.g. read temp1_alarm_policy
> +	 * to understand the actual hardware throttling action (50% vs 90%).
> +	 *
> +	 * If hardware doesn't support automatic throttling per thresholds,
> +	 * then all above sysfs interfaces are not visible except temp1_input
> +	 * for temperature.
> +	 */
> +	hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> +						     "dfl_fme_thermal", feature,
> +						     &thermal_hwmon_chip_info,
> +						     thermal_extra_groups);
> +	if (IS_ERR(hwmon)) {
> +		dev_err(&pdev->dev, "Fail to register thermal hwmon\n");
> +		return PTR_ERR(hwmon);
> +	}
> +
> +	return 0;
> +}
> +
> +static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> +				   struct dfl_feature *feature)
> +{
> +	dev_dbg(&pdev->dev, "FME Thermal Management UInit.\n");
> +}
> +
> +static const struct dfl_feature_id fme_thermal_mgmt_id_table[] = {
> +	{.id = FME_FEATURE_ID_THERMAL_MGMT,},
> +	{0,}
> +};
> +
> +static const struct dfl_feature_ops fme_thermal_mgmt_ops = {
> +	.init = fme_thermal_mgmt_init,
> +	.uinit = fme_thermal_mgmt_uinit,
> +};
> +
>  static struct dfl_feature_driver fme_feature_drvs[] = {
>  	{
>  		.id_table = fme_hdr_id_table,
> @@ -227,6 +435,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
>  		.ops = &fme_pr_mgmt_ops,
>  	},
>  	{
> +		.id_table = fme_thermal_mgmt_id_table,
> +		.ops = &fme_thermal_mgmt_ops,
> +	},
> +	{
>  		.ops = NULL,
>  	},
>  };
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 15/18] fpga: dfl: fme: add thermal management support
  2019-05-07 18:20   ` Alan Tull
@ 2019-05-07 18:35     ` Guenter Roeck
  2019-05-08  6:07       ` Wu Hao
  0 siblings, 1 reply; 42+ messages in thread
From: Guenter Roeck @ 2019-05-07 18:35 UTC (permalink / raw)
  To: Alan Tull
  Cc: Wu Hao, Moritz Fischer, linux-fpga, linux-kernel, linux-api,
	Luwei Kang, Russ Weight, Xu Yilun, Jean Delvare,
	Linux HWMON List

On Tue, May 07, 2019 at 01:20:52PM -0500, Alan Tull wrote:
> On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> + The hwmon people
> 
> >
> > This patch adds support to thermal management private feature for DFL
> > FPGA Management Engine (FME). This private feature driver registers
> > a hwmon for thermal/temperature monitoring (hwmon temp1_input).
> > If hardware automatic throttling is supported by this hardware, then
> > driver also exposes sysfs interfaces under hwmon for thresholds
> > (temp1_alarm/ crit/ emergency), threshold status (temp1_alarm_status/
> > temp1_crit_status) and throttling policy (temp1_alarm_policy).
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > ---
> > v2: create a dfl_fme_thermal hwmon to expose thermal information.
> >     move all sysfs interfaces under hwmon
> >         tempareture       --> hwmon temp1_input
> >         threshold1        --> hwmon temp1_alarm
> >         threshold2        --> hwmon temp1_crit
> >         trip_threshold    --> hwmon temp1_emergency
> >         threshold1_status --> hwmon temp1_alarm_status
> >         threshold2_status --> hwmon temp1_crit_status
> >         threshold1_policy --> hwmon temp1_alarm_policy

You should not write a hwmon driver if you don't want to follow the ABI.
The implementation will only confuse the sensors command, so what exactly
is the point ?

More on that below.

Guenter

> > ---
> >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  64 +++++++
> >  drivers/fpga/Kconfig                             |   2 +-
> >  drivers/fpga/dfl-fme-main.c                      | 212 +++++++++++++++++++++++
> >  3 files changed, 277 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > index d1aa375..dfbd315 100644
> > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > @@ -44,3 +44,67 @@ Description: Read-only. It returns socket_id to indicate which socket
> >                 this FPGA belongs to, only valid for integrated solution.
> >                 User only needs this information, in case standard numa node
> >                 can't provide correct information.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/name
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read this file to get the name of hwmon device, it
> > +               supports values:
> > +                   'dfl_fme_thermal' - thermal hwmon device name
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns FPGA device temperature in millidegrees
> > +               Celsius.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns hardware threshold1 temperature in
> > +               millidegrees Celsius. If temperature rises at or above this
> > +               threshold, hardware starts 50% or 90% throttling (see
> > +               'temp1_alarm_policy').
> > +

This does not follow the ABI. temp1_alarm is the alarm status, not the alarm
temperature. The ABI attribute name would be temp1_max.

> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns hardware threshold2 temperature in
> > +               millidegrees Celsius. If temperature rises at or above this
> > +               threshold, hardware starts 100% throttling.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_emergency
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns hardware trip threshold temperature in
> > +               millidegrees Celsius. If temperature rises at or above this
> > +               threshold, a fatal event will be triggered to board management
> > +               controller (BMC) to shutdown FPGA.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_status
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-only. It returns 1 if temperature is currently at or above
> > +               hardware threshold1 (see 'temp1_alarm'), otherwise 0.
> > +

Why not follow the ABI and use temp1_alarm ?

> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit_status
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-only. It returns 1 if temperature is currently at or above
> > +               hardware threshold2 (see 'temp1_crit'), otherwise 0.
> > +

Why not follow the ABI and use temp1_crit_alarm ?

> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_policy
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read this file to get the policy of hardware threshold1
> > +               (see 'temp1_alarm'). It only supports two values (policies):
> > +                   0 - AP2 state (90% throttling)
> > +                   1 - AP1 state (50% throttling)
> > diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> > index c20445b..a6d7588 100644
> > --- a/drivers/fpga/Kconfig
> > +++ b/drivers/fpga/Kconfig
> > @@ -154,7 +154,7 @@ config FPGA_DFL
> >
> >  config FPGA_DFL_FME
> >         tristate "FPGA DFL FME Driver"
> > -       depends on FPGA_DFL
> > +       depends on FPGA_DFL && HWMON
> >         help
> >           The FPGA Management Engine (FME) is a feature device implemented
> >           under Device Feature List (DFL) framework. Select this option to
> > diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> > index 8339ee8..b9a68b8 100644
> > --- a/drivers/fpga/dfl-fme-main.c
> > +++ b/drivers/fpga/dfl-fme-main.c
> > @@ -14,6 +14,8 @@
> >   *   Henry Mitchel <henry.mitchel@intel.com>
> >   */
> >
> > +#include <linux/hwmon.h>
> > +#include <linux/hwmon-sysfs.h>
> >  #include <linux/kernel.h>
> >  #include <linux/module.h>
> >  #include <linux/uaccess.h>
> > @@ -217,6 +219,212 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
> >         .ioctl = fme_hdr_ioctl,
> >  };
> >
> > +#define FME_THERM_THRESHOLD    0x8
> > +#define TEMP_THRESHOLD1                GENMASK_ULL(6, 0)
> > +#define TEMP_THRESHOLD1_EN     BIT_ULL(7)
> > +#define TEMP_THRESHOLD2                GENMASK_ULL(14, 8)
> > +#define TEMP_THRESHOLD2_EN     BIT_ULL(15)
> > +#define TRIP_THRESHOLD         GENMASK_ULL(30, 24)
> > +#define TEMP_THRESHOLD1_STATUS BIT_ULL(32)             /* threshold1 reached */
> > +#define TEMP_THRESHOLD2_STATUS BIT_ULL(33)             /* threshold2 reached */
> > +/* threshold1 policy: 0 - AP2 (90% throttle) / 1 - AP1 (50% throttle) */
> > +#define TEMP_THRESHOLD1_POLICY BIT_ULL(44)
> > +
> > +#define FME_THERM_RDSENSOR_FMT1        0x10
> > +#define FPGA_TEMPERATURE       GENMASK_ULL(6, 0)
> > +
> > +#define FME_THERM_CAP          0x20
> > +#define THERM_NO_THROTTLE      BIT_ULL(0)
> > +
> > +#define MD_PRE_DEG
> > +
> > +static bool fme_thermal_throttle_support(void __iomem *base)
> > +{
> > +       u64 v = readq(base + FME_THERM_CAP);
> > +
> > +       return FIELD_GET(THERM_NO_THROTTLE, v) ? false : true;
> > +}
> > +
> > +static umode_t thermal_hwmon_attrs_visible(const void *drvdata,
> > +                                          enum hwmon_sensor_types type,
> > +                                          u32 attr, int channel)
> > +{
> > +       const struct dfl_feature *feature = drvdata;
> > +
> > +       /* temperature is always supported, and check hardware cap for others */
> > +       if (attr == hwmon_temp_input)
> > +               return 0444;
> > +
> > +       return fme_thermal_throttle_support(feature->ioaddr) ? 0444 : 0;
> > +}
> > +
> > +static int thermal_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> > +                             u32 attr, int channel, long *val)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       switch (attr) {
> > +       case hwmon_temp_input:
> > +               v = readq(feature->ioaddr + FME_THERM_RDSENSOR_FMT1);
> > +               *val = (long)(FIELD_GET(FPGA_TEMPERATURE, v) * 1000);
> > +               break;
> > +       case hwmon_temp_alarm:
> > +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +               *val = (long)(FIELD_GET(TEMP_THRESHOLD1, v) * 1000);

This is supposed to return 0 or 1.

> > +               break;
> > +       case hwmon_temp_crit:
> > +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +               *val = (long)(FIELD_GET(TEMP_THRESHOLD2, v) * 1000);
> > +               break;
> > +       case hwmon_temp_emergency:
> > +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +               *val = (long)(FIELD_GET(TRIP_THRESHOLD, v) * 1000);
> > +               break;
> > +       default:
> > +               return -EOPNOTSUPP;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct hwmon_ops thermal_hwmon_ops = {
> > +       .is_visible = thermal_hwmon_attrs_visible,
> > +       .read = thermal_hwmon_read,
> > +};
> > +
> > +static const u32 thermal_hwmon_temp_config[] = {
> > +       HWMON_T_INPUT | HWMON_T_ALARM | HWMON_T_CRIT | HWMON_T_EMERGENCY,
> > +       0
> > +};
> > +
> > +static const struct hwmon_channel_info hwmon_temp_info = {
> > +       .type = hwmon_temp,
> > +       .config = thermal_hwmon_temp_config,
> > +};
> > +
> > +static const struct hwmon_channel_info *thermal_hwmon_info[] = {
> > +       &hwmon_temp_info,
> > +       NULL
> > +};
> > +
> > +static const struct hwmon_chip_info thermal_hwmon_chip_info = {
> > +       .ops = &thermal_hwmon_ops,
> > +       .info = thermal_hwmon_info,
> > +};
> > +
> > +static ssize_t temp1_alarm_status_show(struct device *dev,
> > +                                      struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD1_STATUS, v));
> > +}
> > +
> > +static ssize_t temp1_crit_status_show(struct device *dev,
> > +                                     struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD2_STATUS, v));
> > +}
> > +
> > +static ssize_t temp1_alarm_policy_show(struct device *dev,
> > +                                      struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD1_POLICY, v));
> > +}
> > +
> > +static DEVICE_ATTR_RO(temp1_alarm_status);
> > +static DEVICE_ATTR_RO(temp1_crit_status);
> > +static DEVICE_ATTR_RO(temp1_alarm_policy);
> > +
> > +static struct attribute *thermal_extra_attrs[] = {
> > +       &dev_attr_temp1_alarm_status.attr,
> > +       &dev_attr_temp1_crit_status.attr,

Why not use standard attributes for the above ?

> > +       &dev_attr_temp1_alarm_policy.attr,
> > +       NULL,
> > +};
> > +
> > +static umode_t thermal_extra_attrs_visible(struct kobject *kobj,
> > +                                          struct attribute *attr, int index)
> > +{
> > +       struct device *dev = kobj_to_dev(kobj);
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +
> > +       return fme_thermal_throttle_support(feature->ioaddr) ? attr->mode : 0;
> > +}
> > +
> > +static const struct attribute_group thermal_extra_group = {
> > +       .attrs          = thermal_extra_attrs,
> > +       .is_visible     = thermal_extra_attrs_visible,
> > +};
> > +__ATTRIBUTE_GROUPS(thermal_extra);
> > +
> > +static int fme_thermal_mgmt_init(struct platform_device *pdev,
> > +                                struct dfl_feature *feature)
> > +{
> > +       struct device *hwmon;
> > +
> > +       dev_dbg(&pdev->dev, "FME Thermal Management Init.\n");
> > +
> > +       /*
> > +        * create hwmon to allow userspace monitoring temperature and other
> > +        * threshold information.
> > +        *
> > +        * temp1_alarm     -> hardware threshold 1 -> 50% or 90% throttling
> > +        * temp1_crit      -> hardware threshold 2 -> 100% throttling
> > +        * temp1_emergency -> hardware trip_threshold to shutdown FPGA
> > +        *
> > +        * create device specific sysfs interfaces, e.g. read temp1_alarm_policy
> > +        * to understand the actual hardware throttling action (50% vs 90%).
> > +        *
> > +        * If hardware doesn't support automatic throttling per thresholds,
> > +        * then all above sysfs interfaces are not visible except temp1_input
> > +        * for temperature.
> > +        */
> > +       hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> > +                                                    "dfl_fme_thermal", feature,
> > +                                                    &thermal_hwmon_chip_info,
> > +                                                    thermal_extra_groups);
> > +       if (IS_ERR(hwmon)) {
> > +               dev_err(&pdev->dev, "Fail to register thermal hwmon\n");
> > +               return PTR_ERR(hwmon);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> > +                                  struct dfl_feature *feature)
> > +{
> > +       dev_dbg(&pdev->dev, "FME Thermal Management UInit.\n");
> > +}
> > +
> > +static const struct dfl_feature_id fme_thermal_mgmt_id_table[] = {
> > +       {.id = FME_FEATURE_ID_THERMAL_MGMT,},
> > +       {0,}
> > +};
> > +
> > +static const struct dfl_feature_ops fme_thermal_mgmt_ops = {
> > +       .init = fme_thermal_mgmt_init,
> > +       .uinit = fme_thermal_mgmt_uinit,
> > +};
> > +
> >  static struct dfl_feature_driver fme_feature_drvs[] = {
> >         {
> >                 .id_table = fme_hdr_id_table,
> > @@ -227,6 +435,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
> >                 .ops = &fme_pr_mgmt_ops,
> >         },
> >         {
> > +               .id_table = fme_thermal_mgmt_id_table,
> > +               .ops = &fme_thermal_mgmt_ops,
> > +       },
> > +       {
> >                 .ops = NULL,
> >         },
> >  };
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 16/18] fpga: dfl: fme: add power management support
  2019-05-07 18:23   ` Alan Tull
@ 2019-05-07 18:36     ` Guenter Roeck
  0 siblings, 0 replies; 42+ messages in thread
From: Guenter Roeck @ 2019-05-07 18:36 UTC (permalink / raw)
  To: Alan Tull
  Cc: Wu Hao, Moritz Fischer, linux-fpga, linux-kernel, linux-api,
	Luwei Kang, Xu Yilun, Jean Delvare, Linux HWMON List

On Tue, May 07, 2019 at 01:23:33PM -0500, Alan Tull wrote:
> On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> + hwmon folks
> 
> >
> > This patch adds support for power management private feature under
> > FPGA Management Engine (FME). This private feature driver registers
> > a hwmon for power (power1_input), thresholds information, e.g.
> > (power1_cap / crit) and also read-only sysfs interfaces for other
> > power management information. For configuration, user could write
> > threshold values via above power1_cap / crit sysfs interface
> > under hwmon too.
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > ---
> > v2: create a dfl_fme_power hwmon to expose power sysfs interfaces.
> >     move all sysfs interfaces under hwmon
> >         consumed          --> hwmon power1_input
> >         threshold1        --> hwmon power1_cap
> >         threshold2        --> hwmon power1_crit
> >         threshold1_status --> hwmon power1_cap_status
> >         threshold2_status --> hwmon power1_crit_status
> >         xeon_limit        --> hwmon power1_xeon_limit
> >         fpga_limit        --> hwmon power1_fpga_limit
> >         ltr               --> hwmon power1_ltr

Same response as before.

Guenter

> > ---
> >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  67 ++++++
> >  drivers/fpga/dfl-fme-main.c                      | 247 +++++++++++++++++++++++
> >  2 files changed, 314 insertions(+)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > index dfbd315..e2ba92d 100644
> > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > @@ -52,6 +52,7 @@ Contact:      Wu Hao <hao.wu@intel.com>
> >  Description:   Read-Only. Read this file to get the name of hwmon device, it
> >                 supports values:
> >                     'dfl_fme_thermal' - thermal hwmon device name
> > +                   'dfl_fme_power'   - power hwmon device name
> >
> >  What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> >  Date:          April 2019
> > @@ -108,3 +109,69 @@ Description:       Read-Only. Read this file to get the policy of hardware threshold1
> >                 (see 'temp1_alarm'). It only supports two values (policies):
> >                     0 - AP2 state (90% throttling)
> >                     1 - AP1 state (50% throttling)
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_input
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns current FPGA power consumption in uW.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_cap
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file to get current hardware power
> > +               threshold1 in uW. If power consumption rises at or above
> > +               this threshold, hardware starts 50% throttling.
> > +               Write this file to set current hardware power threshold1 in uW.
> > +               As hardware only accepts values in Watts, so input value will
> > +               be round down per Watts (< 1 watts part will be discarded).
> > +               Write fails with -EINVAL if input parsing fails or input isn't
> > +               in the valid range (0 - 127000000 uW).
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file to get current hardware power
> > +               threshold2 in uW. If power consumption rises at or above
> > +               this threshold, hardware starts 90% throttling.
> > +               Write this file to set current hardware power threshold2 in uW.
> > +               As hardware only accepts values in Watts, so input value will
> > +               be round down per Watts (< 1 watts part will be discarded).
> > +               Write fails with -EINVAL if input parsing fails or input isn't
> > +               in the valid range (0 - 127000000 uW).
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_cap_status
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-only. It returns 1 if power consumption is currently at or
> > +               above hardware threshold1 (see 'power1_cap'), otherwise 0.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit_status
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-only. It returns 1 if power consumption is currently at or
> > +               above hardware threshold2 (see 'power1_crit'), otherwise 0.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_xeon_limit
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns power limit for XEON in uW.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_fpga_limit
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. It returns power limit for FPGA in uW.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_ltr
> > +Date:          April 2019
> > +KernelVersion: 5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-only. Read this file to get current Latency Tolerance
> > +               Reporting (ltr) value. This ltr impacts the CPU low power
> > +               state in integrated solution.
> > diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> > index b9a68b8..7005316 100644
> > --- a/drivers/fpga/dfl-fme-main.c
> > +++ b/drivers/fpga/dfl-fme-main.c
> > @@ -425,6 +425,249 @@ static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> >         .uinit = fme_thermal_mgmt_uinit,
> >  };
> >
> > +#define FME_PWR_STATUS         0x8
> > +#define FME_LATENCY_TOLERANCE  BIT_ULL(18)
> > +#define PWR_CONSUMED           GENMASK_ULL(17, 0)
> > +
> > +#define FME_PWR_THRESHOLD      0x10
> > +#define PWR_THRESHOLD1         GENMASK_ULL(6, 0)       /* in Watts */
> > +#define PWR_THRESHOLD2         GENMASK_ULL(14, 8)      /* in Watts */
> > +#define PWR_THRESHOLD_MAX      0x7f                    /* in Watts */
> > +#define PWR_THRESHOLD1_STATUS  BIT_ULL(16)
> > +#define PWR_THRESHOLD2_STATUS  BIT_ULL(17)
> > +
> > +#define FME_PWR_XEON_LIMIT     0x18
> > +#define XEON_PWR_LIMIT         GENMASK_ULL(14, 0)      /* in 0.1 Watts */
> > +#define XEON_PWR_EN            BIT_ULL(15)
> > +#define FME_PWR_FPGA_LIMIT     0x20
> > +#define FPGA_PWR_LIMIT         GENMASK_ULL(14, 0)      /* in 0.1 Watts */
> > +#define FPGA_PWR_EN            BIT_ULL(15)
> > +
> > +#define PWR_THRESHOLD_MAX_IN_UW (PWR_THRESHOLD_MAX * 1000000)
> > +
> > +static int power_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> > +                           u32 attr, int channel, long *val)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       switch (attr) {
> > +       case hwmon_power_input:
> > +               v = readq(feature->ioaddr + FME_PWR_STATUS);
> > +               *val = (long)(FIELD_GET(PWR_CONSUMED, v) * 1000000);
> > +               break;
> > +       case hwmon_power_cap:
> > +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> > +               *val = (long)(FIELD_GET(PWR_THRESHOLD1, v) * 1000000);
> > +               break;
> > +       case hwmon_power_crit:
> > +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> > +               *val = (long)(FIELD_GET(PWR_THRESHOLD2, v) * 1000000);
> > +               break;
> > +       default:
> > +               return -EOPNOTSUPP;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int power_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
> > +                            u32 attr, int channel, long val)
> > +{
> > +       struct dfl_feature_platform_data *pdata = dev_get_platdata(dev->parent);
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       int ret = 0;
> > +       u64 v;
> > +
> > +       if (val < 0 || val > PWR_THRESHOLD_MAX_IN_UW)
> > +               return -EINVAL;
> > +
> > +       val = val / 1000000;
> > +
> > +       mutex_lock(&pdata->lock);
> > +
> > +       switch (attr) {
> > +       case hwmon_power_cap:
> > +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> > +               v &= ~PWR_THRESHOLD1;
> > +               v |= FIELD_PREP(PWR_THRESHOLD1, val);
> > +               writeq(v, feature->ioaddr + FME_PWR_THRESHOLD);
> > +               break;
> > +       case hwmon_power_crit:
> > +               v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> > +               v &= ~PWR_THRESHOLD2;
> > +               v |= FIELD_PREP(PWR_THRESHOLD2, val);
> > +               writeq(v, feature->ioaddr + FME_PWR_THRESHOLD);
> > +               break;
> > +       default:
> > +               ret = -EOPNOTSUPP;
> > +               break;
> > +       }
> > +
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return ret;
> > +}
> > +
> > +static umode_t power_hwmon_attrs_visible(const void *drvdata,
> > +                                        enum hwmon_sensor_types type,
> > +                                        u32 attr, int channel)
> > +{
> > +       switch (attr) {
> > +       case hwmon_power_input:
> > +               return 0444;
> > +       case hwmon_power_cap:
> > +       case hwmon_power_crit:
> > +               return 0644;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static const u32 power_hwmon_config[] = {
> > +       HWMON_P_INPUT | HWMON_P_CAP | HWMON_P_CRIT,
> > +       0
> > +};
> > +
> > +static const struct hwmon_channel_info hwmon_pwr_info = {
> > +       .type = hwmon_power,
> > +       .config = power_hwmon_config,
> > +};
> > +
> > +static const struct hwmon_channel_info *power_hwmon_info[] = {
> > +       &hwmon_pwr_info,
> > +       NULL
> > +};
> > +
> > +static const struct hwmon_ops power_hwmon_ops = {
> > +       .is_visible = power_hwmon_attrs_visible,
> > +       .read = power_hwmon_read,
> > +       .write = power_hwmon_write,
> > +};
> > +
> > +static const struct hwmon_chip_info power_hwmon_chip_info = {
> > +       .ops = &power_hwmon_ops,
> > +       .info = power_hwmon_info,
> > +};
> > +
> > +static ssize_t power1_cap_status_show(struct device *dev,
> > +                                     struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(PWR_THRESHOLD1_STATUS, v));
> > +}
> > +
> > +static ssize_t power1_crit_status_show(struct device *dev,
> > +                                      struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_PWR_THRESHOLD);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(PWR_THRESHOLD2_STATUS, v));
> > +}
> > +
> > +static ssize_t power1_xeon_limit_show(struct device *dev,
> > +                                     struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u16 xeon_limit = 0;
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_PWR_XEON_LIMIT);
> > +
> > +       if (FIELD_GET(XEON_PWR_EN, v))
> > +               xeon_limit = FIELD_GET(XEON_PWR_LIMIT, v);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n", xeon_limit * 100000);
> > +}
> > +
> > +static ssize_t power1_fpga_limit_show(struct device *dev,
> > +                                     struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u16 fpga_limit = 0;
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_PWR_FPGA_LIMIT);
> > +
> > +       if (FIELD_GET(FPGA_PWR_EN, v))
> > +               fpga_limit = FIELD_GET(FPGA_PWR_LIMIT, v);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n", fpga_limit * 100000);
> > +}
> > +
> > +static ssize_t power1_ltr_show(struct device *dev,
> > +                              struct device_attribute *attr, char *buf)
> > +{
> > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > +       u64 v;
> > +
> > +       v = readq(feature->ioaddr + FME_PWR_STATUS);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(FME_LATENCY_TOLERANCE, v));
> > +}
> > +
> > +static DEVICE_ATTR_RO(power1_cap_status);
> > +static DEVICE_ATTR_RO(power1_crit_status);
> > +static DEVICE_ATTR_RO(power1_xeon_limit);
> > +static DEVICE_ATTR_RO(power1_fpga_limit);
> > +static DEVICE_ATTR_RO(power1_ltr);
> > +
> > +static struct attribute *power_extra_attrs[] = {
> > +       &dev_attr_power1_cap_status.attr,
> > +       &dev_attr_power1_crit_status.attr,
> > +       &dev_attr_power1_xeon_limit.attr,
> > +       &dev_attr_power1_fpga_limit.attr,
> > +       &dev_attr_power1_ltr.attr,
> > +       NULL
> > +};
> > +
> > +ATTRIBUTE_GROUPS(power_extra);
> > +
> > +static int fme_power_mgmt_init(struct platform_device *pdev,
> > +                              struct dfl_feature *feature)
> > +{
> > +       struct device *hwmon;
> > +
> > +       dev_dbg(&pdev->dev, "FME Power Management Init.\n");
> > +
> > +       hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> > +                                                    "dfl_fme_power", feature,
> > +                                                    &power_hwmon_chip_info,
> > +                                                    power_extra_groups);
> > +       if (IS_ERR(hwmon)) {
> > +               dev_err(&pdev->dev, "Fail to register power hwmon\n");
> > +               return PTR_ERR(hwmon);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void fme_power_mgmt_uinit(struct platform_device *pdev,
> > +                                struct dfl_feature *feature)
> > +{
> > +       dev_dbg(&pdev->dev, "FME Power Management UInit.\n");
> > +}
> > +
> > +static const struct dfl_feature_id fme_power_mgmt_id_table[] = {
> > +       {.id = FME_FEATURE_ID_POWER_MGMT,},
> > +       {0,}
> > +};
> > +
> > +static const struct dfl_feature_ops fme_power_mgmt_ops = {
> > +       .init = fme_power_mgmt_init,
> > +       .uinit = fme_power_mgmt_uinit,
> > +};
> > +
> >  static struct dfl_feature_driver fme_feature_drvs[] = {
> >         {
> >                 .id_table = fme_hdr_id_table,
> > @@ -439,6 +682,10 @@ static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> >                 .ops = &fme_thermal_mgmt_ops,
> >         },
> >         {
> > +               .id_table = fme_power_mgmt_id_table,
> > +               .ops = &fme_power_mgmt_ops,
> > +       },
> > +       {
> >                 .ops = NULL,
> >         },
> >  };
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 15/18] fpga: dfl: fme: add thermal management support
  2019-05-07 18:35     ` Guenter Roeck
@ 2019-05-08  6:07       ` Wu Hao
  0 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-05-08  6:07 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alan Tull, Moritz Fischer, linux-fpga, linux-kernel, linux-api,
	Luwei Kang, Russ Weight, Xu Yilun, Jean Delvare,
	Linux HWMON List

On Tue, May 07, 2019 at 11:35:36AM -0700, Guenter Roeck wrote:
> On Tue, May 07, 2019 at 01:20:52PM -0500, Alan Tull wrote:
> > On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:
> > 
> > + The hwmon people
> > 
> > >
> > > This patch adds support to thermal management private feature for DFL
> > > FPGA Management Engine (FME). This private feature driver registers
> > > a hwmon for thermal/temperature monitoring (hwmon temp1_input).
> > > If hardware automatic throttling is supported by this hardware, then
> > > driver also exposes sysfs interfaces under hwmon for thresholds
> > > (temp1_alarm/ crit/ emergency), threshold status (temp1_alarm_status/
> > > temp1_crit_status) and throttling policy (temp1_alarm_policy).
> > >
> > > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > > ---
> > > v2: create a dfl_fme_thermal hwmon to expose thermal information.
> > >     move all sysfs interfaces under hwmon
> > >         tempareture       --> hwmon temp1_input
> > >         threshold1        --> hwmon temp1_alarm
> > >         threshold2        --> hwmon temp1_crit
> > >         trip_threshold    --> hwmon temp1_emergency
> > >         threshold1_status --> hwmon temp1_alarm_status
> > >         threshold2_status --> hwmon temp1_crit_status
> > >         threshold1_policy --> hwmon temp1_alarm_policy
> 
> You should not write a hwmon driver if you don't want to follow the ABI.
> The implementation will only confuse the sensors command, so what exactly
> is the point ?
> 
> More on that below.

Hi Guenter,

Thanks a lot for the review comments. Yes, I should use the standard ABI of
the hwmon. I will fix them in the next version patch.

For thermal hwmon

 tempareture       --> hwmon temp1_input
 threshold1        --> hwmon temp1_alarm          ---> temp1_max
 threshold2        --> hwmon temp1_crit          
 trip_threshold    --> hwmon temp1_emergency
 threshold1_status --> hwmon temp1_alarm_status   ---> temp1_max_alarm
 threshold2_status --> hwmon temp1_crit_status    ---> temp1_crit_alarm
 threshold1_policy --> hwmon temp1_alarm_policy   ---> temp1_max_policy

and power hwmon

 consumed          --> hwmon power1_input
 threshold1        --> hwmon power1_cap           ---> power1_max
 threshold2        --> hwmon power1_crit
 threshold1_status --> hwmon power1_cap_status    ---> power1_max_alarm
 threshold2_status --> hwmon power1_crit_status   ---> power1_crit_alarm
 xeon_limit        --> hwmon power1_xeon_limit
 fpga_limit        --> hwmon power1_fpga_limit
 ltr               --> hwmon power1_ltr

switch to power1_max in power hwmon to make it aligned with thermal hwmon on
threshold1.

Thanks
Hao

> 
> Guenter
> 
> > > ---
> > >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  64 +++++++
> > >  drivers/fpga/Kconfig                             |   2 +-
> > >  drivers/fpga/dfl-fme-main.c                      | 212 +++++++++++++++++++++++
> > >  3 files changed, 277 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > index d1aa375..dfbd315 100644
> > > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > @@ -44,3 +44,67 @@ Description: Read-only. It returns socket_id to indicate which socket
> > >                 this FPGA belongs to, only valid for integrated solution.
> > >                 User only needs this information, in case standard numa node
> > >                 can't provide correct information.
> > > +
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/name
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-Only. Read this file to get the name of hwmon device, it
> > > +               supports values:
> > > +                   'dfl_fme_thermal' - thermal hwmon device name
> > > +
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-Only. It returns FPGA device temperature in millidegrees
> > > +               Celsius.
> > > +
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-Only. It returns hardware threshold1 temperature in
> > > +               millidegrees Celsius. If temperature rises at or above this
> > > +               threshold, hardware starts 50% or 90% throttling (see
> > > +               'temp1_alarm_policy').
> > > +
> 
> This does not follow the ABI. temp1_alarm is the alarm status, not the alarm
> temperature. The ABI attribute name would be temp1_max.
> 
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-Only. It returns hardware threshold2 temperature in
> > > +               millidegrees Celsius. If temperature rises at or above this
> > > +               threshold, hardware starts 100% throttling.
> > > +
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_emergency
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-Only. It returns hardware trip threshold temperature in
> > > +               millidegrees Celsius. If temperature rises at or above this
> > > +               threshold, a fatal event will be triggered to board management
> > > +               controller (BMC) to shutdown FPGA.
> > > +
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_status
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-only. It returns 1 if temperature is currently at or above
> > > +               hardware threshold1 (see 'temp1_alarm'), otherwise 0.
> > > +
> 
> Why not follow the ABI and use temp1_alarm ?
> 
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit_status
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-only. It returns 1 if temperature is currently at or above
> > > +               hardware threshold2 (see 'temp1_crit'), otherwise 0.
> > > +
> 
> Why not follow the ABI and use temp1_crit_alarm ?
> 
> > > +What:          /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_policy
> > > +Date:          April 2019
> > > +KernelVersion: 5.2
> > > +Contact:       Wu Hao <hao.wu@intel.com>
> > > +Description:   Read-Only. Read this file to get the policy of hardware threshold1
> > > +               (see 'temp1_alarm'). It only supports two values (policies):
> > > +                   0 - AP2 state (90% throttling)
> > > +                   1 - AP1 state (50% throttling)
> > > diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> > > index c20445b..a6d7588 100644
> > > --- a/drivers/fpga/Kconfig
> > > +++ b/drivers/fpga/Kconfig
> > > @@ -154,7 +154,7 @@ config FPGA_DFL
> > >
> > >  config FPGA_DFL_FME
> > >         tristate "FPGA DFL FME Driver"
> > > -       depends on FPGA_DFL
> > > +       depends on FPGA_DFL && HWMON
> > >         help
> > >           The FPGA Management Engine (FME) is a feature device implemented
> > >           under Device Feature List (DFL) framework. Select this option to
> > > diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> > > index 8339ee8..b9a68b8 100644
> > > --- a/drivers/fpga/dfl-fme-main.c
> > > +++ b/drivers/fpga/dfl-fme-main.c
> > > @@ -14,6 +14,8 @@
> > >   *   Henry Mitchel <henry.mitchel@intel.com>
> > >   */
> > >
> > > +#include <linux/hwmon.h>
> > > +#include <linux/hwmon-sysfs.h>
> > >  #include <linux/kernel.h>
> > >  #include <linux/module.h>
> > >  #include <linux/uaccess.h>
> > > @@ -217,6 +219,212 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
> > >         .ioctl = fme_hdr_ioctl,
> > >  };
> > >
> > > +#define FME_THERM_THRESHOLD    0x8
> > > +#define TEMP_THRESHOLD1                GENMASK_ULL(6, 0)
> > > +#define TEMP_THRESHOLD1_EN     BIT_ULL(7)
> > > +#define TEMP_THRESHOLD2                GENMASK_ULL(14, 8)
> > > +#define TEMP_THRESHOLD2_EN     BIT_ULL(15)
> > > +#define TRIP_THRESHOLD         GENMASK_ULL(30, 24)
> > > +#define TEMP_THRESHOLD1_STATUS BIT_ULL(32)             /* threshold1 reached */
> > > +#define TEMP_THRESHOLD2_STATUS BIT_ULL(33)             /* threshold2 reached */
> > > +/* threshold1 policy: 0 - AP2 (90% throttle) / 1 - AP1 (50% throttle) */
> > > +#define TEMP_THRESHOLD1_POLICY BIT_ULL(44)
> > > +
> > > +#define FME_THERM_RDSENSOR_FMT1        0x10
> > > +#define FPGA_TEMPERATURE       GENMASK_ULL(6, 0)
> > > +
> > > +#define FME_THERM_CAP          0x20
> > > +#define THERM_NO_THROTTLE      BIT_ULL(0)
> > > +
> > > +#define MD_PRE_DEG
> > > +
> > > +static bool fme_thermal_throttle_support(void __iomem *base)
> > > +{
> > > +       u64 v = readq(base + FME_THERM_CAP);
> > > +
> > > +       return FIELD_GET(THERM_NO_THROTTLE, v) ? false : true;
> > > +}
> > > +
> > > +static umode_t thermal_hwmon_attrs_visible(const void *drvdata,
> > > +                                          enum hwmon_sensor_types type,
> > > +                                          u32 attr, int channel)
> > > +{
> > > +       const struct dfl_feature *feature = drvdata;
> > > +
> > > +       /* temperature is always supported, and check hardware cap for others */
> > > +       if (attr == hwmon_temp_input)
> > > +               return 0444;
> > > +
> > > +       return fme_thermal_throttle_support(feature->ioaddr) ? 0444 : 0;
> > > +}
> > > +
> > > +static int thermal_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> > > +                             u32 attr, int channel, long *val)
> > > +{
> > > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > > +       u64 v;
> > > +
> > > +       switch (attr) {
> > > +       case hwmon_temp_input:
> > > +               v = readq(feature->ioaddr + FME_THERM_RDSENSOR_FMT1);
> > > +               *val = (long)(FIELD_GET(FPGA_TEMPERATURE, v) * 1000);
> > > +               break;
> > > +       case hwmon_temp_alarm:
> > > +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > > +               *val = (long)(FIELD_GET(TEMP_THRESHOLD1, v) * 1000);
> 
> This is supposed to return 0 or 1.
> 
> > > +               break;
> > > +       case hwmon_temp_crit:
> > > +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > > +               *val = (long)(FIELD_GET(TEMP_THRESHOLD2, v) * 1000);
> > > +               break;
> > > +       case hwmon_temp_emergency:
> > > +               v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > > +               *val = (long)(FIELD_GET(TRIP_THRESHOLD, v) * 1000);
> > > +               break;
> > > +       default:
> > > +               return -EOPNOTSUPP;
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static const struct hwmon_ops thermal_hwmon_ops = {
> > > +       .is_visible = thermal_hwmon_attrs_visible,
> > > +       .read = thermal_hwmon_read,
> > > +};
> > > +
> > > +static const u32 thermal_hwmon_temp_config[] = {
> > > +       HWMON_T_INPUT | HWMON_T_ALARM | HWMON_T_CRIT | HWMON_T_EMERGENCY,
> > > +       0
> > > +};
> > > +
> > > +static const struct hwmon_channel_info hwmon_temp_info = {
> > > +       .type = hwmon_temp,
> > > +       .config = thermal_hwmon_temp_config,
> > > +};
> > > +
> > > +static const struct hwmon_channel_info *thermal_hwmon_info[] = {
> > > +       &hwmon_temp_info,
> > > +       NULL
> > > +};
> > > +
> > > +static const struct hwmon_chip_info thermal_hwmon_chip_info = {
> > > +       .ops = &thermal_hwmon_ops,
> > > +       .info = thermal_hwmon_info,
> > > +};
> > > +
> > > +static ssize_t temp1_alarm_status_show(struct device *dev,
> > > +                                      struct device_attribute *attr, char *buf)
> > > +{
> > > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > > +       u64 v;
> > > +
> > > +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > > +
> > > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > > +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD1_STATUS, v));
> > > +}
> > > +
> > > +static ssize_t temp1_crit_status_show(struct device *dev,
> > > +                                     struct device_attribute *attr, char *buf)
> > > +{
> > > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > > +       u64 v;
> > > +
> > > +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > > +
> > > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > > +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD2_STATUS, v));
> > > +}
> > > +
> > > +static ssize_t temp1_alarm_policy_show(struct device *dev,
> > > +                                      struct device_attribute *attr, char *buf)
> > > +{
> > > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > > +       u64 v;
> > > +
> > > +       v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > > +
> > > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > > +                        (unsigned int)FIELD_GET(TEMP_THRESHOLD1_POLICY, v));
> > > +}
> > > +
> > > +static DEVICE_ATTR_RO(temp1_alarm_status);
> > > +static DEVICE_ATTR_RO(temp1_crit_status);
> > > +static DEVICE_ATTR_RO(temp1_alarm_policy);
> > > +
> > > +static struct attribute *thermal_extra_attrs[] = {
> > > +       &dev_attr_temp1_alarm_status.attr,
> > > +       &dev_attr_temp1_crit_status.attr,
> 
> Why not use standard attributes for the above ?
> 
> > > +       &dev_attr_temp1_alarm_policy.attr,
> > > +       NULL,
> > > +};
> > > +
> > > +static umode_t thermal_extra_attrs_visible(struct kobject *kobj,
> > > +                                          struct attribute *attr, int index)
> > > +{
> > > +       struct device *dev = kobj_to_dev(kobj);
> > > +       struct dfl_feature *feature = dev_get_drvdata(dev);
> > > +
> > > +       return fme_thermal_throttle_support(feature->ioaddr) ? attr->mode : 0;
> > > +}
> > > +
> > > +static const struct attribute_group thermal_extra_group = {
> > > +       .attrs          = thermal_extra_attrs,
> > > +       .is_visible     = thermal_extra_attrs_visible,
> > > +};
> > > +__ATTRIBUTE_GROUPS(thermal_extra);
> > > +
> > > +static int fme_thermal_mgmt_init(struct platform_device *pdev,
> > > +                                struct dfl_feature *feature)
> > > +{
> > > +       struct device *hwmon;
> > > +
> > > +       dev_dbg(&pdev->dev, "FME Thermal Management Init.\n");
> > > +
> > > +       /*
> > > +        * create hwmon to allow userspace monitoring temperature and other
> > > +        * threshold information.
> > > +        *
> > > +        * temp1_alarm     -> hardware threshold 1 -> 50% or 90% throttling
> > > +        * temp1_crit      -> hardware threshold 2 -> 100% throttling
> > > +        * temp1_emergency -> hardware trip_threshold to shutdown FPGA
> > > +        *
> > > +        * create device specific sysfs interfaces, e.g. read temp1_alarm_policy
> > > +        * to understand the actual hardware throttling action (50% vs 90%).
> > > +        *
> > > +        * If hardware doesn't support automatic throttling per thresholds,
> > > +        * then all above sysfs interfaces are not visible except temp1_input
> > > +        * for temperature.
> > > +        */
> > > +       hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> > > +                                                    "dfl_fme_thermal", feature,
> > > +                                                    &thermal_hwmon_chip_info,
> > > +                                                    thermal_extra_groups);
> > > +       if (IS_ERR(hwmon)) {
> > > +               dev_err(&pdev->dev, "Fail to register thermal hwmon\n");
> > > +               return PTR_ERR(hwmon);
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> > > +                                  struct dfl_feature *feature)
> > > +{
> > > +       dev_dbg(&pdev->dev, "FME Thermal Management UInit.\n");
> > > +}
> > > +
> > > +static const struct dfl_feature_id fme_thermal_mgmt_id_table[] = {
> > > +       {.id = FME_FEATURE_ID_THERMAL_MGMT,},
> > > +       {0,}
> > > +};
> > > +
> > > +static const struct dfl_feature_ops fme_thermal_mgmt_ops = {
> > > +       .init = fme_thermal_mgmt_init,
> > > +       .uinit = fme_thermal_mgmt_uinit,
> > > +};
> > > +
> > >  static struct dfl_feature_driver fme_feature_drvs[] = {
> > >         {
> > >                 .id_table = fme_hdr_id_table,
> > > @@ -227,6 +435,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
> > >                 .ops = &fme_pr_mgmt_ops,
> > >         },
> > >         {
> > > +               .id_table = fme_thermal_mgmt_id_table,
> > > +               .ops = &fme_thermal_mgmt_ops,
> > > +       },
> > > +       {
> > >                 .ops = NULL,
> > >         },
> > >  };
> > > --
> > > 1.8.3.1
> > >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 15/18] fpga: dfl: fme: add thermal management support
  2019-05-07 18:30   ` Moritz Fischer
@ 2019-05-08  6:11     ` Wu Hao
  0 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-05-08  6:11 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: atull, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Russ Weight, Xu Yilun, linux-hwmon, linux

On Tue, May 07, 2019 at 11:30:57AM -0700, Moritz Fischer wrote:
> Please for next round:
> 
> +CC linux-hwmon, Guenter etc ...

Thanks a lot for the kindly reminder.

I will make sure linux-hwmon, Guenter cced for the next version patchset.

Thanks
Hao

> 
> On Mon, Apr 29, 2019 at 04:55:48PM +0800, Wu Hao wrote:
> > This patch adds support to thermal management private feature for DFL
> > FPGA Management Engine (FME). This private feature driver registers
> > a hwmon for thermal/temperature monitoring (hwmon temp1_input).
> > If hardware automatic throttling is supported by this hardware, then
> > driver also exposes sysfs interfaces under hwmon for thresholds
> > (temp1_alarm/ crit/ emergency), threshold status (temp1_alarm_status/
> > temp1_crit_status) and throttling policy (temp1_alarm_policy).
> > 
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > ---
> > v2: create a dfl_fme_thermal hwmon to expose thermal information.
> >     move all sysfs interfaces under hwmon
> > 	tempareture       --> hwmon temp1_input
> > 	threshold1        --> hwmon temp1_alarm
> > 	threshold2        --> hwmon temp1_crit
> > 	trip_threshold    --> hwmon temp1_emergency
> > 	threshold1_status --> hwmon temp1_alarm_status
> > 	threshold2_status --> hwmon temp1_crit_status
> > 	threshold1_policy --> hwmon temp1_alarm_policy
> > ---
> >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  64 +++++++
> >  drivers/fpga/Kconfig                             |   2 +-
> >  drivers/fpga/dfl-fme-main.c                      | 212 +++++++++++++++++++++++
> >  3 files changed, 277 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > index d1aa375..dfbd315 100644
> > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > @@ -44,3 +44,67 @@ Description:	Read-only. It returns socket_id to indicate which socket
> >  		this FPGA belongs to, only valid for integrated solution.
> >  		User only needs this information, in case standard numa node
> >  		can't provide correct information.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/name
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-Only. Read this file to get the name of hwmon device, it
> > +		supports values:
> > +		    'dfl_fme_thermal' - thermal hwmon device name
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-Only. It returns FPGA device temperature in millidegrees
> > +		Celsius.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-Only. It returns hardware threshold1 temperature in
> > +		millidegrees Celsius. If temperature rises at or above this
> > +		threshold, hardware starts 50% or 90% throttling (see
> > +		'temp1_alarm_policy').
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-Only. It returns hardware threshold2 temperature in
> > +		millidegrees Celsius. If temperature rises at or above this
> > +		threshold, hardware starts 100% throttling.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_emergency
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-Only. It returns hardware trip threshold temperature in
> > +		millidegrees Celsius. If temperature rises at or above this
> > +		threshold, a fatal event will be triggered to board management
> > +		controller (BMC) to shutdown FPGA.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_status
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. It returns 1 if temperature is currently at or above
> > +		hardware threshold1 (see 'temp1_alarm'), otherwise 0.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_crit_status
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-only. It returns 1 if temperature is currently at or above
> > +		hardware threshold2 (see 'temp1_crit'), otherwise 0.
> > +
> > +What:		/sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_alarm_policy
> > +Date:		April 2019
> > +KernelVersion:	5.2
> > +Contact:	Wu Hao <hao.wu@intel.com>
> > +Description:	Read-Only. Read this file to get the policy of hardware threshold1
> > +		(see 'temp1_alarm'). It only supports two values (policies):
> > +		    0 - AP2 state (90% throttling)
> > +		    1 - AP1 state (50% throttling)
> > diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> > index c20445b..a6d7588 100644
> > --- a/drivers/fpga/Kconfig
> > +++ b/drivers/fpga/Kconfig
> > @@ -154,7 +154,7 @@ config FPGA_DFL
> >  
> >  config FPGA_DFL_FME
> >  	tristate "FPGA DFL FME Driver"
> > -	depends on FPGA_DFL
> > +	depends on FPGA_DFL && HWMON
> >  	help
> >  	  The FPGA Management Engine (FME) is a feature device implemented
> >  	  under Device Feature List (DFL) framework. Select this option to
> > diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> > index 8339ee8..b9a68b8 100644
> > --- a/drivers/fpga/dfl-fme-main.c
> > +++ b/drivers/fpga/dfl-fme-main.c
> > @@ -14,6 +14,8 @@
> >   *   Henry Mitchel <henry.mitchel@intel.com>
> >   */
> >  
> > +#include <linux/hwmon.h>
> > +#include <linux/hwmon-sysfs.h>
> >  #include <linux/kernel.h>
> >  #include <linux/module.h>
> >  #include <linux/uaccess.h>
> > @@ -217,6 +219,212 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
> >  	.ioctl = fme_hdr_ioctl,
> >  };
> >  
> > +#define FME_THERM_THRESHOLD	0x8
> > +#define TEMP_THRESHOLD1		GENMASK_ULL(6, 0)
> > +#define TEMP_THRESHOLD1_EN	BIT_ULL(7)
> > +#define TEMP_THRESHOLD2		GENMASK_ULL(14, 8)
> > +#define TEMP_THRESHOLD2_EN	BIT_ULL(15)
> > +#define TRIP_THRESHOLD		GENMASK_ULL(30, 24)
> > +#define TEMP_THRESHOLD1_STATUS	BIT_ULL(32)		/* threshold1 reached */
> > +#define TEMP_THRESHOLD2_STATUS	BIT_ULL(33)		/* threshold2 reached */
> > +/* threshold1 policy: 0 - AP2 (90% throttle) / 1 - AP1 (50% throttle) */
> > +#define TEMP_THRESHOLD1_POLICY	BIT_ULL(44)
> > +
> > +#define FME_THERM_RDSENSOR_FMT1	0x10
> > +#define FPGA_TEMPERATURE	GENMASK_ULL(6, 0)
> > +
> > +#define FME_THERM_CAP		0x20
> > +#define THERM_NO_THROTTLE	BIT_ULL(0)
> > +
> > +#define MD_PRE_DEG
> > +
> > +static bool fme_thermal_throttle_support(void __iomem *base)
> > +{
> > +	u64 v = readq(base + FME_THERM_CAP);
> > +
> > +	return FIELD_GET(THERM_NO_THROTTLE, v) ? false : true;
> > +}
> > +
> > +static umode_t thermal_hwmon_attrs_visible(const void *drvdata,
> > +					   enum hwmon_sensor_types type,
> > +					   u32 attr, int channel)
> > +{
> > +	const struct dfl_feature *feature = drvdata;
> > +
> > +	/* temperature is always supported, and check hardware cap for others */
> > +	if (attr == hwmon_temp_input)
> > +		return 0444;
> > +
> > +	return fme_thermal_throttle_support(feature->ioaddr) ? 0444 : 0;
> > +}
> > +
> > +static int thermal_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> > +			      u32 attr, int channel, long *val)
> > +{
> > +	struct dfl_feature *feature = dev_get_drvdata(dev);
> > +	u64 v;
> > +
> > +	switch (attr) {
> > +	case hwmon_temp_input:
> > +		v = readq(feature->ioaddr + FME_THERM_RDSENSOR_FMT1);
> > +		*val = (long)(FIELD_GET(FPGA_TEMPERATURE, v) * 1000);
> > +		break;
> > +	case hwmon_temp_alarm:
> > +		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +		*val = (long)(FIELD_GET(TEMP_THRESHOLD1, v) * 1000);
> > +		break;
> > +	case hwmon_temp_crit:
> > +		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +		*val = (long)(FIELD_GET(TEMP_THRESHOLD2, v) * 1000);
> > +		break;
> > +	case hwmon_temp_emergency:
> > +		v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +		*val = (long)(FIELD_GET(TRIP_THRESHOLD, v) * 1000);
> > +		break;
> > +	default:
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static const struct hwmon_ops thermal_hwmon_ops = {
> > +	.is_visible = thermal_hwmon_attrs_visible,
> > +	.read = thermal_hwmon_read,
> > +};
> > +
> > +static const u32 thermal_hwmon_temp_config[] = {
> > +	HWMON_T_INPUT | HWMON_T_ALARM | HWMON_T_CRIT | HWMON_T_EMERGENCY,
> > +	0
> > +};
> > +
> > +static const struct hwmon_channel_info hwmon_temp_info = {
> > +	.type = hwmon_temp,
> > +	.config = thermal_hwmon_temp_config,
> > +};
> > +
> > +static const struct hwmon_channel_info *thermal_hwmon_info[] = {
> > +	&hwmon_temp_info,
> > +	NULL
> > +};
> > +
> > +static const struct hwmon_chip_info thermal_hwmon_chip_info = {
> > +	.ops = &thermal_hwmon_ops,
> > +	.info = thermal_hwmon_info,
> > +};
> > +
> > +static ssize_t temp1_alarm_status_show(struct device *dev,
> > +				       struct device_attribute *attr, char *buf)
> > +{
> > +	struct dfl_feature *feature = dev_get_drvdata(dev);
> > +	u64 v;
> > +
> > +	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +
> > +	return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +			 (unsigned int)FIELD_GET(TEMP_THRESHOLD1_STATUS, v));
> > +}
> > +
> > +static ssize_t temp1_crit_status_show(struct device *dev,
> > +				      struct device_attribute *attr, char *buf)
> > +{
> > +	struct dfl_feature *feature = dev_get_drvdata(dev);
> > +	u64 v;
> > +
> > +	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +
> > +	return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +			 (unsigned int)FIELD_GET(TEMP_THRESHOLD2_STATUS, v));
> > +}
> > +
> > +static ssize_t temp1_alarm_policy_show(struct device *dev,
> > +				       struct device_attribute *attr, char *buf)
> > +{
> > +	struct dfl_feature *feature = dev_get_drvdata(dev);
> > +	u64 v;
> > +
> > +	v = readq(feature->ioaddr + FME_THERM_THRESHOLD);
> > +
> > +	return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +			 (unsigned int)FIELD_GET(TEMP_THRESHOLD1_POLICY, v));
> > +}
> > +
> > +static DEVICE_ATTR_RO(temp1_alarm_status);
> > +static DEVICE_ATTR_RO(temp1_crit_status);
> > +static DEVICE_ATTR_RO(temp1_alarm_policy);
> > +
> > +static struct attribute *thermal_extra_attrs[] = {
> > +	&dev_attr_temp1_alarm_status.attr,
> > +	&dev_attr_temp1_crit_status.attr,
> > +	&dev_attr_temp1_alarm_policy.attr,
> > +	NULL,
> > +};
> > +
> > +static umode_t thermal_extra_attrs_visible(struct kobject *kobj,
> > +					   struct attribute *attr, int index)
> > +{
> > +	struct device *dev = kobj_to_dev(kobj);
> > +	struct dfl_feature *feature = dev_get_drvdata(dev);
> > +
> > +	return fme_thermal_throttle_support(feature->ioaddr) ? attr->mode : 0;
> > +}
> > +
> > +static const struct attribute_group thermal_extra_group = {
> > +	.attrs		= thermal_extra_attrs,
> > +	.is_visible	= thermal_extra_attrs_visible,
> > +};
> > +__ATTRIBUTE_GROUPS(thermal_extra);
> > +
> > +static int fme_thermal_mgmt_init(struct platform_device *pdev,
> > +				 struct dfl_feature *feature)
> > +{
> > +	struct device *hwmon;
> > +
> > +	dev_dbg(&pdev->dev, "FME Thermal Management Init.\n");
> > +
> > +	/*
> > +	 * create hwmon to allow userspace monitoring temperature and other
> > +	 * threshold information.
> > +	 *
> > +	 * temp1_alarm     -> hardware threshold 1 -> 50% or 90% throttling
> > +	 * temp1_crit      -> hardware threshold 2 -> 100% throttling
> > +	 * temp1_emergency -> hardware trip_threshold to shutdown FPGA
> > +	 *
> > +	 * create device specific sysfs interfaces, e.g. read temp1_alarm_policy
> > +	 * to understand the actual hardware throttling action (50% vs 90%).
> > +	 *
> > +	 * If hardware doesn't support automatic throttling per thresholds,
> > +	 * then all above sysfs interfaces are not visible except temp1_input
> > +	 * for temperature.
> > +	 */
> > +	hwmon = devm_hwmon_device_register_with_info(&pdev->dev,
> > +						     "dfl_fme_thermal", feature,
> > +						     &thermal_hwmon_chip_info,
> > +						     thermal_extra_groups);
> > +	if (IS_ERR(hwmon)) {
> > +		dev_err(&pdev->dev, "Fail to register thermal hwmon\n");
> > +		return PTR_ERR(hwmon);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void fme_thermal_mgmt_uinit(struct platform_device *pdev,
> > +				   struct dfl_feature *feature)
> > +{
> > +	dev_dbg(&pdev->dev, "FME Thermal Management UInit.\n");
> > +}
> > +
> > +static const struct dfl_feature_id fme_thermal_mgmt_id_table[] = {
> > +	{.id = FME_FEATURE_ID_THERMAL_MGMT,},
> > +	{0,}
> > +};
> > +
> > +static const struct dfl_feature_ops fme_thermal_mgmt_ops = {
> > +	.init = fme_thermal_mgmt_init,
> > +	.uinit = fme_thermal_mgmt_uinit,
> > +};
> > +
> >  static struct dfl_feature_driver fme_feature_drvs[] = {
> >  	{
> >  		.id_table = fme_hdr_id_table,
> > @@ -227,6 +435,10 @@ static long fme_hdr_ioctl(struct platform_device *pdev,
> >  		.ops = &fme_pr_mgmt_ops,
> >  	},
> >  	{
> > +		.id_table = fme_thermal_mgmt_id_table,
> > +		.ops = &fme_thermal_mgmt_ops,
> > +	},
> > +	{
> >  		.ops = NULL,
> >  	},
> >  };
> > -- 
> > 1.8.3.1
> > 

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR
  2019-05-07 17:26   ` Moritz Fischer
@ 2019-05-08 17:58     ` Alan Tull
  0 siblings, 0 replies; 42+ messages in thread
From: Alan Tull @ 2019-05-08 17:58 UTC (permalink / raw)
  To: Moritz Fischer; +Cc: Wu Hao, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Tue, May 7, 2019 at 12:26 PM Moritz Fischer <mdf@kernel.org> wrote:
>
> On Mon, Apr 29, 2019 at 04:55:35PM +0800, Wu Hao wrote:
> > This patch removes copy_to_user() code in partial reconfiguration
> > ioctl, as it's useless as user never needs to read the data
> > structure after ioctl.
> >
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> Acked-by: Moritz Fischer <mdf@kernel.org>

Acked-by: Alan Tull <atull@kernel.org>

Alan

> > ---
> > v2: clean up code split from patch 2 in v1 patchset.
> > ---
> >  drivers/fpga/dfl-fme-pr.c | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
> > index d9ca955..6ec0f09 100644
> > --- a/drivers/fpga/dfl-fme-pr.c
> > +++ b/drivers/fpga/dfl-fme-pr.c
> > @@ -159,9 +159,6 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
> >       mutex_unlock(&pdata->lock);
> >  free_exit:
> >       vfree(buf);
> > -     if (copy_to_user((void __user *)arg, &port_pr, minsz))
> > -             return -EFAULT;
> > -
> >       return ret;
> >  }
> >
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 12/18] fpga: dfl: afu: add error reporting support.
  2019-04-29  8:55 ` [PATCH v2 12/18] fpga: dfl: afu: add error reporting support Wu Hao
@ 2019-05-09 14:41   ` Alan Tull
  0 siblings, 0 replies; 42+ messages in thread
From: Alan Tull @ 2019-05-09 14:41 UTC (permalink / raw)
  To: Wu Hao; +Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:
>
> Error reporting is one important private feature, it reports error
> detected on port and accelerated function unit (AFU). It introduces
> several sysfs interfaces to allow userspace to check and clear
> errors detected by hardware.
>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>

Acked-by: Alan Tull <atull@kernel.org>

Thanks!
Alan

> ---
> v2: add more error code description for error clear sysfs in doc.
>     return -EINVAL instead of -EBUSY when input error code doesn't
>     match in error clear sysfs.
> ---
>  Documentation/ABI/testing/sysfs-platform-dfl-port |  39 ++++
>  drivers/fpga/Makefile                             |   1 +
>  drivers/fpga/dfl-afu-error.c                      | 225 ++++++++++++++++++++++
>  drivers/fpga/dfl-afu-main.c                       |   4 +
>  drivers/fpga/dfl-afu.h                            |   4 +
>  5 files changed, 273 insertions(+)
>  create mode 100644 drivers/fpga/dfl-afu-error.c

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support
  2019-04-29  8:55 ` [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support Wu Hao
@ 2019-05-09 16:27   ` Alan Tull
  2019-05-10  2:23     ` Wu Hao
  0 siblings, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-09 16:27 UTC (permalink / raw)
  To: Wu Hao
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Ananda Ravuri, Xu Yilun

On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:

Hi Hao,

The changes look good.  There's one easy to fix thing that Greg has
pointed out recently on another patch (below).

>
> This patch adds support for global error reporting for FPGA
> Management Engine (FME), it introduces sysfs interfaces to
> report different error detected by the hardware, and allow
> user to clear errors or inject error for testing purpose.
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>

Acked-by: Alan Tull <atull@kernel.org>

> ---
> v2: fix issues found in sysfs doc.
>     fix returned error code issues for writable sysfs interfaces.
>     (use -EINVAL if input doesn't match error code)
>     reorder the sysfs groups in code.

> +static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
> +                            char *buf)
> +{
> +       struct device *err_dev = dev->parent;
> +       void __iomem *base;
> +
> +       base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n", dfl_feature_revision(base));

Greg is discouraging use of scnprintf for sysfs attributes where it's
not needed [1].

Please fix this up the attributes added in this patchset.  Besides
that, looks good, I added my Ack.

Alan

> +}
> +static DEVICE_ATTR_RO(revision);

[1] https://lkml.org/lkml/2019/4/25/1050

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support
  2019-05-09 16:27   ` Alan Tull
@ 2019-05-10  2:23     ` Wu Hao
  0 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-05-10  2:23 UTC (permalink / raw)
  To: Alan Tull
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Ananda Ravuri, Xu Yilun

On Thu, May 09, 2019 at 11:27:36AM -0500, Alan Tull wrote:
> On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> Hi Hao,
> 
> The changes look good.  There's one easy to fix thing that Greg has
> pointed out recently on another patch (below).
> 
> >
> > This patch adds support for global error reporting for FPGA
> > Management Engine (FME), it introduces sysfs interfaces to
> > report different error detected by the hardware, and allow
> > user to clear errors or inject error for testing purpose.
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> 
> Acked-by: Alan Tull <atull@kernel.org>
> 
> > ---
> > v2: fix issues found in sysfs doc.
> >     fix returned error code issues for writable sysfs interfaces.
> >     (use -EINVAL if input doesn't match error code)
> >     reorder the sysfs groups in code.
> 
> > +static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
> > +                            char *buf)
> > +{
> > +       struct device *err_dev = dev->parent;
> > +       void __iomem *base;
> > +
> > +       base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n", dfl_feature_revision(base));
> 
> Greg is discouraging use of scnprintf for sysfs attributes where it's
> not needed [1].
> 
> Please fix this up the attributes added in this patchset.  Besides
> that, looks good, I added my Ack.

Sure, will fix them in the next patchset.

thanks a lot!

Hao

> 
> Alan
> 
> > +}
> > +static DEVICE_ATTR_RO(revision);
> 
> [1] https://lkml.org/lkml/2019/4/25/1050

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 18/18] fpga: dfl: fme: add performance reporting support
  2019-04-29  8:55 ` [PATCH v2 18/18] fpga: dfl: fme: add performance " Wu Hao
@ 2019-05-16 17:28   ` Alan Tull
  2019-05-17  3:48     ` Wu Hao
  0 siblings, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-16 17:28 UTC (permalink / raw)
  To: Wu Hao
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Xu Yilun

On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:

Hi Hao,

>
> This patch adds support for performance reporting private feature
> for FPGA Management Engine (FME). Actually it supports 4 categories
> performance counters, 'clock', 'cache', 'iommu' and 'fabric', user
> could read the performance counter via exposed sysfs interfaces.
> Please refer to sysfs doc for more details.
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>
> ---
> v2: improve sysfs doc
> ---
>  Documentation/ABI/testing/sysfs-platform-dfl-fme |  93 +++
>  drivers/fpga/Makefile                            |   1 +
>  drivers/fpga/dfl-fme-main.c                      |   4 +
>  drivers/fpga/dfl-fme-perf.c                      | 950 +++++++++++++++++++++++
>  drivers/fpga/dfl-fme.h                           |   2 +
>  drivers/fpga/dfl.c                               |   1 +
>  drivers/fpga/dfl.h                               |   2 +
>  7 files changed, 1053 insertions(+)
>  create mode 100644 drivers/fpga/dfl-fme-perf.c
>
> diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> index 503984b..a7f7eb6 100644
> --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> @@ -250,3 +250,96 @@ Description:       Write-only. Write error code to this file to clear all errors
>                 logged in errors, first_error and next_error. Write fails with
>                 -EINVAL if input parsing fails or input error code doesn't
>                 match.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/clock
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read for Accelerator Function Unit (AFU) clock
> +               counter.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/cache/freeze
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file for the current status of 'cache'
> +               category performance counters, and Write '1' or '0' to freeze
> +               or unfreeze 'cache' performance counters.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/cache/<counter>
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read 'cache' category performance counters:
> +               read_hit, read_miss, write_hit, write_miss, hold_request,
> +               data_write_port_contention, tag_write_port_contention,
> +               tx_req_stall, rx_req_stall and rx_eviction.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/iommu/freeze
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file for the current status of 'iommu'
> +               category performance counters, and Write '1' or '0' to freeze
> +               or unfreeze 'iommu' performance counters.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/iommu/<sip_counter>
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read 'iommu' category 'sip' sub category
> +               performance counters: iotlb_4k_hit, iotlb_2m_hit,
> +               iotlb_1g_hit, slpwc_l3_hit, slpwc_l4_hit, rcc_hit,
> +               rcc_miss, iotlb_4k_miss, iotlb_2m_miss, iotlb_1g_miss,
> +               slpwc_l3_miss and slpwc_l4_miss.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/iommu/afu0/<counter>
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read 'iommu' category 'afuX' sub category
> +               performance counters: read_transaction, write_transaction,
> +               devtlb_read_hit, devtlb_write_hit, devtlb_4k_fill,
> +               devtlb_2m_fill and devtlb_1g_fill.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/freeze
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file for the current status of 'fabric'
> +               category performance counters, and Write '1' or '0' to freeze
> +               or unfreeze 'fabric' performance counters.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/<counter>
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read 'fabric' category performance counters:
> +               pcie0_read, pcie0_write, pcie1_read, pcie1_write,
> +               upi_read, upi_write and mmio_read.

Also mmio_write

> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/enable
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file for current status of device level
> +               fabric counters. Write "1" to enable device level fabric
> +               counters. Once device level fabric counters are enabled, port
> +               level fabric counters will be disabled automatically.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/port0/<counter>
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Only. Read 'fabric' category "portX" sub category
> +               performance counters: pcie0_read, pcie0_write, pcie1_read,
> +               pcie1_write, upi_read, upi_write and mmio_read.
> +
> +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/port0/enable
> +Date:          April 2019
> +KernelVersion:  5.2
> +Contact:       Wu Hao <hao.wu@intel.com>
> +Description:   Read-Write. Read this file for current status of port level
> +               fabric counters. Write "1" to enable port level fabric counters.
> +               Once port level fabric counters are enabled, device level fabric
> +               counters will be disabled automatically.
> diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
> index 1a9fa3d..7df3971 100644
> --- a/drivers/fpga/Makefile
> +++ b/drivers/fpga/Makefile
> @@ -39,6 +39,7 @@ obj-$(CONFIG_FPGA_DFL_FME_REGION)     += dfl-fme-region.o
>  obj-$(CONFIG_FPGA_DFL_AFU)             += dfl-afu.o
>
>  dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o dfl-fme-error.o
> +dfl-fme-objs += dfl-fme-perf.o
>  dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
>  dfl-afu-objs += dfl-afu-error.o
>
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index 1986b32..221f4ec 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -690,6 +690,10 @@ static void fme_power_mgmt_uinit(struct platform_device *pdev,
>                 .ops = &fme_global_err_ops,
>         },
>         {
> +               .id_table = fme_perf_id_table,
> +               .ops = &fme_perf_ops,
> +       },
> +       {
>                 .ops = NULL,
>         },
>  };
> diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
> new file mode 100644
> index 0000000..035bb68
> --- /dev/null
> +++ b/drivers/fpga/dfl-fme-perf.c
> @@ -0,0 +1,950 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Driver for FPGA Management Engine (FME) Global Performance Reporting
> + *
> + * Copyright 2019 Intel Corporation, Inc.
> + *
> + * Authors:
> + *   Kang Luwei <luwei.kang@intel.com>
> + *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
> + *   Wu Hao <hao.wu@intel.com>
> + *   Joseph Grecco <joe.grecco@intel.com>
> + *   Enno Luebbers <enno.luebbers@intel.com>
> + *   Tim Whisonant <tim.whisonant@intel.com>
> + *   Ananda Ravuri <ananda.ravuri@intel.com>
> + *   Mitchel, Henry <henry.mitchel@intel.com>
> + */
> +
> +#include "dfl.h"
> +#include "dfl-fme.h"
> +
> +/*
> + * Performance Counter Registers for Cache.
> + *
> + * Cache Events are listed below as CACHE_EVNT_*.
> + */
> +#define CACHE_CTRL                     0x8
> +#define CACHE_RESET_CNTR               BIT_ULL(0)
> +#define CACHE_FREEZE_CNTR              BIT_ULL(8)
> +#define CACHE_CTRL_EVNT                        GENMASK_ULL(19, 16)
> +#define CACHE_EVNT_RD_HIT              0x0
> +#define CACHE_EVNT_WR_HIT              0x1
> +#define CACHE_EVNT_RD_MISS             0x2
> +#define CACHE_EVNT_WR_MISS             0x3
> +#define CACHE_EVNT_RSVD                        0x4
> +#define CACHE_EVNT_HOLD_REQ            0x5
> +#define CACHE_EVNT_DATA_WR_PORT_CONTEN 0x6
> +#define CACHE_EVNT_TAG_WR_PORT_CONTEN  0x7
> +#define CACHE_EVNT_TX_REQ_STALL                0x8
> +#define CACHE_EVNT_RX_REQ_STALL                0x9
> +#define CACHE_EVNT_EVICTIONS           0xa
> +#define CACHE_EVNT_MAX                 CACHE_EVNT_EVICTIONS
> +#define CACHE_CHANNEL_SEL              BIT_ULL(20)
> +#define CACHE_CHANNEL_RD               0
> +#define CACHE_CHANNEL_WR               1
> +#define CACHE_CHANNEL_MAX              2
> +#define CACHE_CNTR0                    0x10
> +#define CACHE_CNTR1                    0x18
> +#define CACHE_CNTR_EVNT_CNTR           GENMASK_ULL(47, 0)
> +#define CACHE_CNTR_EVNT                        GENMASK_ULL(63, 60)
> +
> +/*
> + * Performance Counter Registers for Fabric.
> + *
> + * Fabric Events are listed below as FAB_EVNT_*
> + */
> +#define FAB_CTRL                       0x20
> +#define FAB_RESET_CNTR                 BIT_ULL(0)
> +#define FAB_FREEZE_CNTR                        BIT_ULL(8)
> +#define FAB_CTRL_EVNT                  GENMASK_ULL(19, 16)
> +#define FAB_EVNT_PCIE0_RD              0x0
> +#define FAB_EVNT_PCIE0_WR              0x1
> +#define FAB_EVNT_PCIE1_RD              0x2
> +#define FAB_EVNT_PCIE1_WR              0x3
> +#define FAB_EVNT_UPI_RD                        0x4
> +#define FAB_EVNT_UPI_WR                        0x5
> +#define FAB_EVNT_MMIO_RD               0x6
> +#define FAB_EVNT_MMIO_WR               0x7
> +#define FAB_EVNT_MAX                   FAB_EVNT_MMIO_WR
> +#define FAB_PORT_ID                    GENMASK_ULL(21, 20)
> +#define FAB_PORT_FILTER                        BIT_ULL(23)
> +#define FAB_PORT_FILTER_DISABLE                0
> +#define FAB_PORT_FILTER_ENABLE         1
> +#define FAB_CNTR                       0x28
> +#define FAB_CNTR_EVNT_CNTR             GENMASK_ULL(59, 0)
> +#define FAB_CNTR_EVNT                  GENMASK_ULL(63, 60)
> +
> +/*
> + * Performance Counter Registers for Clock.
> + *
> + * Clock Counter can't be reset or frozen by SW.
> + */
> +#define CLK_CNTR                       0x30
> +
> +/*
> + * Performance Counter Registers for IOMMU / VT-D.
> + *
> + * VT-D Events are listed below as VTD_EVNT_* and VTD_SIP_EVNT_*
> + */
> +#define VTD_CTRL                       0x38
> +#define VTD_RESET_CNTR                 BIT_ULL(0)
> +#define VTD_FREEZE_CNTR                        BIT_ULL(8)
> +#define VTD_CTRL_EVNT                  GENMASK_ULL(19, 16)
> +#define VTD_EVNT_AFU_MEM_RD_TRANS      0x0
> +#define VTD_EVNT_AFU_MEM_WR_TRANS      0x1
> +#define VTD_EVNT_AFU_DEVTLB_RD_HIT     0x2
> +#define VTD_EVNT_AFU_DEVTLB_WR_HIT     0x3
> +#define VTD_EVNT_DEVTLB_4K_FILL                0x4
> +#define VTD_EVNT_DEVTLB_2M_FILL                0x5
> +#define VTD_EVNT_DEVTLB_1G_FILL                0x6
> +#define VTD_EVNT_MAX                   VTD_EVNT_DEVTLB_1G_FILL
> +#define VTD_CNTR                       0x40
> +#define VTD_CNTR_EVNT                  GENMASK_ULL(63, 60)
> +#define VTD_CNTR_EVNT_CNTR             GENMASK_ULL(47, 0)
> +#define VTD_SIP_CTRL                   0x48
> +#define VTD_SIP_RESET_CNTR             BIT_ULL(0)
> +#define VTD_SIP_FREEZE_CNTR            BIT_ULL(8)
> +#define VTD_SIP_CTRL_EVNT              GENMASK_ULL(19, 16)
> +#define VTD_SIP_EVNT_IOTLB_4K_HIT      0x0
> +#define VTD_SIP_EVNT_IOTLB_2M_HIT      0x1
> +#define VTD_SIP_EVNT_IOTLB_1G_HIT      0x2
> +#define VTD_SIP_EVNT_SLPWC_L3_HIT      0x3
> +#define VTD_SIP_EVNT_SLPWC_L4_HIT      0x4
> +#define VTD_SIP_EVNT_RCC_HIT           0x5
> +#define VTD_SIP_EVNT_IOTLB_4K_MISS     0x6
> +#define VTD_SIP_EVNT_IOTLB_2M_MISS     0x7
> +#define VTD_SIP_EVNT_IOTLB_1G_MISS     0x8
> +#define VTD_SIP_EVNT_SLPWC_L3_MISS     0x9
> +#define VTD_SIP_EVNT_SLPWC_L4_MISS     0xa
> +#define VTD_SIP_EVNT_RCC_MISS          0xb
> +#define VTD_SIP_EVNT_MAX               VTD_SIP_EVNT_RCC_MISS
> +#define VTD_SIP_CNTR                   0X50
> +#define VTD_SIP_CNTR_EVNT              GENMASK_ULL(63, 60)
> +#define VTD_SIP_CNTR_EVNT_CNTR         GENMASK_ULL(47, 0)
> +
> +#define PERF_OBJ_ROOT_ID               (~0)
> +
> +#define PERF_TIMEOUT                   30
> +
> +/**
> + * struct perf_object - object of performance counter
> + *
> + * @id: instance id. PERF_OBJ_ROOT_ID indicates it is a parent object which
> + *      counts performance counters for all instances.
> + * @attr_groups: the sysfs files are associated with this object.
> + * @feature: pointer to related private feature.
> + * @node: used to link itself to parent's children list.
> + * @children: used to link its children objects together.
> + * @kobj: generic kobject interface.
> + *
> + * 'node' and 'children' are used to construct parent-children hierarchy.
> + */
> +struct perf_object {
> +       int id;
> +       const struct attribute_group **attr_groups;
> +       struct dfl_feature *feature;
> +
> +       struct list_head node;
> +       struct list_head children;
> +       struct kobject kobj;
> +};
> +
> +/**
> + * struct perf_obj_attribute - attribute of perf object
> + *
> + * @attr: attribute of this perf object.
> + * @show: show callback for sysfs attribute.
> + * @store: store callback for sysfs attribute.
> + */
> +struct perf_obj_attribute {
> +       struct attribute attr;
> +       ssize_t (*show)(struct perf_object *pobj, char *buf);
> +       ssize_t (*store)(struct perf_object *pobj,
> +                        const char *buf, size_t n);
> +};
> +
> +#define to_perf_obj_attr(_attr)                                        \
> +               container_of(_attr, struct perf_obj_attribute, attr)
> +#define to_perf_obj(_kobj)                                     \
> +               container_of(_kobj, struct perf_object, kobj)
> +
> +#define PERF_OBJ_ATTR(_name, _filename, _mode, _show, _store)  \
> +struct perf_obj_attribute perf_obj_attr_##_name =              \
> +       __ATTR(_filename, _mode, _show, _store)

This #define and the ones below set up an interdependency with sysfs.h
that I'm scratching my head about.  You're defining your own type of
attribute struct which is fine, but counting on __ATTR to continue to
work with it in the future.  It wouldn't be much of a change here to
define your own macro here (which is the same as __ATTR) and then use
it for your PERF_OBJ_ATTR_RW, etc.  Maybe I'm being overcautious, but
it's a small change.

> +
> +#define PERF_OBJ_ATTR_RW(_name)                                        \
> +       struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_RW(_name)
> +#define PERF_OBJ_ATTR_RO(_name)                                        \
> +       struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_RO(_name)
> +#define PERF_OBJ_ATTR_WO(_name)                                        \
> +       struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_WO(_name)
> +
> +static ssize_t perf_obj_attr_show(struct kobject *kobj,
> +                                 struct attribute *__attr, char *buf)
> +{
> +       struct perf_obj_attribute *attr = to_perf_obj_attr(__attr);
> +       struct perf_object *pobj = to_perf_obj(kobj);
> +       ssize_t ret = -EIO;

Would this be -EPERM?

> +
> +       if (attr->show)
> +               ret = attr->show(pobj, buf);

Actually is it even possible for !attr->show if this were a WO attribute?

> +       return ret;
> +}
> +
> +static ssize_t perf_obj_attr_store(struct kobject *kobj,
> +                                  struct attribute *__attr,
> +                                  const char *buf, size_t n)
> +{
> +       struct perf_obj_attribute *attr = to_perf_obj_attr(__attr);
> +       struct perf_object *pobj = to_perf_obj(kobj);
> +       ssize_t ret = -EIO;

Same here

> +
> +       if (attr->store)
> +               ret = attr->store(pobj, buf, n);
> +       return ret;
> +}
> +
> +static const struct sysfs_ops perf_obj_sysfs_ops = {
> +       .show = perf_obj_attr_show,
> +       .store = perf_obj_attr_store,
> +};
> +
> +static void perf_obj_release(struct kobject *kobj)
> +{
> +       kfree(to_perf_obj(kobj));
> +}
> +
> +static struct kobj_type perf_obj_ktype = {
> +       .sysfs_ops = &perf_obj_sysfs_ops,
> +       .release = perf_obj_release,
> +};
> +
> +static struct perf_object *
> +create_perf_obj(struct dfl_feature *feature, struct kobject *parent, int id,
> +               const struct attribute_group **groups, const char *name)
> +{
> +       struct perf_object *pobj;
> +       int ret;
> +
> +       pobj = kzalloc(sizeof(*pobj), GFP_KERNEL);
> +       if (!pobj)
> +               return ERR_PTR(-ENOMEM);
> +
> +       pobj->id = id;
> +       pobj->feature = feature;
> +       pobj->attr_groups = groups;
> +       INIT_LIST_HEAD(&pobj->node);
> +       INIT_LIST_HEAD(&pobj->children);
> +
> +       if (id != PERF_OBJ_ROOT_ID)
> +               ret = kobject_init_and_add(&pobj->kobj, &perf_obj_ktype,
> +                                          parent, "%s%d", name, id);
> +       else
> +               ret = kobject_init_and_add(&pobj->kobj, &perf_obj_ktype,
> +                                          parent, "%s", name);
> +       if (ret)
> +               goto put_exit;
> +
> +       if (pobj->attr_groups) {
> +               ret = sysfs_create_groups(&pobj->kobj, pobj->attr_groups);
> +               if (ret)
> +                       goto del_exit;
> +       }
> +
> +       return pobj;
> +
> +del_exit:
> +       kobject_del(&pobj->kobj);

kobject_put will delete and clean up, you won't need kobject_del.

> +put_exit:
> +       kobject_put(&pobj->kobj);
> +       return ERR_PTR(ret);
> +}
> +
> +/*
> + * Counter Sysfs Interface for Clock.
> + */
> +static ssize_t clock_show(struct perf_object *pobj, char *buf)
> +{
> +       void __iomem *base = pobj->feature->ioaddr;
> +
> +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
> +                        (unsigned long long)readq(base + CLK_CNTR));

It's fine to use sprintf, as mentioned recently on one of the other patches.

> +}
> +static PERF_OBJ_ATTR_RO(clock);
> +
> +static struct attribute *clock_attrs[] = {
> +       &perf_obj_attr_clock.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group clock_attr_group = {
> +       .attrs = clock_attrs,
> +};
> +
> +static const struct attribute_group *perf_dev_attr_groups[] = {
> +       &clock_attr_group,
> +       NULL,
> +};
> +
> +static void destroy_perf_obj(struct perf_object *pobj)
> +{
> +       struct perf_object *obj, *obj_tmp;
> +
> +       list_for_each_entry_safe(obj, obj_tmp, &pobj->children, node)
> +               destroy_perf_obj(obj);
> +
> +       list_del(&pobj->node);
> +       if (pobj->attr_groups)
> +               sysfs_remove_groups(&pobj->kobj, pobj->attr_groups);

The attributes should be removed before anything else goes away.

> +       kobject_put(&pobj->kobj);
> +}
> +
> +static struct perf_object *create_perf_dev(struct dfl_feature *feature)
> +{
> +       struct platform_device *pdev = feature->pdev;
> +
> +       return create_perf_obj(feature, &pdev->dev.kobj, PERF_OBJ_ROOT_ID,
> +                              perf_dev_attr_groups, "perf");
> +}
> +
> +/*
> + * Counter Sysfs Interfaces for Cache.
> + */
> +static ssize_t cache_freeze_show(struct perf_object *pobj, char *buf)
> +{
> +       void __iomem *base = pobj->feature->ioaddr;
> +       u64 v;
> +
> +       v = readq(base + CACHE_CTRL);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(CACHE_FREEZE_CNTR, v));
> +}
> +
> +static ssize_t cache_freeze_store(struct perf_object *pobj,
> +                                 const char *buf, size_t n)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       bool state;
> +       u64 v;
> +
> +       if (strtobool(buf, &state))
> +               return -EINVAL;
> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       v = readq(base + CACHE_CTRL);
> +       v &= ~CACHE_FREEZE_CNTR;
> +       v |= FIELD_PREP(CACHE_FREEZE_CNTR, state ? 1 : 0);
> +       writeq(v, base + CACHE_CTRL);
> +       mutex_unlock(&pdata->lock);
> +
> +       return n;
> +}
> +static PERF_OBJ_ATTR(cache_freeze, freeze, 0644,
> +                    cache_freeze_show, cache_freeze_store);
> +
> +static ssize_t read_cache_counter(struct perf_object *pobj, char *buf,
> +                                 u8 channel, u8 event)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       u64 v, count;
> +
> +       if (event > CACHE_EVNT_MAX || channel > CACHE_CHANNEL_MAX)
> +               return -EINVAL;

This would only happen if there was a coding error using one of the
macros below, right?

> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       /* set channel access type and cache event code. */
> +       v = readq(base + CACHE_CTRL);
> +       v &= ~(CACHE_CHANNEL_SEL | CACHE_CTRL_EVNT);
> +       v |= FIELD_PREP(CACHE_CHANNEL_SEL, channel);
> +       v |= FIELD_PREP(CACHE_CTRL_EVNT, event);
> +       writeq(v, base + CACHE_CTRL);
> +
> +       if (readq_poll_timeout(base + CACHE_CNTR0, v,
> +                              FIELD_GET(CACHE_CNTR_EVNT, v) == event,
> +                              1, PERF_TIMEOUT)) {
> +               dev_err(&feature->pdev->dev, "timeout, unmatched cache event type in counter registers.\n");
> +               mutex_unlock(&pdata->lock);
> +               return -ETIMEDOUT;
> +       }
> +
> +       v = readq(base + CACHE_CNTR0);
> +       count = FIELD_GET(CACHE_CNTR_EVNT_CNTR, v);
> +       v = readq(base + CACHE_CNTR1);
> +       count += FIELD_GET(CACHE_CNTR_EVNT_CNTR, v);
> +       mutex_unlock(&pdata->lock);
> +
> +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> +}
> +
> +#define CACHE_SHOW(name, type, event)                                  \
> +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> +{                                                                      \
> +       return read_cache_counter(pobj, buf, type, event);              \
> +}                                                                      \
> +static PERF_OBJ_ATTR_RO(name)
> +
> +CACHE_SHOW(read_hit, CACHE_CHANNEL_RD, CACHE_EVNT_RD_HIT);
> +CACHE_SHOW(read_miss, CACHE_CHANNEL_RD, CACHE_EVNT_RD_MISS);
> +CACHE_SHOW(write_hit, CACHE_CHANNEL_WR, CACHE_EVNT_WR_HIT);
> +CACHE_SHOW(write_miss, CACHE_CHANNEL_WR, CACHE_EVNT_WR_MISS);
> +CACHE_SHOW(hold_request, CACHE_CHANNEL_RD, CACHE_EVNT_HOLD_REQ);
> +CACHE_SHOW(tx_req_stall, CACHE_CHANNEL_RD, CACHE_EVNT_TX_REQ_STALL);
> +CACHE_SHOW(rx_req_stall, CACHE_CHANNEL_RD, CACHE_EVNT_RX_REQ_STALL);
> +CACHE_SHOW(rx_eviction, CACHE_CHANNEL_RD, CACHE_EVNT_EVICTIONS);
> +CACHE_SHOW(data_write_port_contention, CACHE_CHANNEL_WR,
> +          CACHE_EVNT_DATA_WR_PORT_CONTEN);
> +CACHE_SHOW(tag_write_port_contention, CACHE_CHANNEL_WR,
> +          CACHE_EVNT_TAG_WR_PORT_CONTEN);
> +
> +static struct attribute *cache_attrs[] = {
> +       &perf_obj_attr_read_hit.attr,
> +       &perf_obj_attr_read_miss.attr,
> +       &perf_obj_attr_write_hit.attr,
> +       &perf_obj_attr_write_miss.attr,
> +       &perf_obj_attr_hold_request.attr,
> +       &perf_obj_attr_data_write_port_contention.attr,
> +       &perf_obj_attr_tag_write_port_contention.attr,
> +       &perf_obj_attr_tx_req_stall.attr,
> +       &perf_obj_attr_rx_req_stall.attr,
> +       &perf_obj_attr_rx_eviction.attr,
> +       &perf_obj_attr_cache_freeze.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group cache_attr_group = {
> +       .attrs = cache_attrs,
> +};
> +
> +static const struct attribute_group *cache_attr_groups[] = {
> +       &cache_attr_group,
> +       NULL,
> +};
> +
> +static int create_perf_cache_obj(struct perf_object *perf_dev)
> +{
> +       struct perf_object *pobj;
> +
> +       pobj = create_perf_obj(perf_dev->feature, &perf_dev->kobj,
> +                              PERF_OBJ_ROOT_ID, cache_attr_groups, "cache");
> +       if (IS_ERR(pobj))
> +               return PTR_ERR(pobj);
> +
> +       list_add(&pobj->node, &perf_dev->children);
> +
> +       return 0;
> +}
> +
> +/*
> + * Counter Sysfs Interfaces for VT-D / IOMMU.
> + */
> +static ssize_t vtd_freeze_show(struct perf_object *pobj, char *buf)
> +{
> +       void __iomem *base = pobj->feature->ioaddr;
> +       u64 v;
> +
> +       v = readq(base + VTD_CTRL);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(VTD_FREEZE_CNTR, v));
> +}
> +
> +static ssize_t vtd_freeze_store(struct perf_object *pobj,
> +                               const char *buf, size_t n)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       bool state;
> +       u64 v;
> +
> +       if (strtobool(buf, &state))
> +               return -EINVAL;
> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       v = readq(base + VTD_CTRL);
> +       v &= ~VTD_FREEZE_CNTR;
> +       v |= FIELD_PREP(VTD_FREEZE_CNTR, state ? 1 : 0);
> +       writeq(v, base + VTD_CTRL);
> +       mutex_unlock(&pdata->lock);
> +
> +       return n;
> +}
> +static PERF_OBJ_ATTR(vtd_freeze, freeze, 0644,
> +                    vtd_freeze_show, vtd_freeze_store);
> +
> +static struct attribute *iommu_top_attrs[] = {
> +       &perf_obj_attr_vtd_freeze.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group iommu_top_attr_group = {
> +       .attrs = iommu_top_attrs,
> +};
> +
> +static ssize_t read_iommu_sip_counter(struct perf_object *pobj,
> +                                     u8 event, char *buf)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       u64 v, count;
> +
> +       if (event > VTD_SIP_EVNT_MAX)
> +               return -EINVAL;
> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       v = readq(base + VTD_SIP_CTRL);
> +       v &= ~VTD_SIP_CTRL_EVNT;
> +       v |= FIELD_PREP(VTD_SIP_CTRL_EVNT, event);
> +       writeq(v, base + VTD_SIP_CTRL);
> +
> +       if (readq_poll_timeout(base + VTD_SIP_CNTR, v,
> +                              FIELD_GET(VTD_SIP_CNTR_EVNT, v) == event,
> +                              1, PERF_TIMEOUT)) {
> +               dev_err(&feature->pdev->dev, "timeout, unmatched VTd SIP event type in counter registers\n");
> +               mutex_unlock(&pdata->lock);
> +               return -ETIMEDOUT;
> +       }
> +
> +       v = readq(base + VTD_SIP_CNTR);
> +       count = FIELD_GET(VTD_SIP_CNTR_EVNT_CNTR, v);
> +       mutex_unlock(&pdata->lock);
> +
> +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> +}
> +
> +#define VTD_SIP_SHOW(name, event)                                      \
> +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> +{                                                                      \
> +       return read_iommu_sip_counter(pobj, event, buf);                \
> +}                                                                      \
> +static PERF_OBJ_ATTR_RO(name)
> +
> +VTD_SIP_SHOW(iotlb_4k_hit, VTD_SIP_EVNT_IOTLB_4K_HIT);
> +VTD_SIP_SHOW(iotlb_2m_hit, VTD_SIP_EVNT_IOTLB_2M_HIT);
> +VTD_SIP_SHOW(iotlb_1g_hit, VTD_SIP_EVNT_IOTLB_1G_HIT);
> +VTD_SIP_SHOW(slpwc_l3_hit, VTD_SIP_EVNT_SLPWC_L3_HIT);
> +VTD_SIP_SHOW(slpwc_l4_hit, VTD_SIP_EVNT_SLPWC_L4_HIT);
> +VTD_SIP_SHOW(rcc_hit, VTD_SIP_EVNT_RCC_HIT);
> +VTD_SIP_SHOW(iotlb_4k_miss, VTD_SIP_EVNT_IOTLB_4K_MISS);
> +VTD_SIP_SHOW(iotlb_2m_miss, VTD_SIP_EVNT_IOTLB_2M_MISS);
> +VTD_SIP_SHOW(iotlb_1g_miss, VTD_SIP_EVNT_IOTLB_1G_MISS);
> +VTD_SIP_SHOW(slpwc_l3_miss, VTD_SIP_EVNT_SLPWC_L3_MISS);
> +VTD_SIP_SHOW(slpwc_l4_miss, VTD_SIP_EVNT_SLPWC_L4_MISS);
> +VTD_SIP_SHOW(rcc_miss, VTD_SIP_EVNT_RCC_MISS);
> +
> +static struct attribute *iommu_sip_attrs[] = {
> +       &perf_obj_attr_iotlb_4k_hit.attr,
> +       &perf_obj_attr_iotlb_2m_hit.attr,
> +       &perf_obj_attr_iotlb_1g_hit.attr,
> +       &perf_obj_attr_slpwc_l3_hit.attr,
> +       &perf_obj_attr_slpwc_l4_hit.attr,
> +       &perf_obj_attr_rcc_hit.attr,
> +       &perf_obj_attr_iotlb_4k_miss.attr,
> +       &perf_obj_attr_iotlb_2m_miss.attr,
> +       &perf_obj_attr_iotlb_1g_miss.attr,
> +       &perf_obj_attr_slpwc_l3_miss.attr,
> +       &perf_obj_attr_slpwc_l4_miss.attr,
> +       &perf_obj_attr_rcc_miss.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group iommu_sip_attr_group = {
> +       .attrs = iommu_sip_attrs,
> +};
> +
> +static const struct attribute_group *iommu_top_attr_groups[] = {
> +       &iommu_top_attr_group,
> +       &iommu_sip_attr_group,
> +       NULL,
> +};
> +
> +static ssize_t read_iommu_counter(struct perf_object *pobj, u8 event, char *buf)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       u64 v, count;
> +
> +       if (event > VTD_EVNT_MAX)
> +               return -EINVAL;
> +
> +       event += pobj->id;
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       v = readq(base + VTD_CTRL);
> +       v &= ~VTD_CTRL_EVNT;
> +       v |= FIELD_PREP(VTD_CTRL_EVNT, event);
> +       writeq(v, base + VTD_CTRL);
> +
> +       if (readq_poll_timeout(base + VTD_CNTR, v,
> +                              FIELD_GET(VTD_CNTR_EVNT, v) == event, 1,
> +                              PERF_TIMEOUT)) {
> +               dev_err(&feature->pdev->dev, "timeout, unmatched VTd event type in counter registers\n");
> +               mutex_unlock(&pdata->lock);
> +               return -ETIMEDOUT;
> +       }
> +
> +       v = readq(base + VTD_CNTR);
> +       count = FIELD_GET(VTD_CNTR_EVNT_CNTR, v);
> +       mutex_unlock(&pdata->lock);
> +
> +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> +}
> +
> +#define VTD_SHOW(name, base_event)                                     \
> +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> +{                                                                      \
> +       return read_iommu_counter(pobj, base_event, buf);               \
> +}                                                                      \
> +static PERF_OBJ_ATTR_RO(name)
> +
> +VTD_SHOW(read_transaction, VTD_EVNT_AFU_MEM_RD_TRANS);
> +VTD_SHOW(write_transaction, VTD_EVNT_AFU_MEM_WR_TRANS);
> +VTD_SHOW(devtlb_read_hit, VTD_EVNT_AFU_DEVTLB_RD_HIT);
> +VTD_SHOW(devtlb_write_hit, VTD_EVNT_AFU_DEVTLB_WR_HIT);
> +VTD_SHOW(devtlb_4k_fill, VTD_EVNT_DEVTLB_4K_FILL);
> +VTD_SHOW(devtlb_2m_fill, VTD_EVNT_DEVTLB_2M_FILL);
> +VTD_SHOW(devtlb_1g_fill, VTD_EVNT_DEVTLB_1G_FILL);
> +
> +static struct attribute *iommu_attrs[] = {
> +       &perf_obj_attr_read_transaction.attr,
> +       &perf_obj_attr_write_transaction.attr,
> +       &perf_obj_attr_devtlb_read_hit.attr,
> +       &perf_obj_attr_devtlb_write_hit.attr,
> +       &perf_obj_attr_devtlb_4k_fill.attr,
> +       &perf_obj_attr_devtlb_2m_fill.attr,
> +       &perf_obj_attr_devtlb_1g_fill.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group iommu_attr_group = {
> +       .attrs = iommu_attrs,
> +};
> +
> +static const struct attribute_group *iommu_attr_groups[] = {
> +       &iommu_attr_group,
> +       NULL,
> +};
> +
> +#define PERF_MAX_PORT_NUM      1
> +
> +static int create_perf_iommu_obj(struct perf_object *perf_dev)
> +{
> +       struct dfl_feature *feature = perf_dev->feature;
> +       struct device *dev = &feature->pdev->dev;
> +       struct perf_object *pobj, *obj;
> +       void __iomem *base;
> +       u64 v;
> +       int i;
> +
> +       /* check if iommu is not supported on this device. */
> +       base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
> +       v = readq(base + FME_HDR_CAP);
> +       if (!FIELD_GET(FME_CAP_IOMMU_AVL, v))
> +               return 0;
> +
> +       pobj = create_perf_obj(feature, &perf_dev->kobj, PERF_OBJ_ROOT_ID,
> +                              iommu_top_attr_groups, "iommu");
> +       if (IS_ERR(pobj))
> +               return PTR_ERR(pobj);
> +
> +       list_add(&pobj->node, &perf_dev->children);
> +
> +       for (i = 0; i < PERF_MAX_PORT_NUM; i++) {
> +               obj = create_perf_obj(feature, &pobj->kobj, i,
> +                                     iommu_attr_groups, "afu");
> +               if (IS_ERR(obj))
> +                       return PTR_ERR(obj);
> +
> +               list_add(&obj->node, &pobj->children);
> +       }
> +
> +       return 0;
> +}
> +
> +/*
> + * Counter Sysfs Interfaces for Fabric
> + */
> +static bool fabric_pobj_is_enabled(struct perf_object *pobj)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       void __iomem *base = feature->ioaddr;
> +       u64 v;
> +
> +       v = readq(base + FAB_CTRL);
> +
> +       if (FIELD_GET(FAB_PORT_FILTER, v) == FAB_PORT_FILTER_DISABLE)
> +               return pobj->id == PERF_OBJ_ROOT_ID;
> +
> +       return pobj->id == FIELD_GET(FAB_PORT_ID, v);
> +}
> +
> +static ssize_t read_fabric_counter(struct perf_object *pobj,
> +                                  u8 event, char *buf)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       u64 v, count = 0;
> +
> +       if (event > FAB_EVNT_MAX)
> +               return -EINVAL;
> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       /* if it is disabled, force the counter to return zero. */
> +       if (!fabric_pobj_is_enabled(pobj))
> +               goto exit;
> +
> +       v = readq(base + FAB_CTRL);
> +       v &= ~FAB_CTRL_EVNT;
> +       v |= FIELD_PREP(FAB_CTRL_EVNT, event);
> +       writeq(v, base + FAB_CTRL);
> +
> +       if (readq_poll_timeout(base + FAB_CNTR, v,
> +                              FIELD_GET(FAB_CNTR_EVNT, v) == event,
> +                              1, PERF_TIMEOUT)) {
> +               dev_err(&feature->pdev->dev, "timeout, unmatched fab event type in counter registers.\n");
> +               mutex_unlock(&pdata->lock);
> +               return -ETIMEDOUT;
> +       }
> +
> +       v = readq(base + FAB_CNTR);
> +       count = FIELD_GET(FAB_CNTR_EVNT_CNTR, v);
> +exit:
> +       mutex_unlock(&pdata->lock);
> +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> +}
> +
> +#define FAB_SHOW(name, event)                                          \
> +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> +{                                                                      \
> +       return read_fabric_counter(pobj, event, buf);                   \
> +}                                                                      \
> +static PERF_OBJ_ATTR_RO(name)
> +
> +FAB_SHOW(pcie0_read, FAB_EVNT_PCIE0_RD);
> +FAB_SHOW(pcie0_write, FAB_EVNT_PCIE0_WR);
> +FAB_SHOW(pcie1_read, FAB_EVNT_PCIE1_RD);
> +FAB_SHOW(pcie1_write, FAB_EVNT_PCIE1_WR);
> +FAB_SHOW(upi_read, FAB_EVNT_UPI_RD);
> +FAB_SHOW(upi_write, FAB_EVNT_UPI_WR);
> +FAB_SHOW(mmio_read, FAB_EVNT_MMIO_RD);
> +FAB_SHOW(mmio_write, FAB_EVNT_MMIO_WR);
> +
> +static ssize_t fab_enable_show(struct perf_object *pobj, char *buf)
> +{
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)!!fabric_pobj_is_enabled(pobj));
> +}
> +
> +/*
> + * If enable one port or all port event counter in fabric, other
> + * fabric event counter originally enabled will be disable automatically.
> + */
> +static ssize_t fab_enable_store(struct perf_object *pobj,
> +                               const char *buf, size_t n)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       bool state;
> +       u64 v;
> +
> +       if (strtobool(buf, &state) || !state)
> +               return -EINVAL;
> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       /* if it is already enabled. */
> +       if (fabric_pobj_is_enabled(pobj))
> +               return n;
> +
> +       mutex_lock(&pdata->lock);
> +       v = readq(base + FAB_CTRL);
> +       v &= ~(FAB_PORT_FILTER | FAB_PORT_ID);
> +
> +       if (pobj->id == PERF_OBJ_ROOT_ID) {
> +               v |= FIELD_PREP(FAB_PORT_FILTER, FAB_PORT_FILTER_DISABLE);
> +       } else {
> +               v |= FIELD_PREP(FAB_PORT_FILTER, FAB_PORT_FILTER_ENABLE);
> +               v |= FIELD_PREP(FAB_PORT_ID, pobj->id);
> +       }
> +       writeq(v, base + FAB_CTRL);
> +       mutex_unlock(&pdata->lock);
> +
> +       return n;
> +}
> +static PERF_OBJ_ATTR(fab_enable, enable, 0644,
> +                    fab_enable_show, fab_enable_store);
> +
> +static struct attribute *fabric_attrs[] = {
> +       &perf_obj_attr_pcie0_read.attr,
> +       &perf_obj_attr_pcie0_write.attr,
> +       &perf_obj_attr_pcie1_read.attr,
> +       &perf_obj_attr_pcie1_write.attr,
> +       &perf_obj_attr_upi_read.attr,
> +       &perf_obj_attr_upi_write.attr,
> +       &perf_obj_attr_mmio_read.attr,
> +       &perf_obj_attr_mmio_write.attr,
> +       &perf_obj_attr_fab_enable.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group fabric_attr_group = {
> +       .attrs = fabric_attrs,
> +};
> +
> +static const struct attribute_group *fabric_attr_groups[] = {
> +       &fabric_attr_group,
> +       NULL,
> +};
> +
> +static ssize_t fab_freeze_show(struct perf_object *pobj, char *buf)
> +{
> +       void __iomem *base = pobj->feature->ioaddr;
> +       u64 v;
> +
> +       v = readq(base + FAB_CTRL);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> +                        (unsigned int)FIELD_GET(FAB_FREEZE_CNTR, v));
> +}
> +
> +static ssize_t fab_freeze_store(struct perf_object *pobj,
> +                               const char *buf, size_t n)
> +{
> +       struct dfl_feature *feature = pobj->feature;
> +       struct dfl_feature_platform_data *pdata;
> +       void __iomem *base = feature->ioaddr;
> +       bool state;
> +       u64 v;
> +
> +       if (strtobool(buf, &state))
> +               return -EINVAL;
> +
> +       pdata = dev_get_platdata(&feature->pdev->dev);
> +
> +       mutex_lock(&pdata->lock);
> +       v = readq(base + FAB_CTRL);
> +       v &= ~FAB_FREEZE_CNTR;
> +       v |= FIELD_PREP(FAB_FREEZE_CNTR, state ? 1 : 0);
> +       writeq(v, base + FAB_CTRL);
> +       mutex_unlock(&pdata->lock);
> +
> +       return n;
> +}
> +static PERF_OBJ_ATTR(fab_freeze, freeze, 0644,
> +                    fab_freeze_show, fab_freeze_store);

PERF_OBJ_ATTR_RW ?  Also in a few other places, wherever '0644' shows up.



> +
> +static struct attribute *fabric_top_attrs[] = {
> +       &perf_obj_attr_fab_freeze.attr,
> +       NULL,
> +};
> +
> +static struct attribute_group fabric_top_attr_group = {
> +       .attrs = fabric_top_attrs,
> +};
> +
> +static const struct attribute_group *fabric_top_attr_groups[] = {
> +       &fabric_attr_group,
> +       &fabric_top_attr_group,
> +       NULL,
> +};
> +
> +static int create_perf_fabric_obj(struct perf_object *perf_dev)
> +{
> +       struct perf_object *pobj, *obj;
> +       int i;
> +
> +       pobj = create_perf_obj(perf_dev->feature, &perf_dev->kobj,
> +                              PERF_OBJ_ROOT_ID, fabric_top_attr_groups,
> +                              "fabric");
> +       if (IS_ERR(pobj))
> +               return PTR_ERR(pobj);
> +
> +       list_add(&pobj->node, &perf_dev->children);
> +
> +       for (i = 0; i < PERF_MAX_PORT_NUM; i++) {
> +               obj = create_perf_obj(perf_dev->feature, &pobj->kobj, i,
> +                                     fabric_attr_groups, "port");
> +               if (IS_ERR(obj))
> +                       return PTR_ERR(obj);
> +
> +               list_add(&obj->node, &pobj->children);
> +       }
> +
> +       return 0;
> +}
> +
> +static int fme_perf_init(struct platform_device *pdev,
> +                        struct dfl_feature *feature)
> +{
> +       struct perf_object *perf_dev;
> +       int ret;
> +
> +       perf_dev = create_perf_dev(feature);
> +       if (IS_ERR(perf_dev))
> +               return PTR_ERR(perf_dev);
> +
> +       ret = create_perf_fabric_obj(perf_dev);
> +       if (ret)
> +               goto done;
> +
> +       if (feature->id == FME_FEATURE_ID_GLOBAL_IPERF) {
> +               /*
> +                * Cache and IOMMU(VT-D) performance counters are not supported
> +                * on discreted solutions e.g. Intel Programmable Acceleration
> +                * Card based on PCIe.
> +                */
> +               ret = create_perf_cache_obj(perf_dev);
> +               if (ret)
> +                       goto done;
> +
> +               ret = create_perf_iommu_obj(perf_dev);
> +               if (ret)
> +                       goto done;
> +       }
> +
> +       feature->priv = perf_dev;
> +       return 0;
> +
> +done:
> +       destroy_perf_obj(perf_dev);
> +       return ret;
> +}
> +
> +static void fme_perf_uinit(struct platform_device *pdev,
> +                          struct dfl_feature *feature)
> +{
> +       struct perf_object *perf_dev = feature->priv;
> +
> +       destroy_perf_obj(perf_dev);
> +}
> +
> +const struct dfl_feature_id fme_perf_id_table[] = {
> +       {.id = FME_FEATURE_ID_GLOBAL_IPERF,},
> +       {.id = FME_FEATURE_ID_GLOBAL_DPERF,},
> +       {0,}
> +};
> +
> +const struct dfl_feature_ops fme_perf_ops = {
> +       .init = fme_perf_init,
> +       .uinit = fme_perf_uinit,
> +};
> diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
> index 5fbe3f5..dc71048 100644
> --- a/drivers/fpga/dfl-fme.h
> +++ b/drivers/fpga/dfl-fme.h
> @@ -39,5 +39,7 @@ struct dfl_fme {
>  extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
>  extern const struct dfl_feature_ops fme_global_err_ops;
>  extern const struct dfl_feature_id fme_global_err_id_table[];
> +extern const struct dfl_feature_ops fme_perf_ops;
> +extern const struct dfl_feature_id fme_perf_id_table[];
>
>  #endif /* __DFL_FME_H */
> diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> index 65f91ef..637692a 100644
> --- a/drivers/fpga/dfl.c
> +++ b/drivers/fpga/dfl.c
> @@ -507,6 +507,7 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
>                 struct dfl_feature *feature = &pdata->features[index];
>
>                 /* save resource information for each feature */
> +               feature->pdev = fdev;
>                 feature->id = finfo->fid;
>                 feature->resource_index = index;
>                 feature->ioaddr = finfo->ioaddr;
> diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
> index 6c32080..bf23436 100644
> --- a/drivers/fpga/dfl.h
> +++ b/drivers/fpga/dfl.h
> @@ -191,6 +191,7 @@ struct dfl_feature_driver {
>  /**
>   * struct dfl_feature - sub feature of the feature devices
>   *
> + * @pdev: parent platform device.
>   * @id: sub feature id.
>   * @resource_index: each sub feature has one mmio resource for its registers.
>   *                 this index is used to find its mmio resource from the
> @@ -200,6 +201,7 @@ struct dfl_feature_driver {
>   * @priv: priv data of this feature.
>   */
>  struct dfl_feature {
> +       struct platform_device *pdev;
>         u64 id;
>         int resource_index;
>         void __iomem *ioaddr;
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR
  2019-04-29  8:55 ` [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR Wu Hao
@ 2019-05-16 17:35   ` Alan Tull
  2019-05-17  3:50     ` Wu Hao
  0 siblings, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-16 17:35 UTC (permalink / raw)
  To: Wu Hao, Scott Wood
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api,
	Ananda Ravuri, Xu Yilun

On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:

It looks like this addressed the review comments.  Adding my Ack.  Is
there anything else on this patch?

Alan

>
> In early partial reconfiguration private feature, it only
> supports 32bit data width when writing data to hardware for
> PR. 512bit data width PR support is an important optimization
> for some specific solutions (e.g. XEON with FPGA integrated),
> it allows driver to use AVX512 instruction to improve the
> performance of partial reconfiguration. e.g. programming one
> 100MB bitstream image via this 512bit data width PR hardware
> only takes ~300ms, but 32bit revision requires ~3s per test
> result.
>
> Please note now this optimization is only done on revision 2
> of this PR private feature which is only used in integrated
> solution that AVX512 is always supported. This revision 2
> hardware doesn't support 32bit PR.
>
> Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>

Acked-by: Alan Tull <atull@kernel.org>


> ---
> v2: check AVX512 support using cpu_feature_enabled()
>     fix other comments from Scott Wood <swood@redhat.com>
> ---
>  drivers/fpga/dfl-fme-main.c |   3 ++
>  drivers/fpga/dfl-fme-mgr.c  | 113 +++++++++++++++++++++++++++++++++++++-------
>  drivers/fpga/dfl-fme-pr.c   |  43 +++++++++++------
>  drivers/fpga/dfl-fme.h      |   2 +
>  drivers/fpga/dfl.h          |   5 ++
>  5 files changed, 135 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> index 086ad24..076d74f 100644
> --- a/drivers/fpga/dfl-fme-main.c
> +++ b/drivers/fpga/dfl-fme-main.c
> @@ -21,6 +21,8 @@
>  #include "dfl.h"
>  #include "dfl-fme.h"
>
> +#define DRV_VERSION    "0.8"
> +
>  static ssize_t ports_num_show(struct device *dev,
>                               struct device_attribute *attr, char *buf)
>  {
> @@ -277,3 +279,4 @@ static int fme_remove(struct platform_device *pdev)
>  MODULE_AUTHOR("Intel Corporation");
>  MODULE_LICENSE("GPL v2");
>  MODULE_ALIAS("platform:dfl-fme");
> +MODULE_VERSION(DRV_VERSION);
> diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c
> index b3f7eee..d1a4ba5 100644
> --- a/drivers/fpga/dfl-fme-mgr.c
> +++ b/drivers/fpga/dfl-fme-mgr.c
> @@ -22,14 +22,18 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/fpga/fpga-mgr.h>
>
> +#include "dfl.h"
>  #include "dfl-fme-pr.h"
>
> +#define DRV_VERSION    "0.8"
> +
>  /* FME Partial Reconfiguration Sub Feature Register Set */
>  #define FME_PR_DFH             0x0
>  #define FME_PR_CTRL            0x8
>  #define FME_PR_STS             0x10
>  #define FME_PR_DATA            0x18
>  #define FME_PR_ERR             0x20
> +#define FME_PR_512_DATA                0x40 /* Data Register for 512bit datawidth PR */
>  #define FME_PR_INTFC_ID_L      0xA8
>  #define FME_PR_INTFC_ID_H      0xB0
>
> @@ -67,8 +71,43 @@
>  #define PR_WAIT_TIMEOUT   8000000
>  #define PR_HOST_STATUS_IDLE    0
>
> +#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512)
> +
> +#include <linux/cpufeature.h>
> +#include <asm/fpu/api.h>
> +
> +static inline int is_cpu_avx512_enabled(void)
> +{
> +       return cpu_feature_enabled(X86_FEATURE_AVX512F);
> +}
> +
> +static inline void copy512(const void *src, void __iomem *dst)
> +{
> +       kernel_fpu_begin();
> +
> +       asm volatile("vmovdqu64 (%0), %%zmm0;"
> +                    "vmovntdq %%zmm0, (%1);"
> +                    :
> +                    : "r"(src), "r"(dst)
> +                    : "memory");
> +
> +       kernel_fpu_end();
> +}
> +#else
> +static inline int is_cpu_avx512_enabled(void)
> +{
> +       return 0;
> +}
> +
> +static inline void copy512(const void *src, void __iomem *dst)
> +{
> +       WARN_ON_ONCE(1);
> +}
> +#endif
> +
>  struct fme_mgr_priv {
>         void __iomem *ioaddr;
> +       unsigned int pr_datawidth;
>         u64 pr_error;
>  };
>
> @@ -169,7 +208,7 @@ static int fme_mgr_write(struct fpga_manager *mgr,
>         struct fme_mgr_priv *priv = mgr->priv;
>         void __iomem *fme_pr = priv->ioaddr;
>         u64 pr_ctrl, pr_status, pr_data;
> -       int delay = 0, pr_credit, i = 0;
> +       int ret = 0, delay = 0, pr_credit;
>
>         dev_dbg(dev, "start request\n");
>
> @@ -181,9 +220,9 @@ static int fme_mgr_write(struct fpga_manager *mgr,
>
>         /*
>          * driver can push data to PR hardware using PR_DATA register once HW
> -        * has enough pr_credit (> 1), pr_credit reduces one for every 32bit
> -        * pr data write to PR_DATA register. If pr_credit <= 1, driver needs
> -        * to wait for enough pr_credit from hardware by polling.
> +        * has enough pr_credit (> 1), pr_credit reduces one for every pr data
> +        * width write to PR_DATA register. If pr_credit <= 1, driver needs to
> +        * wait for enough pr_credit from hardware by polling.
>          */
>         pr_status = readq(fme_pr + FME_PR_STS);
>         pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
> @@ -192,7 +231,8 @@ static int fme_mgr_write(struct fpga_manager *mgr,
>                 while (pr_credit <= 1) {
>                         if (delay++ > PR_WAIT_TIMEOUT) {
>                                 dev_err(dev, "PR_CREDIT timeout\n");
> -                               return -ETIMEDOUT;
> +                               ret = -ETIMEDOUT;
> +                               goto done;
>                         }
>                         udelay(1);
>
> @@ -200,21 +240,27 @@ static int fme_mgr_write(struct fpga_manager *mgr,
>                         pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
>                 }
>
> -               if (count < 4) {
> -                       dev_err(dev, "Invalid PR bitstream size\n");
> -                       return -EINVAL;
> +               WARN_ON(count < priv->pr_datawidth);
> +
> +               switch (priv->pr_datawidth) {
> +               case 4:
> +                       pr_data = FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
> +                                            *(u32 *)buf);
> +                       writeq(pr_data, fme_pr + FME_PR_DATA);
> +                       break;
> +               case 64:
> +                       copy512(buf, fme_pr + FME_PR_512_DATA);
> +                       break;
> +               default:
> +                       WARN_ON_ONCE(1);
>                 }
> -
> -               pr_data = 0;
> -               pr_data |= FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
> -                                     *(((u32 *)buf) + i));
> -               writeq(pr_data, fme_pr + FME_PR_DATA);
> -               count -= 4;
> +               buf += priv->pr_datawidth;
> +               count -= priv->pr_datawidth;
>                 pr_credit--;
> -               i++;
>         }
>
> -       return 0;
> +done:
> +       return ret;
>  }
>
>  static int fme_mgr_write_complete(struct fpga_manager *mgr,
> @@ -279,6 +325,36 @@ static void fme_mgr_get_compat_id(void __iomem *fme_pr,
>         id->id_h = readq(fme_pr + FME_PR_INTFC_ID_H);
>  }
>
> +static u8 fme_mgr_get_pr_datawidth(struct device *dev, void __iomem *fme_pr)
> +{
> +       u8 revision = dfl_feature_revision(fme_pr);
> +
> +       if (revision < 2) {
> +               /*
> +                * revision 0 and 1 only support 32bit data width partial
> +                * reconfiguration, so pr_datawidth is 4 (Byte).
> +                */
> +               return 4;
> +       } else if (revision == 2) {
> +               /*
> +                * revision 2 hardware has optimization to support 512bit data
> +                * width partial reconfiguration with AVX512 instructions. So
> +                * pr_datawidth is 64 (Byte). As revision 2 hardware is only
> +                * used in integrated solution, CPU supports AVX512 instructions
> +                * for sure, but it still needs to check here as AVX512 could be
> +                * disabled in kernel (e.g. using clearcpuid boot option).
> +                */
> +               if (is_cpu_avx512_enabled())
> +                       return 64;
> +
> +               dev_err(dev, "revision 2: AVX512 is disabled\n");
> +               return 0;
> +       }
> +
> +       dev_err(dev, "revision %d is not supported yet\n", revision);
> +       return 0;
> +}
> +
>  static int fme_mgr_probe(struct platform_device *pdev)
>  {
>         struct dfl_fme_mgr_pdata *pdata = dev_get_platdata(&pdev->dev);
> @@ -302,6 +378,10 @@ static int fme_mgr_probe(struct platform_device *pdev)
>                         return PTR_ERR(priv->ioaddr);
>         }
>
> +       priv->pr_datawidth = fme_mgr_get_pr_datawidth(dev, priv->ioaddr);
> +       if (!priv->pr_datawidth)
> +               return -ENODEV;
> +
>         compat_id = devm_kzalloc(dev, sizeof(*compat_id), GFP_KERNEL);
>         if (!compat_id)
>                 return -ENOMEM;
> @@ -342,3 +422,4 @@ static int fme_mgr_remove(struct platform_device *pdev)
>  MODULE_AUTHOR("Intel Corporation");
>  MODULE_LICENSE("GPL v2");
>  MODULE_ALIAS("platform:dfl-fme-mgr");
> +MODULE_VERSION(DRV_VERSION);
> diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
> index 3c71dc3..cd94ba8 100644
> --- a/drivers/fpga/dfl-fme-pr.c
> +++ b/drivers/fpga/dfl-fme-pr.c
> @@ -83,7 +83,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>         if (copy_from_user(&port_pr, argp, minsz))
>                 return -EFAULT;
>
> -       if (port_pr.argsz < minsz || port_pr.flags)
> +       if (port_pr.argsz < minsz || port_pr.flags || !port_pr.buffer_size)
>                 return -EINVAL;
>
>         /* get fme header region */
> @@ -101,15 +101,25 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>                        port_pr.buffer_size))
>                 return -EFAULT;
>
> +       mutex_lock(&pdata->lock);
> +       fme = dfl_fpga_pdata_get_private(pdata);
> +       /* fme device has been unregistered. */
> +       if (!fme) {
> +               ret = -EINVAL;
> +               goto unlock_exit;
> +       }
> +
>         /*
>          * align PR buffer per PR bandwidth, as HW ignores the extra padding
>          * data automatically.
>          */
> -       length = ALIGN(port_pr.buffer_size, 4);
> +       length = ALIGN(port_pr.buffer_size, fme->pr_datawidth);
>
>         buf = vmalloc(length);
> -       if (!buf)
> -               return -ENOMEM;
> +       if (!buf) {
> +               ret = -ENOMEM;
> +               goto unlock_exit;
> +       }
>
>         if (copy_from_user(buf,
>                            (void __user *)(unsigned long)port_pr.buffer_address,
> @@ -127,18 +137,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>
>         info->flags |= FPGA_MGR_PARTIAL_RECONFIG;
>
> -       mutex_lock(&pdata->lock);
> -       fme = dfl_fpga_pdata_get_private(pdata);
> -       /* fme device has been unregistered. */
> -       if (!fme) {
> -               ret = -EINVAL;
> -               goto unlock_exit;
> -       }
> -
>         region = dfl_fme_region_find(fme, port_pr.port_id);
>         if (!region) {
>                 ret = -EINVAL;
> -               goto unlock_exit;
> +               goto free_exit;
>         }
>
>         fpga_image_info_free(region->info);
> @@ -159,10 +161,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
>                 fpga_bridges_put(&region->bridge_list);
>
>         put_device(&region->dev);
> -unlock_exit:
> -       mutex_unlock(&pdata->lock);
>  free_exit:
>         vfree(buf);
> +unlock_exit:
> +       mutex_unlock(&pdata->lock);
>         return ret;
>  }
>
> @@ -388,6 +390,17 @@ static int pr_mgmt_init(struct platform_device *pdev,
>         mutex_lock(&pdata->lock);
>         priv = dfl_fpga_pdata_get_private(pdata);
>
> +       /*
> +        * Initialize PR data width.
> +        * Only revision 2 supports 512bit datawidth for better performance,
> +        * other revisions use default 32bit datawidth. This is used for
> +        * buffer alignment.
> +        */
> +       if (dfl_feature_revision(feature->ioaddr) == 2)
> +               priv->pr_datawidth = 64;
> +       else
> +               priv->pr_datawidth = 4;
> +
>         /* Initialize the region and bridge sub device list */
>         INIT_LIST_HEAD(&priv->region_list);
>         INIT_LIST_HEAD(&priv->bridge_list);
> diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
> index 5394a21..de20755 100644
> --- a/drivers/fpga/dfl-fme.h
> +++ b/drivers/fpga/dfl-fme.h
> @@ -21,12 +21,14 @@
>  /**
>   * struct dfl_fme - dfl fme private data
>   *
> + * @pr_datawidth: data width for partial reconfiguration.
>   * @mgr: FME's FPGA manager platform device.
>   * @region_list: linked list of FME's FPGA regions.
>   * @bridge_list: linked list of FME's FPGA bridges.
>   * @pdata: fme platform device's pdata.
>   */
>  struct dfl_fme {
> +       int pr_datawidth;
>         struct platform_device *mgr;
>         struct list_head region_list;
>         struct list_head bridge_list;
> diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
> index a8b869e..8851c6c 100644
> --- a/drivers/fpga/dfl.h
> +++ b/drivers/fpga/dfl.h
> @@ -331,6 +331,11 @@ static inline bool dfl_feature_is_port(void __iomem *base)
>                 (FIELD_GET(DFH_ID, v) == DFH_ID_FIU_PORT);
>  }
>
> +static inline u8 dfl_feature_revision(void __iomem *base)
> +{
> +       return (u8)FIELD_GET(DFH_REVISION, readq(base + DFH));
> +}
> +
>  /**
>   * struct dfl_fpga_enum_info - DFL FPGA enumeration information
>   *
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces.
  2019-04-29  8:55 ` [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces Wu Hao
@ 2019-05-16 17:36   ` Alan Tull
  2019-05-16 17:53     ` Alan Tull
  0 siblings, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-16 17:36 UTC (permalink / raw)
  To: Wu Hao; +Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:
>
> This patch adds virtualization support description for DFL based
> FPGA devices (based on PCIe SRIOV), and introductions to new
> interfaces added by new dfl private feature drivers.
>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>

Acked-by: Alan Tull <atull@kernel.org>

Thanks,
Alan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces.
  2019-05-16 17:36   ` Alan Tull
@ 2019-05-16 17:53     ` Alan Tull
  2019-05-17  4:11       ` Wu Hao
  0 siblings, 1 reply; 42+ messages in thread
From: Alan Tull @ 2019-05-16 17:53 UTC (permalink / raw)
  To: Wu Hao; +Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Thu, May 16, 2019 at 12:36 PM Alan Tull <atull@kernel.org> wrote:
>
> On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:

Hi Hao,

Most of this patchset looks ready to go upstream or nearly so with
pretty straightforward changes .  Patches 17 and 18 need minor changes
and please change the scnprintf in the other patches.  The patches
that had nontrivial changes are the power and thermal ones involving
hwmon.  I'm hoping to send up the patchset minus the hwmon patches in
the next version if there's no unforseen issues.  If the hwmon patches
are ready then also, that's great, but otherwise those patches don't
need to hold up all the rest of the patchset.  How's that sound?

Alan

> >
> > This patch adds virtualization support description for DFL based
> > FPGA devices (based on PCIe SRIOV), and introductions to new
> > interfaces added by new dfl private feature drivers.
> >
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
>
> Acked-by: Alan Tull <atull@kernel.org>
>
> Thanks,
> Alan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 18/18] fpga: dfl: fme: add performance reporting support
  2019-05-16 17:28   ` Alan Tull
@ 2019-05-17  3:48     ` Wu Hao
  0 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-05-17  3:48 UTC (permalink / raw)
  To: Alan Tull
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Xu Yilun

On Thu, May 16, 2019 at 12:28:08PM -0500, Alan Tull wrote:
> On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> Hi Hao,
> 
> >
> > This patch adds support for performance reporting private feature
> > for FPGA Management Engine (FME). Actually it supports 4 categories
> > performance counters, 'clock', 'cache', 'iommu' and 'fabric', user
> > could read the performance counter via exposed sysfs interfaces.
> > Please refer to sysfs doc for more details.
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > ---
> > v2: improve sysfs doc
> > ---
> >  Documentation/ABI/testing/sysfs-platform-dfl-fme |  93 +++
> >  drivers/fpga/Makefile                            |   1 +
> >  drivers/fpga/dfl-fme-main.c                      |   4 +
> >  drivers/fpga/dfl-fme-perf.c                      | 950 +++++++++++++++++++++++
> >  drivers/fpga/dfl-fme.h                           |   2 +
> >  drivers/fpga/dfl.c                               |   1 +
> >  drivers/fpga/dfl.h                               |   2 +
> >  7 files changed, 1053 insertions(+)
> >  create mode 100644 drivers/fpga/dfl-fme-perf.c
> >
> > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > index 503984b..a7f7eb6 100644
> > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > @@ -250,3 +250,96 @@ Description:       Write-only. Write error code to this file to clear all errors
> >                 logged in errors, first_error and next_error. Write fails with
> >                 -EINVAL if input parsing fails or input error code doesn't
> >                 match.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/clock
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read for Accelerator Function Unit (AFU) clock
> > +               counter.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/cache/freeze
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file for the current status of 'cache'
> > +               category performance counters, and Write '1' or '0' to freeze
> > +               or unfreeze 'cache' performance counters.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/cache/<counter>
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read 'cache' category performance counters:
> > +               read_hit, read_miss, write_hit, write_miss, hold_request,
> > +               data_write_port_contention, tag_write_port_contention,
> > +               tx_req_stall, rx_req_stall and rx_eviction.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/iommu/freeze
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file for the current status of 'iommu'
> > +               category performance counters, and Write '1' or '0' to freeze
> > +               or unfreeze 'iommu' performance counters.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/iommu/<sip_counter>
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read 'iommu' category 'sip' sub category
> > +               performance counters: iotlb_4k_hit, iotlb_2m_hit,
> > +               iotlb_1g_hit, slpwc_l3_hit, slpwc_l4_hit, rcc_hit,
> > +               rcc_miss, iotlb_4k_miss, iotlb_2m_miss, iotlb_1g_miss,
> > +               slpwc_l3_miss and slpwc_l4_miss.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/iommu/afu0/<counter>
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read 'iommu' category 'afuX' sub category
> > +               performance counters: read_transaction, write_transaction,
> > +               devtlb_read_hit, devtlb_write_hit, devtlb_4k_fill,
> > +               devtlb_2m_fill and devtlb_1g_fill.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/freeze
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file for the current status of 'fabric'
> > +               category performance counters, and Write '1' or '0' to freeze
> > +               or unfreeze 'fabric' performance counters.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/<counter>
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read 'fabric' category performance counters:
> > +               pcie0_read, pcie0_write, pcie1_read, pcie1_write,
> > +               upi_read, upi_write and mmio_read.
> 
> Also mmio_write

Thanks for the comments, will fix it.

> 
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/enable
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file for current status of device level
> > +               fabric counters. Write "1" to enable device level fabric
> > +               counters. Once device level fabric counters are enabled, port
> > +               level fabric counters will be disabled automatically.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/port0/<counter>
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Only. Read 'fabric' category "portX" sub category
> > +               performance counters: pcie0_read, pcie0_write, pcie1_read,
> > +               pcie1_write, upi_read, upi_write and mmio_read.
> > +
> > +What:          /sys/bus/platform/devices/dfl-fme.0/perf/fabric/port0/enable
> > +Date:          April 2019
> > +KernelVersion:  5.2
> > +Contact:       Wu Hao <hao.wu@intel.com>
> > +Description:   Read-Write. Read this file for current status of port level
> > +               fabric counters. Write "1" to enable port level fabric counters.
> > +               Once port level fabric counters are enabled, device level fabric
> > +               counters will be disabled automatically.
> > diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile
> > index 1a9fa3d..7df3971 100644
> > --- a/drivers/fpga/Makefile
> > +++ b/drivers/fpga/Makefile
> > @@ -39,6 +39,7 @@ obj-$(CONFIG_FPGA_DFL_FME_REGION)     += dfl-fme-region.o
> >  obj-$(CONFIG_FPGA_DFL_AFU)             += dfl-afu.o
> >
> >  dfl-fme-objs := dfl-fme-main.o dfl-fme-pr.o dfl-fme-error.o
> > +dfl-fme-objs += dfl-fme-perf.o
> >  dfl-afu-objs := dfl-afu-main.o dfl-afu-region.o dfl-afu-dma-region.o
> >  dfl-afu-objs += dfl-afu-error.o
> >
> > diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> > index 1986b32..221f4ec 100644
> > --- a/drivers/fpga/dfl-fme-main.c
> > +++ b/drivers/fpga/dfl-fme-main.c
> > @@ -690,6 +690,10 @@ static void fme_power_mgmt_uinit(struct platform_device *pdev,
> >                 .ops = &fme_global_err_ops,
> >         },
> >         {
> > +               .id_table = fme_perf_id_table,
> > +               .ops = &fme_perf_ops,
> > +       },
> > +       {
> >                 .ops = NULL,
> >         },
> >  };
> > diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
> > new file mode 100644
> > index 0000000..035bb68
> > --- /dev/null
> > +++ b/drivers/fpga/dfl-fme-perf.c
> > @@ -0,0 +1,950 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Driver for FPGA Management Engine (FME) Global Performance Reporting
> > + *
> > + * Copyright 2019 Intel Corporation, Inc.
> > + *
> > + * Authors:
> > + *   Kang Luwei <luwei.kang@intel.com>
> > + *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
> > + *   Wu Hao <hao.wu@intel.com>
> > + *   Joseph Grecco <joe.grecco@intel.com>
> > + *   Enno Luebbers <enno.luebbers@intel.com>
> > + *   Tim Whisonant <tim.whisonant@intel.com>
> > + *   Ananda Ravuri <ananda.ravuri@intel.com>
> > + *   Mitchel, Henry <henry.mitchel@intel.com>
> > + */
> > +
> > +#include "dfl.h"
> > +#include "dfl-fme.h"
> > +
> > +/*
> > + * Performance Counter Registers for Cache.
> > + *
> > + * Cache Events are listed below as CACHE_EVNT_*.
> > + */
> > +#define CACHE_CTRL                     0x8
> > +#define CACHE_RESET_CNTR               BIT_ULL(0)
> > +#define CACHE_FREEZE_CNTR              BIT_ULL(8)
> > +#define CACHE_CTRL_EVNT                        GENMASK_ULL(19, 16)
> > +#define CACHE_EVNT_RD_HIT              0x0
> > +#define CACHE_EVNT_WR_HIT              0x1
> > +#define CACHE_EVNT_RD_MISS             0x2
> > +#define CACHE_EVNT_WR_MISS             0x3
> > +#define CACHE_EVNT_RSVD                        0x4
> > +#define CACHE_EVNT_HOLD_REQ            0x5
> > +#define CACHE_EVNT_DATA_WR_PORT_CONTEN 0x6
> > +#define CACHE_EVNT_TAG_WR_PORT_CONTEN  0x7
> > +#define CACHE_EVNT_TX_REQ_STALL                0x8
> > +#define CACHE_EVNT_RX_REQ_STALL                0x9
> > +#define CACHE_EVNT_EVICTIONS           0xa
> > +#define CACHE_EVNT_MAX                 CACHE_EVNT_EVICTIONS
> > +#define CACHE_CHANNEL_SEL              BIT_ULL(20)
> > +#define CACHE_CHANNEL_RD               0
> > +#define CACHE_CHANNEL_WR               1
> > +#define CACHE_CHANNEL_MAX              2
> > +#define CACHE_CNTR0                    0x10
> > +#define CACHE_CNTR1                    0x18
> > +#define CACHE_CNTR_EVNT_CNTR           GENMASK_ULL(47, 0)
> > +#define CACHE_CNTR_EVNT                        GENMASK_ULL(63, 60)
> > +
> > +/*
> > + * Performance Counter Registers for Fabric.
> > + *
> > + * Fabric Events are listed below as FAB_EVNT_*
> > + */
> > +#define FAB_CTRL                       0x20
> > +#define FAB_RESET_CNTR                 BIT_ULL(0)
> > +#define FAB_FREEZE_CNTR                        BIT_ULL(8)
> > +#define FAB_CTRL_EVNT                  GENMASK_ULL(19, 16)
> > +#define FAB_EVNT_PCIE0_RD              0x0
> > +#define FAB_EVNT_PCIE0_WR              0x1
> > +#define FAB_EVNT_PCIE1_RD              0x2
> > +#define FAB_EVNT_PCIE1_WR              0x3
> > +#define FAB_EVNT_UPI_RD                        0x4
> > +#define FAB_EVNT_UPI_WR                        0x5
> > +#define FAB_EVNT_MMIO_RD               0x6
> > +#define FAB_EVNT_MMIO_WR               0x7
> > +#define FAB_EVNT_MAX                   FAB_EVNT_MMIO_WR
> > +#define FAB_PORT_ID                    GENMASK_ULL(21, 20)
> > +#define FAB_PORT_FILTER                        BIT_ULL(23)
> > +#define FAB_PORT_FILTER_DISABLE                0
> > +#define FAB_PORT_FILTER_ENABLE         1
> > +#define FAB_CNTR                       0x28
> > +#define FAB_CNTR_EVNT_CNTR             GENMASK_ULL(59, 0)
> > +#define FAB_CNTR_EVNT                  GENMASK_ULL(63, 60)
> > +
> > +/*
> > + * Performance Counter Registers for Clock.
> > + *
> > + * Clock Counter can't be reset or frozen by SW.
> > + */
> > +#define CLK_CNTR                       0x30
> > +
> > +/*
> > + * Performance Counter Registers for IOMMU / VT-D.
> > + *
> > + * VT-D Events are listed below as VTD_EVNT_* and VTD_SIP_EVNT_*
> > + */
> > +#define VTD_CTRL                       0x38
> > +#define VTD_RESET_CNTR                 BIT_ULL(0)
> > +#define VTD_FREEZE_CNTR                        BIT_ULL(8)
> > +#define VTD_CTRL_EVNT                  GENMASK_ULL(19, 16)
> > +#define VTD_EVNT_AFU_MEM_RD_TRANS      0x0
> > +#define VTD_EVNT_AFU_MEM_WR_TRANS      0x1
> > +#define VTD_EVNT_AFU_DEVTLB_RD_HIT     0x2
> > +#define VTD_EVNT_AFU_DEVTLB_WR_HIT     0x3
> > +#define VTD_EVNT_DEVTLB_4K_FILL                0x4
> > +#define VTD_EVNT_DEVTLB_2M_FILL                0x5
> > +#define VTD_EVNT_DEVTLB_1G_FILL                0x6
> > +#define VTD_EVNT_MAX                   VTD_EVNT_DEVTLB_1G_FILL
> > +#define VTD_CNTR                       0x40
> > +#define VTD_CNTR_EVNT                  GENMASK_ULL(63, 60)
> > +#define VTD_CNTR_EVNT_CNTR             GENMASK_ULL(47, 0)
> > +#define VTD_SIP_CTRL                   0x48
> > +#define VTD_SIP_RESET_CNTR             BIT_ULL(0)
> > +#define VTD_SIP_FREEZE_CNTR            BIT_ULL(8)
> > +#define VTD_SIP_CTRL_EVNT              GENMASK_ULL(19, 16)
> > +#define VTD_SIP_EVNT_IOTLB_4K_HIT      0x0
> > +#define VTD_SIP_EVNT_IOTLB_2M_HIT      0x1
> > +#define VTD_SIP_EVNT_IOTLB_1G_HIT      0x2
> > +#define VTD_SIP_EVNT_SLPWC_L3_HIT      0x3
> > +#define VTD_SIP_EVNT_SLPWC_L4_HIT      0x4
> > +#define VTD_SIP_EVNT_RCC_HIT           0x5
> > +#define VTD_SIP_EVNT_IOTLB_4K_MISS     0x6
> > +#define VTD_SIP_EVNT_IOTLB_2M_MISS     0x7
> > +#define VTD_SIP_EVNT_IOTLB_1G_MISS     0x8
> > +#define VTD_SIP_EVNT_SLPWC_L3_MISS     0x9
> > +#define VTD_SIP_EVNT_SLPWC_L4_MISS     0xa
> > +#define VTD_SIP_EVNT_RCC_MISS          0xb
> > +#define VTD_SIP_EVNT_MAX               VTD_SIP_EVNT_RCC_MISS
> > +#define VTD_SIP_CNTR                   0X50
> > +#define VTD_SIP_CNTR_EVNT              GENMASK_ULL(63, 60)
> > +#define VTD_SIP_CNTR_EVNT_CNTR         GENMASK_ULL(47, 0)
> > +
> > +#define PERF_OBJ_ROOT_ID               (~0)
> > +
> > +#define PERF_TIMEOUT                   30
> > +
> > +/**
> > + * struct perf_object - object of performance counter
> > + *
> > + * @id: instance id. PERF_OBJ_ROOT_ID indicates it is a parent object which
> > + *      counts performance counters for all instances.
> > + * @attr_groups: the sysfs files are associated with this object.
> > + * @feature: pointer to related private feature.
> > + * @node: used to link itself to parent's children list.
> > + * @children: used to link its children objects together.
> > + * @kobj: generic kobject interface.
> > + *
> > + * 'node' and 'children' are used to construct parent-children hierarchy.
> > + */
> > +struct perf_object {
> > +       int id;
> > +       const struct attribute_group **attr_groups;
> > +       struct dfl_feature *feature;
> > +
> > +       struct list_head node;
> > +       struct list_head children;
> > +       struct kobject kobj;
> > +};
> > +
> > +/**
> > + * struct perf_obj_attribute - attribute of perf object
> > + *
> > + * @attr: attribute of this perf object.
> > + * @show: show callback for sysfs attribute.
> > + * @store: store callback for sysfs attribute.
> > + */
> > +struct perf_obj_attribute {
> > +       struct attribute attr;
> > +       ssize_t (*show)(struct perf_object *pobj, char *buf);
> > +       ssize_t (*store)(struct perf_object *pobj,
> > +                        const char *buf, size_t n);
> > +};
> > +
> > +#define to_perf_obj_attr(_attr)                                        \
> > +               container_of(_attr, struct perf_obj_attribute, attr)
> > +#define to_perf_obj(_kobj)                                     \
> > +               container_of(_kobj, struct perf_object, kobj)
> > +
> > +#define PERF_OBJ_ATTR(_name, _filename, _mode, _show, _store)  \
> > +struct perf_obj_attribute perf_obj_attr_##_name =              \
> > +       __ATTR(_filename, _mode, _show, _store)
> 
> This #define and the ones below set up an interdependency with sysfs.h
> that I'm scratching my head about.  You're defining your own type of
> attribute struct which is fine, but counting on __ATTR to continue to
> work with it in the future.  It wouldn't be much of a change here to
> define your own macro here (which is the same as __ATTR) and then use
> it for your PERF_OBJ_ATTR_RW, etc.  Maybe I'm being overcautious, but
> it's a small change.

Sure, I can change it in the next version.

> 
> > +
> > +#define PERF_OBJ_ATTR_RW(_name)                                        \
> > +       struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_RW(_name)
> > +#define PERF_OBJ_ATTR_RO(_name)                                        \
> > +       struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_RO(_name)
> > +#define PERF_OBJ_ATTR_WO(_name)                                        \
> > +       struct perf_obj_attribute perf_obj_attr_##_name = __ATTR_WO(_name)
> > +
> > +static ssize_t perf_obj_attr_show(struct kobject *kobj,
> > +                                 struct attribute *__attr, char *buf)
> > +{
> > +       struct perf_obj_attribute *attr = to_perf_obj_attr(__attr);
> > +       struct perf_object *pobj = to_perf_obj(kobj);
> > +       ssize_t ret = -EIO;
> 
> Would this be -EPERM?

Actually i use the same error code as other code in drivers/base/core.c.

> 
> > +
> > +       if (attr->show)
> > +               ret = attr->show(pobj, buf);
> 
> Actually is it even possible for !attr->show if this were a WO attribute?

It's possible, but i think if this is a WO attribute, you may get -EPERM
directly when trying to open it for Read, and this show function should
not be invoked at all.

> 
> > +       return ret;
> > +}
> > +
> > +static ssize_t perf_obj_attr_store(struct kobject *kobj,
> > +                                  struct attribute *__attr,
> > +                                  const char *buf, size_t n)
> > +{
> > +       struct perf_obj_attribute *attr = to_perf_obj_attr(__attr);
> > +       struct perf_object *pobj = to_perf_obj(kobj);
> > +       ssize_t ret = -EIO;
> 
> Same here
> 
> > +
> > +       if (attr->store)
> > +               ret = attr->store(pobj, buf, n);
> > +       return ret;
> > +}
> > +
> > +static const struct sysfs_ops perf_obj_sysfs_ops = {
> > +       .show = perf_obj_attr_show,
> > +       .store = perf_obj_attr_store,
> > +};
> > +
> > +static void perf_obj_release(struct kobject *kobj)
> > +{
> > +       kfree(to_perf_obj(kobj));
> > +}
> > +
> > +static struct kobj_type perf_obj_ktype = {
> > +       .sysfs_ops = &perf_obj_sysfs_ops,
> > +       .release = perf_obj_release,
> > +};
> > +
> > +static struct perf_object *
> > +create_perf_obj(struct dfl_feature *feature, struct kobject *parent, int id,
> > +               const struct attribute_group **groups, const char *name)
> > +{
> > +       struct perf_object *pobj;
> > +       int ret;
> > +
> > +       pobj = kzalloc(sizeof(*pobj), GFP_KERNEL);
> > +       if (!pobj)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       pobj->id = id;
> > +       pobj->feature = feature;
> > +       pobj->attr_groups = groups;
> > +       INIT_LIST_HEAD(&pobj->node);
> > +       INIT_LIST_HEAD(&pobj->children);
> > +
> > +       if (id != PERF_OBJ_ROOT_ID)
> > +               ret = kobject_init_and_add(&pobj->kobj, &perf_obj_ktype,
> > +                                          parent, "%s%d", name, id);
> > +       else
> > +               ret = kobject_init_and_add(&pobj->kobj, &perf_obj_ktype,
> > +                                          parent, "%s", name);
> > +       if (ret)
> > +               goto put_exit;
> > +
> > +       if (pobj->attr_groups) {
> > +               ret = sysfs_create_groups(&pobj->kobj, pobj->attr_groups);
> > +               if (ret)
> > +                       goto del_exit;
> > +       }
> > +
> > +       return pobj;
> > +
> > +del_exit:
> > +       kobject_del(&pobj->kobj);
> 
> kobject_put will delete and clean up, you won't need kobject_del.

Will fix this, kobject_put should be enough.

> 
> > +put_exit:
> > +       kobject_put(&pobj->kobj);
> > +       return ERR_PTR(ret);
> > +}
> > +
> > +/*
> > + * Counter Sysfs Interface for Clock.
> > + */
> > +static ssize_t clock_show(struct perf_object *pobj, char *buf)
> > +{
> > +       void __iomem *base = pobj->feature->ioaddr;
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n",
> > +                        (unsigned long long)readq(base + CLK_CNTR));
> 
> It's fine to use sprintf, as mentioned recently on one of the other patches.

Sure, will fix this.

> 
> > +}
> > +static PERF_OBJ_ATTR_RO(clock);
> > +
> > +static struct attribute *clock_attrs[] = {
> > +       &perf_obj_attr_clock.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group clock_attr_group = {
> > +       .attrs = clock_attrs,
> > +};
> > +
> > +static const struct attribute_group *perf_dev_attr_groups[] = {
> > +       &clock_attr_group,
> > +       NULL,
> > +};
> > +
> > +static void destroy_perf_obj(struct perf_object *pobj)
> > +{
> > +       struct perf_object *obj, *obj_tmp;
> > +
> > +       list_for_each_entry_safe(obj, obj_tmp, &pobj->children, node)
> > +               destroy_perf_obj(obj);
> > +
> > +       list_del(&pobj->node);
> > +       if (pobj->attr_groups)
> > +               sysfs_remove_groups(&pobj->kobj, pobj->attr_groups);
> 
> The attributes should be removed before anything else goes away.

Sure.

> 
> > +       kobject_put(&pobj->kobj);
> > +}
> > +
> > +static struct perf_object *create_perf_dev(struct dfl_feature *feature)
> > +{
> > +       struct platform_device *pdev = feature->pdev;
> > +
> > +       return create_perf_obj(feature, &pdev->dev.kobj, PERF_OBJ_ROOT_ID,
> > +                              perf_dev_attr_groups, "perf");
> > +}
> > +
> > +/*
> > + * Counter Sysfs Interfaces for Cache.
> > + */
> > +static ssize_t cache_freeze_show(struct perf_object *pobj, char *buf)
> > +{
> > +       void __iomem *base = pobj->feature->ioaddr;
> > +       u64 v;
> > +
> > +       v = readq(base + CACHE_CTRL);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(CACHE_FREEZE_CNTR, v));
> > +}
> > +
> > +static ssize_t cache_freeze_store(struct perf_object *pobj,
> > +                                 const char *buf, size_t n)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       bool state;
> > +       u64 v;
> > +
> > +       if (strtobool(buf, &state))
> > +               return -EINVAL;
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       v = readq(base + CACHE_CTRL);
> > +       v &= ~CACHE_FREEZE_CNTR;
> > +       v |= FIELD_PREP(CACHE_FREEZE_CNTR, state ? 1 : 0);
> > +       writeq(v, base + CACHE_CTRL);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return n;
> > +}
> > +static PERF_OBJ_ATTR(cache_freeze, freeze, 0644,
> > +                    cache_freeze_show, cache_freeze_store);
> > +
> > +static ssize_t read_cache_counter(struct perf_object *pobj, char *buf,
> > +                                 u8 channel, u8 event)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       u64 v, count;
> > +
> > +       if (event > CACHE_EVNT_MAX || channel > CACHE_CHANNEL_MAX)
> > +               return -EINVAL;
> 
> This would only happen if there was a coding error using one of the
> macros below, right?

Yes. So WARN should be better instead -EINVAL. Let me fix this.

> 
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       /* set channel access type and cache event code. */
> > +       v = readq(base + CACHE_CTRL);
> > +       v &= ~(CACHE_CHANNEL_SEL | CACHE_CTRL_EVNT);
> > +       v |= FIELD_PREP(CACHE_CHANNEL_SEL, channel);
> > +       v |= FIELD_PREP(CACHE_CTRL_EVNT, event);
> > +       writeq(v, base + CACHE_CTRL);
> > +
> > +       if (readq_poll_timeout(base + CACHE_CNTR0, v,
> > +                              FIELD_GET(CACHE_CNTR_EVNT, v) == event,
> > +                              1, PERF_TIMEOUT)) {
> > +               dev_err(&feature->pdev->dev, "timeout, unmatched cache event type in counter registers.\n");
> > +               mutex_unlock(&pdata->lock);
> > +               return -ETIMEDOUT;
> > +       }
> > +
> > +       v = readq(base + CACHE_CNTR0);
> > +       count = FIELD_GET(CACHE_CNTR_EVNT_CNTR, v);
> > +       v = readq(base + CACHE_CNTR1);
> > +       count += FIELD_GET(CACHE_CNTR_EVNT_CNTR, v);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> > +}
> > +
> > +#define CACHE_SHOW(name, type, event)                                  \
> > +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> > +{                                                                      \
> > +       return read_cache_counter(pobj, buf, type, event);              \
> > +}                                                                      \
> > +static PERF_OBJ_ATTR_RO(name)
> > +
> > +CACHE_SHOW(read_hit, CACHE_CHANNEL_RD, CACHE_EVNT_RD_HIT);
> > +CACHE_SHOW(read_miss, CACHE_CHANNEL_RD, CACHE_EVNT_RD_MISS);
> > +CACHE_SHOW(write_hit, CACHE_CHANNEL_WR, CACHE_EVNT_WR_HIT);
> > +CACHE_SHOW(write_miss, CACHE_CHANNEL_WR, CACHE_EVNT_WR_MISS);
> > +CACHE_SHOW(hold_request, CACHE_CHANNEL_RD, CACHE_EVNT_HOLD_REQ);
> > +CACHE_SHOW(tx_req_stall, CACHE_CHANNEL_RD, CACHE_EVNT_TX_REQ_STALL);
> > +CACHE_SHOW(rx_req_stall, CACHE_CHANNEL_RD, CACHE_EVNT_RX_REQ_STALL);
> > +CACHE_SHOW(rx_eviction, CACHE_CHANNEL_RD, CACHE_EVNT_EVICTIONS);
> > +CACHE_SHOW(data_write_port_contention, CACHE_CHANNEL_WR,
> > +          CACHE_EVNT_DATA_WR_PORT_CONTEN);
> > +CACHE_SHOW(tag_write_port_contention, CACHE_CHANNEL_WR,
> > +          CACHE_EVNT_TAG_WR_PORT_CONTEN);
> > +
> > +static struct attribute *cache_attrs[] = {
> > +       &perf_obj_attr_read_hit.attr,
> > +       &perf_obj_attr_read_miss.attr,
> > +       &perf_obj_attr_write_hit.attr,
> > +       &perf_obj_attr_write_miss.attr,
> > +       &perf_obj_attr_hold_request.attr,
> > +       &perf_obj_attr_data_write_port_contention.attr,
> > +       &perf_obj_attr_tag_write_port_contention.attr,
> > +       &perf_obj_attr_tx_req_stall.attr,
> > +       &perf_obj_attr_rx_req_stall.attr,
> > +       &perf_obj_attr_rx_eviction.attr,
> > +       &perf_obj_attr_cache_freeze.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group cache_attr_group = {
> > +       .attrs = cache_attrs,
> > +};
> > +
> > +static const struct attribute_group *cache_attr_groups[] = {
> > +       &cache_attr_group,
> > +       NULL,
> > +};
> > +
> > +static int create_perf_cache_obj(struct perf_object *perf_dev)
> > +{
> > +       struct perf_object *pobj;
> > +
> > +       pobj = create_perf_obj(perf_dev->feature, &perf_dev->kobj,
> > +                              PERF_OBJ_ROOT_ID, cache_attr_groups, "cache");
> > +       if (IS_ERR(pobj))
> > +               return PTR_ERR(pobj);
> > +
> > +       list_add(&pobj->node, &perf_dev->children);
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * Counter Sysfs Interfaces for VT-D / IOMMU.
> > + */
> > +static ssize_t vtd_freeze_show(struct perf_object *pobj, char *buf)
> > +{
> > +       void __iomem *base = pobj->feature->ioaddr;
> > +       u64 v;
> > +
> > +       v = readq(base + VTD_CTRL);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(VTD_FREEZE_CNTR, v));
> > +}
> > +
> > +static ssize_t vtd_freeze_store(struct perf_object *pobj,
> > +                               const char *buf, size_t n)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       bool state;
> > +       u64 v;
> > +
> > +       if (strtobool(buf, &state))
> > +               return -EINVAL;
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       v = readq(base + VTD_CTRL);
> > +       v &= ~VTD_FREEZE_CNTR;
> > +       v |= FIELD_PREP(VTD_FREEZE_CNTR, state ? 1 : 0);
> > +       writeq(v, base + VTD_CTRL);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return n;
> > +}
> > +static PERF_OBJ_ATTR(vtd_freeze, freeze, 0644,
> > +                    vtd_freeze_show, vtd_freeze_store);
> > +
> > +static struct attribute *iommu_top_attrs[] = {
> > +       &perf_obj_attr_vtd_freeze.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group iommu_top_attr_group = {
> > +       .attrs = iommu_top_attrs,
> > +};
> > +
> > +static ssize_t read_iommu_sip_counter(struct perf_object *pobj,
> > +                                     u8 event, char *buf)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       u64 v, count;
> > +
> > +       if (event > VTD_SIP_EVNT_MAX)
> > +               return -EINVAL;
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       v = readq(base + VTD_SIP_CTRL);
> > +       v &= ~VTD_SIP_CTRL_EVNT;
> > +       v |= FIELD_PREP(VTD_SIP_CTRL_EVNT, event);
> > +       writeq(v, base + VTD_SIP_CTRL);
> > +
> > +       if (readq_poll_timeout(base + VTD_SIP_CNTR, v,
> > +                              FIELD_GET(VTD_SIP_CNTR_EVNT, v) == event,
> > +                              1, PERF_TIMEOUT)) {
> > +               dev_err(&feature->pdev->dev, "timeout, unmatched VTd SIP event type in counter registers\n");
> > +               mutex_unlock(&pdata->lock);
> > +               return -ETIMEDOUT;
> > +       }
> > +
> > +       v = readq(base + VTD_SIP_CNTR);
> > +       count = FIELD_GET(VTD_SIP_CNTR_EVNT_CNTR, v);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> > +}
> > +
> > +#define VTD_SIP_SHOW(name, event)                                      \
> > +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> > +{                                                                      \
> > +       return read_iommu_sip_counter(pobj, event, buf);                \
> > +}                                                                      \
> > +static PERF_OBJ_ATTR_RO(name)
> > +
> > +VTD_SIP_SHOW(iotlb_4k_hit, VTD_SIP_EVNT_IOTLB_4K_HIT);
> > +VTD_SIP_SHOW(iotlb_2m_hit, VTD_SIP_EVNT_IOTLB_2M_HIT);
> > +VTD_SIP_SHOW(iotlb_1g_hit, VTD_SIP_EVNT_IOTLB_1G_HIT);
> > +VTD_SIP_SHOW(slpwc_l3_hit, VTD_SIP_EVNT_SLPWC_L3_HIT);
> > +VTD_SIP_SHOW(slpwc_l4_hit, VTD_SIP_EVNT_SLPWC_L4_HIT);
> > +VTD_SIP_SHOW(rcc_hit, VTD_SIP_EVNT_RCC_HIT);
> > +VTD_SIP_SHOW(iotlb_4k_miss, VTD_SIP_EVNT_IOTLB_4K_MISS);
> > +VTD_SIP_SHOW(iotlb_2m_miss, VTD_SIP_EVNT_IOTLB_2M_MISS);
> > +VTD_SIP_SHOW(iotlb_1g_miss, VTD_SIP_EVNT_IOTLB_1G_MISS);
> > +VTD_SIP_SHOW(slpwc_l3_miss, VTD_SIP_EVNT_SLPWC_L3_MISS);
> > +VTD_SIP_SHOW(slpwc_l4_miss, VTD_SIP_EVNT_SLPWC_L4_MISS);
> > +VTD_SIP_SHOW(rcc_miss, VTD_SIP_EVNT_RCC_MISS);
> > +
> > +static struct attribute *iommu_sip_attrs[] = {
> > +       &perf_obj_attr_iotlb_4k_hit.attr,
> > +       &perf_obj_attr_iotlb_2m_hit.attr,
> > +       &perf_obj_attr_iotlb_1g_hit.attr,
> > +       &perf_obj_attr_slpwc_l3_hit.attr,
> > +       &perf_obj_attr_slpwc_l4_hit.attr,
> > +       &perf_obj_attr_rcc_hit.attr,
> > +       &perf_obj_attr_iotlb_4k_miss.attr,
> > +       &perf_obj_attr_iotlb_2m_miss.attr,
> > +       &perf_obj_attr_iotlb_1g_miss.attr,
> > +       &perf_obj_attr_slpwc_l3_miss.attr,
> > +       &perf_obj_attr_slpwc_l4_miss.attr,
> > +       &perf_obj_attr_rcc_miss.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group iommu_sip_attr_group = {
> > +       .attrs = iommu_sip_attrs,
> > +};
> > +
> > +static const struct attribute_group *iommu_top_attr_groups[] = {
> > +       &iommu_top_attr_group,
> > +       &iommu_sip_attr_group,
> > +       NULL,
> > +};
> > +
> > +static ssize_t read_iommu_counter(struct perf_object *pobj, u8 event, char *buf)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       u64 v, count;
> > +
> > +       if (event > VTD_EVNT_MAX)
> > +               return -EINVAL;
> > +
> > +       event += pobj->id;
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       v = readq(base + VTD_CTRL);
> > +       v &= ~VTD_CTRL_EVNT;
> > +       v |= FIELD_PREP(VTD_CTRL_EVNT, event);
> > +       writeq(v, base + VTD_CTRL);
> > +
> > +       if (readq_poll_timeout(base + VTD_CNTR, v,
> > +                              FIELD_GET(VTD_CNTR_EVNT, v) == event, 1,
> > +                              PERF_TIMEOUT)) {
> > +               dev_err(&feature->pdev->dev, "timeout, unmatched VTd event type in counter registers\n");
> > +               mutex_unlock(&pdata->lock);
> > +               return -ETIMEDOUT;
> > +       }
> > +
> > +       v = readq(base + VTD_CNTR);
> > +       count = FIELD_GET(VTD_CNTR_EVNT_CNTR, v);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> > +}
> > +
> > +#define VTD_SHOW(name, base_event)                                     \
> > +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> > +{                                                                      \
> > +       return read_iommu_counter(pobj, base_event, buf);               \
> > +}                                                                      \
> > +static PERF_OBJ_ATTR_RO(name)
> > +
> > +VTD_SHOW(read_transaction, VTD_EVNT_AFU_MEM_RD_TRANS);
> > +VTD_SHOW(write_transaction, VTD_EVNT_AFU_MEM_WR_TRANS);
> > +VTD_SHOW(devtlb_read_hit, VTD_EVNT_AFU_DEVTLB_RD_HIT);
> > +VTD_SHOW(devtlb_write_hit, VTD_EVNT_AFU_DEVTLB_WR_HIT);
> > +VTD_SHOW(devtlb_4k_fill, VTD_EVNT_DEVTLB_4K_FILL);
> > +VTD_SHOW(devtlb_2m_fill, VTD_EVNT_DEVTLB_2M_FILL);
> > +VTD_SHOW(devtlb_1g_fill, VTD_EVNT_DEVTLB_1G_FILL);
> > +
> > +static struct attribute *iommu_attrs[] = {
> > +       &perf_obj_attr_read_transaction.attr,
> > +       &perf_obj_attr_write_transaction.attr,
> > +       &perf_obj_attr_devtlb_read_hit.attr,
> > +       &perf_obj_attr_devtlb_write_hit.attr,
> > +       &perf_obj_attr_devtlb_4k_fill.attr,
> > +       &perf_obj_attr_devtlb_2m_fill.attr,
> > +       &perf_obj_attr_devtlb_1g_fill.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group iommu_attr_group = {
> > +       .attrs = iommu_attrs,
> > +};
> > +
> > +static const struct attribute_group *iommu_attr_groups[] = {
> > +       &iommu_attr_group,
> > +       NULL,
> > +};
> > +
> > +#define PERF_MAX_PORT_NUM      1
> > +
> > +static int create_perf_iommu_obj(struct perf_object *perf_dev)
> > +{
> > +       struct dfl_feature *feature = perf_dev->feature;
> > +       struct device *dev = &feature->pdev->dev;
> > +       struct perf_object *pobj, *obj;
> > +       void __iomem *base;
> > +       u64 v;
> > +       int i;
> > +
> > +       /* check if iommu is not supported on this device. */
> > +       base = dfl_get_feature_ioaddr_by_id(dev, FME_FEATURE_ID_HEADER);
> > +       v = readq(base + FME_HDR_CAP);
> > +       if (!FIELD_GET(FME_CAP_IOMMU_AVL, v))
> > +               return 0;
> > +
> > +       pobj = create_perf_obj(feature, &perf_dev->kobj, PERF_OBJ_ROOT_ID,
> > +                              iommu_top_attr_groups, "iommu");
> > +       if (IS_ERR(pobj))
> > +               return PTR_ERR(pobj);
> > +
> > +       list_add(&pobj->node, &perf_dev->children);
> > +
> > +       for (i = 0; i < PERF_MAX_PORT_NUM; i++) {
> > +               obj = create_perf_obj(feature, &pobj->kobj, i,
> > +                                     iommu_attr_groups, "afu");
> > +               if (IS_ERR(obj))
> > +                       return PTR_ERR(obj);
> > +
> > +               list_add(&obj->node, &pobj->children);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * Counter Sysfs Interfaces for Fabric
> > + */
> > +static bool fabric_pobj_is_enabled(struct perf_object *pobj)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       void __iomem *base = feature->ioaddr;
> > +       u64 v;
> > +
> > +       v = readq(base + FAB_CTRL);
> > +
> > +       if (FIELD_GET(FAB_PORT_FILTER, v) == FAB_PORT_FILTER_DISABLE)
> > +               return pobj->id == PERF_OBJ_ROOT_ID;
> > +
> > +       return pobj->id == FIELD_GET(FAB_PORT_ID, v);
> > +}
> > +
> > +static ssize_t read_fabric_counter(struct perf_object *pobj,
> > +                                  u8 event, char *buf)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       u64 v, count = 0;
> > +
> > +       if (event > FAB_EVNT_MAX)
> > +               return -EINVAL;
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       /* if it is disabled, force the counter to return zero. */
> > +       if (!fabric_pobj_is_enabled(pobj))
> > +               goto exit;
> > +
> > +       v = readq(base + FAB_CTRL);
> > +       v &= ~FAB_CTRL_EVNT;
> > +       v |= FIELD_PREP(FAB_CTRL_EVNT, event);
> > +       writeq(v, base + FAB_CTRL);
> > +
> > +       if (readq_poll_timeout(base + FAB_CNTR, v,
> > +                              FIELD_GET(FAB_CNTR_EVNT, v) == event,
> > +                              1, PERF_TIMEOUT)) {
> > +               dev_err(&feature->pdev->dev, "timeout, unmatched fab event type in counter registers.\n");
> > +               mutex_unlock(&pdata->lock);
> > +               return -ETIMEDOUT;
> > +       }
> > +
> > +       v = readq(base + FAB_CNTR);
> > +       count = FIELD_GET(FAB_CNTR_EVNT_CNTR, v);
> > +exit:
> > +       mutex_unlock(&pdata->lock);
> > +       return scnprintf(buf, PAGE_SIZE, "0x%llx\n", (unsigned long long)count);
> > +}
> > +
> > +#define FAB_SHOW(name, event)                                          \
> > +static ssize_t name##_show(struct perf_object *pobj, char *buf)                \
> > +{                                                                      \
> > +       return read_fabric_counter(pobj, event, buf);                   \
> > +}                                                                      \
> > +static PERF_OBJ_ATTR_RO(name)
> > +
> > +FAB_SHOW(pcie0_read, FAB_EVNT_PCIE0_RD);
> > +FAB_SHOW(pcie0_write, FAB_EVNT_PCIE0_WR);
> > +FAB_SHOW(pcie1_read, FAB_EVNT_PCIE1_RD);
> > +FAB_SHOW(pcie1_write, FAB_EVNT_PCIE1_WR);
> > +FAB_SHOW(upi_read, FAB_EVNT_UPI_RD);
> > +FAB_SHOW(upi_write, FAB_EVNT_UPI_WR);
> > +FAB_SHOW(mmio_read, FAB_EVNT_MMIO_RD);
> > +FAB_SHOW(mmio_write, FAB_EVNT_MMIO_WR);
> > +
> > +static ssize_t fab_enable_show(struct perf_object *pobj, char *buf)
> > +{
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)!!fabric_pobj_is_enabled(pobj));
> > +}
> > +
> > +/*
> > + * If enable one port or all port event counter in fabric, other
> > + * fabric event counter originally enabled will be disable automatically.
> > + */
> > +static ssize_t fab_enable_store(struct perf_object *pobj,
> > +                               const char *buf, size_t n)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       bool state;
> > +       u64 v;
> > +
> > +       if (strtobool(buf, &state) || !state)
> > +               return -EINVAL;
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       /* if it is already enabled. */
> > +       if (fabric_pobj_is_enabled(pobj))
> > +               return n;
> > +
> > +       mutex_lock(&pdata->lock);
> > +       v = readq(base + FAB_CTRL);
> > +       v &= ~(FAB_PORT_FILTER | FAB_PORT_ID);
> > +
> > +       if (pobj->id == PERF_OBJ_ROOT_ID) {
> > +               v |= FIELD_PREP(FAB_PORT_FILTER, FAB_PORT_FILTER_DISABLE);
> > +       } else {
> > +               v |= FIELD_PREP(FAB_PORT_FILTER, FAB_PORT_FILTER_ENABLE);
> > +               v |= FIELD_PREP(FAB_PORT_ID, pobj->id);
> > +       }
> > +       writeq(v, base + FAB_CTRL);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return n;
> > +}
> > +static PERF_OBJ_ATTR(fab_enable, enable, 0644,
> > +                    fab_enable_show, fab_enable_store);
> > +
> > +static struct attribute *fabric_attrs[] = {
> > +       &perf_obj_attr_pcie0_read.attr,
> > +       &perf_obj_attr_pcie0_write.attr,
> > +       &perf_obj_attr_pcie1_read.attr,
> > +       &perf_obj_attr_pcie1_write.attr,
> > +       &perf_obj_attr_upi_read.attr,
> > +       &perf_obj_attr_upi_write.attr,
> > +       &perf_obj_attr_mmio_read.attr,
> > +       &perf_obj_attr_mmio_write.attr,
> > +       &perf_obj_attr_fab_enable.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group fabric_attr_group = {
> > +       .attrs = fabric_attrs,
> > +};
> > +
> > +static const struct attribute_group *fabric_attr_groups[] = {
> > +       &fabric_attr_group,
> > +       NULL,
> > +};
> > +
> > +static ssize_t fab_freeze_show(struct perf_object *pobj, char *buf)
> > +{
> > +       void __iomem *base = pobj->feature->ioaddr;
> > +       u64 v;
> > +
> > +       v = readq(base + FAB_CTRL);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n",
> > +                        (unsigned int)FIELD_GET(FAB_FREEZE_CNTR, v));
> > +}
> > +
> > +static ssize_t fab_freeze_store(struct perf_object *pobj,
> > +                               const char *buf, size_t n)
> > +{
> > +       struct dfl_feature *feature = pobj->feature;
> > +       struct dfl_feature_platform_data *pdata;
> > +       void __iomem *base = feature->ioaddr;
> > +       bool state;
> > +       u64 v;
> > +
> > +       if (strtobool(buf, &state))
> > +               return -EINVAL;
> > +
> > +       pdata = dev_get_platdata(&feature->pdev->dev);
> > +
> > +       mutex_lock(&pdata->lock);
> > +       v = readq(base + FAB_CTRL);
> > +       v &= ~FAB_FREEZE_CNTR;
> > +       v |= FIELD_PREP(FAB_FREEZE_CNTR, state ? 1 : 0);
> > +       writeq(v, base + FAB_CTRL);
> > +       mutex_unlock(&pdata->lock);
> > +
> > +       return n;
> > +}
> > +static PERF_OBJ_ATTR(fab_freeze, freeze, 0644,
> > +                    fab_freeze_show, fab_freeze_store);
> 
> PERF_OBJ_ATTR_RW ?  Also in a few other places, wherever '0644' shows up.

PERF_OBJ_ATTR is used as it can define its own file name.
Let me see if we can improve this in the next version.

Thanks for the review!

Hao

> 
> 
> > +
> > +static struct attribute *fabric_top_attrs[] = {
> > +       &perf_obj_attr_fab_freeze.attr,
> > +       NULL,
> > +};
> > +
> > +static struct attribute_group fabric_top_attr_group = {
> > +       .attrs = fabric_top_attrs,
> > +};
> > +
> > +static const struct attribute_group *fabric_top_attr_groups[] = {
> > +       &fabric_attr_group,
> > +       &fabric_top_attr_group,
> > +       NULL,
> > +};
> > +
> > +static int create_perf_fabric_obj(struct perf_object *perf_dev)
> > +{
> > +       struct perf_object *pobj, *obj;
> > +       int i;
> > +
> > +       pobj = create_perf_obj(perf_dev->feature, &perf_dev->kobj,
> > +                              PERF_OBJ_ROOT_ID, fabric_top_attr_groups,
> > +                              "fabric");
> > +       if (IS_ERR(pobj))
> > +               return PTR_ERR(pobj);
> > +
> > +       list_add(&pobj->node, &perf_dev->children);
> > +
> > +       for (i = 0; i < PERF_MAX_PORT_NUM; i++) {
> > +               obj = create_perf_obj(perf_dev->feature, &pobj->kobj, i,
> > +                                     fabric_attr_groups, "port");
> > +               if (IS_ERR(obj))
> > +                       return PTR_ERR(obj);
> > +
> > +               list_add(&obj->node, &pobj->children);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int fme_perf_init(struct platform_device *pdev,
> > +                        struct dfl_feature *feature)
> > +{
> > +       struct perf_object *perf_dev;
> > +       int ret;
> > +
> > +       perf_dev = create_perf_dev(feature);
> > +       if (IS_ERR(perf_dev))
> > +               return PTR_ERR(perf_dev);
> > +
> > +       ret = create_perf_fabric_obj(perf_dev);
> > +       if (ret)
> > +               goto done;
> > +
> > +       if (feature->id == FME_FEATURE_ID_GLOBAL_IPERF) {
> > +               /*
> > +                * Cache and IOMMU(VT-D) performance counters are not supported
> > +                * on discreted solutions e.g. Intel Programmable Acceleration
> > +                * Card based on PCIe.
> > +                */
> > +               ret = create_perf_cache_obj(perf_dev);
> > +               if (ret)
> > +                       goto done;
> > +
> > +               ret = create_perf_iommu_obj(perf_dev);
> > +               if (ret)
> > +                       goto done;
> > +       }
> > +
> > +       feature->priv = perf_dev;
> > +       return 0;
> > +
> > +done:
> > +       destroy_perf_obj(perf_dev);
> > +       return ret;
> > +}
> > +
> > +static void fme_perf_uinit(struct platform_device *pdev,
> > +                          struct dfl_feature *feature)
> > +{
> > +       struct perf_object *perf_dev = feature->priv;
> > +
> > +       destroy_perf_obj(perf_dev);
> > +}
> > +
> > +const struct dfl_feature_id fme_perf_id_table[] = {
> > +       {.id = FME_FEATURE_ID_GLOBAL_IPERF,},
> > +       {.id = FME_FEATURE_ID_GLOBAL_DPERF,},
> > +       {0,}
> > +};
> > +
> > +const struct dfl_feature_ops fme_perf_ops = {
> > +       .init = fme_perf_init,
> > +       .uinit = fme_perf_uinit,
> > +};
> > diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
> > index 5fbe3f5..dc71048 100644
> > --- a/drivers/fpga/dfl-fme.h
> > +++ b/drivers/fpga/dfl-fme.h
> > @@ -39,5 +39,7 @@ struct dfl_fme {
> >  extern const struct dfl_feature_id fme_pr_mgmt_id_table[];
> >  extern const struct dfl_feature_ops fme_global_err_ops;
> >  extern const struct dfl_feature_id fme_global_err_id_table[];
> > +extern const struct dfl_feature_ops fme_perf_ops;
> > +extern const struct dfl_feature_id fme_perf_id_table[];
> >
> >  #endif /* __DFL_FME_H */
> > diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> > index 65f91ef..637692a 100644
> > --- a/drivers/fpga/dfl.c
> > +++ b/drivers/fpga/dfl.c
> > @@ -507,6 +507,7 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
> >                 struct dfl_feature *feature = &pdata->features[index];
> >
> >                 /* save resource information for each feature */
> > +               feature->pdev = fdev;
> >                 feature->id = finfo->fid;
> >                 feature->resource_index = index;
> >                 feature->ioaddr = finfo->ioaddr;
> > diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
> > index 6c32080..bf23436 100644
> > --- a/drivers/fpga/dfl.h
> > +++ b/drivers/fpga/dfl.h
> > @@ -191,6 +191,7 @@ struct dfl_feature_driver {
> >  /**
> >   * struct dfl_feature - sub feature of the feature devices
> >   *
> > + * @pdev: parent platform device.
> >   * @id: sub feature id.
> >   * @resource_index: each sub feature has one mmio resource for its registers.
> >   *                 this index is used to find its mmio resource from the
> > @@ -200,6 +201,7 @@ struct dfl_feature_driver {
> >   * @priv: priv data of this feature.
> >   */
> >  struct dfl_feature {
> > +       struct platform_device *pdev;
> >         u64 id;
> >         int resource_index;
> >         void __iomem *ioaddr;
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR
  2019-05-16 17:35   ` Alan Tull
@ 2019-05-17  3:50     ` Wu Hao
  0 siblings, 0 replies; 42+ messages in thread
From: Wu Hao @ 2019-05-17  3:50 UTC (permalink / raw)
  To: Alan Tull
  Cc: Scott Wood, Moritz Fischer, linux-fpga, linux-kernel, linux-api,
	Ananda Ravuri, Xu Yilun

On Thu, May 16, 2019 at 12:35:27PM -0500, Alan Tull wrote:
> On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> It looks like this addressed the review comments.  Adding my Ack.  Is
> there anything else on this patch?

Nothing else, just addressed the review comments. : )

Thanks for the review and ack.

Hao

> 
> Alan
> 
> >
> > In early partial reconfiguration private feature, it only
> > supports 32bit data width when writing data to hardware for
> > PR. 512bit data width PR support is an important optimization
> > for some specific solutions (e.g. XEON with FPGA integrated),
> > it allows driver to use AVX512 instruction to improve the
> > performance of partial reconfiguration. e.g. programming one
> > 100MB bitstream image via this 512bit data width PR hardware
> > only takes ~300ms, but 32bit revision requires ~3s per test
> > result.
> >
> > Please note now this optimization is only done on revision 2
> > of this PR private feature which is only used in integrated
> > solution that AVX512 is always supported. This revision 2
> > hardware doesn't support 32bit PR.
> >
> > Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> 
> Acked-by: Alan Tull <atull@kernel.org>
> 
> 
> > ---
> > v2: check AVX512 support using cpu_feature_enabled()
> >     fix other comments from Scott Wood <swood@redhat.com>
> > ---
> >  drivers/fpga/dfl-fme-main.c |   3 ++
> >  drivers/fpga/dfl-fme-mgr.c  | 113 +++++++++++++++++++++++++++++++++++++-------
> >  drivers/fpga/dfl-fme-pr.c   |  43 +++++++++++------
> >  drivers/fpga/dfl-fme.h      |   2 +
> >  drivers/fpga/dfl.h          |   5 ++
> >  5 files changed, 135 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/fpga/dfl-fme-main.c b/drivers/fpga/dfl-fme-main.c
> > index 086ad24..076d74f 100644
> > --- a/drivers/fpga/dfl-fme-main.c
> > +++ b/drivers/fpga/dfl-fme-main.c
> > @@ -21,6 +21,8 @@
> >  #include "dfl.h"
> >  #include "dfl-fme.h"
> >
> > +#define DRV_VERSION    "0.8"
> > +
> >  static ssize_t ports_num_show(struct device *dev,
> >                               struct device_attribute *attr, char *buf)
> >  {
> > @@ -277,3 +279,4 @@ static int fme_remove(struct platform_device *pdev)
> >  MODULE_AUTHOR("Intel Corporation");
> >  MODULE_LICENSE("GPL v2");
> >  MODULE_ALIAS("platform:dfl-fme");
> > +MODULE_VERSION(DRV_VERSION);
> > diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c
> > index b3f7eee..d1a4ba5 100644
> > --- a/drivers/fpga/dfl-fme-mgr.c
> > +++ b/drivers/fpga/dfl-fme-mgr.c
> > @@ -22,14 +22,18 @@
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/fpga/fpga-mgr.h>
> >
> > +#include "dfl.h"
> >  #include "dfl-fme-pr.h"
> >
> > +#define DRV_VERSION    "0.8"
> > +
> >  /* FME Partial Reconfiguration Sub Feature Register Set */
> >  #define FME_PR_DFH             0x0
> >  #define FME_PR_CTRL            0x8
> >  #define FME_PR_STS             0x10
> >  #define FME_PR_DATA            0x18
> >  #define FME_PR_ERR             0x20
> > +#define FME_PR_512_DATA                0x40 /* Data Register for 512bit datawidth PR */
> >  #define FME_PR_INTFC_ID_L      0xA8
> >  #define FME_PR_INTFC_ID_H      0xB0
> >
> > @@ -67,8 +71,43 @@
> >  #define PR_WAIT_TIMEOUT   8000000
> >  #define PR_HOST_STATUS_IDLE    0
> >
> > +#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512)
> > +
> > +#include <linux/cpufeature.h>
> > +#include <asm/fpu/api.h>
> > +
> > +static inline int is_cpu_avx512_enabled(void)
> > +{
> > +       return cpu_feature_enabled(X86_FEATURE_AVX512F);
> > +}
> > +
> > +static inline void copy512(const void *src, void __iomem *dst)
> > +{
> > +       kernel_fpu_begin();
> > +
> > +       asm volatile("vmovdqu64 (%0), %%zmm0;"
> > +                    "vmovntdq %%zmm0, (%1);"
> > +                    :
> > +                    : "r"(src), "r"(dst)
> > +                    : "memory");
> > +
> > +       kernel_fpu_end();
> > +}
> > +#else
> > +static inline int is_cpu_avx512_enabled(void)
> > +{
> > +       return 0;
> > +}
> > +
> > +static inline void copy512(const void *src, void __iomem *dst)
> > +{
> > +       WARN_ON_ONCE(1);
> > +}
> > +#endif
> > +
> >  struct fme_mgr_priv {
> >         void __iomem *ioaddr;
> > +       unsigned int pr_datawidth;
> >         u64 pr_error;
> >  };
> >
> > @@ -169,7 +208,7 @@ static int fme_mgr_write(struct fpga_manager *mgr,
> >         struct fme_mgr_priv *priv = mgr->priv;
> >         void __iomem *fme_pr = priv->ioaddr;
> >         u64 pr_ctrl, pr_status, pr_data;
> > -       int delay = 0, pr_credit, i = 0;
> > +       int ret = 0, delay = 0, pr_credit;
> >
> >         dev_dbg(dev, "start request\n");
> >
> > @@ -181,9 +220,9 @@ static int fme_mgr_write(struct fpga_manager *mgr,
> >
> >         /*
> >          * driver can push data to PR hardware using PR_DATA register once HW
> > -        * has enough pr_credit (> 1), pr_credit reduces one for every 32bit
> > -        * pr data write to PR_DATA register. If pr_credit <= 1, driver needs
> > -        * to wait for enough pr_credit from hardware by polling.
> > +        * has enough pr_credit (> 1), pr_credit reduces one for every pr data
> > +        * width write to PR_DATA register. If pr_credit <= 1, driver needs to
> > +        * wait for enough pr_credit from hardware by polling.
> >          */
> >         pr_status = readq(fme_pr + FME_PR_STS);
> >         pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
> > @@ -192,7 +231,8 @@ static int fme_mgr_write(struct fpga_manager *mgr,
> >                 while (pr_credit <= 1) {
> >                         if (delay++ > PR_WAIT_TIMEOUT) {
> >                                 dev_err(dev, "PR_CREDIT timeout\n");
> > -                               return -ETIMEDOUT;
> > +                               ret = -ETIMEDOUT;
> > +                               goto done;
> >                         }
> >                         udelay(1);
> >
> > @@ -200,21 +240,27 @@ static int fme_mgr_write(struct fpga_manager *mgr,
> >                         pr_credit = FIELD_GET(FME_PR_STS_PR_CREDIT, pr_status);
> >                 }
> >
> > -               if (count < 4) {
> > -                       dev_err(dev, "Invalid PR bitstream size\n");
> > -                       return -EINVAL;
> > +               WARN_ON(count < priv->pr_datawidth);
> > +
> > +               switch (priv->pr_datawidth) {
> > +               case 4:
> > +                       pr_data = FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
> > +                                            *(u32 *)buf);
> > +                       writeq(pr_data, fme_pr + FME_PR_DATA);
> > +                       break;
> > +               case 64:
> > +                       copy512(buf, fme_pr + FME_PR_512_DATA);
> > +                       break;
> > +               default:
> > +                       WARN_ON_ONCE(1);
> >                 }
> > -
> > -               pr_data = 0;
> > -               pr_data |= FIELD_PREP(FME_PR_DATA_PR_DATA_RAW,
> > -                                     *(((u32 *)buf) + i));
> > -               writeq(pr_data, fme_pr + FME_PR_DATA);
> > -               count -= 4;
> > +               buf += priv->pr_datawidth;
> > +               count -= priv->pr_datawidth;
> >                 pr_credit--;
> > -               i++;
> >         }
> >
> > -       return 0;
> > +done:
> > +       return ret;
> >  }
> >
> >  static int fme_mgr_write_complete(struct fpga_manager *mgr,
> > @@ -279,6 +325,36 @@ static void fme_mgr_get_compat_id(void __iomem *fme_pr,
> >         id->id_h = readq(fme_pr + FME_PR_INTFC_ID_H);
> >  }
> >
> > +static u8 fme_mgr_get_pr_datawidth(struct device *dev, void __iomem *fme_pr)
> > +{
> > +       u8 revision = dfl_feature_revision(fme_pr);
> > +
> > +       if (revision < 2) {
> > +               /*
> > +                * revision 0 and 1 only support 32bit data width partial
> > +                * reconfiguration, so pr_datawidth is 4 (Byte).
> > +                */
> > +               return 4;
> > +       } else if (revision == 2) {
> > +               /*
> > +                * revision 2 hardware has optimization to support 512bit data
> > +                * width partial reconfiguration with AVX512 instructions. So
> > +                * pr_datawidth is 64 (Byte). As revision 2 hardware is only
> > +                * used in integrated solution, CPU supports AVX512 instructions
> > +                * for sure, but it still needs to check here as AVX512 could be
> > +                * disabled in kernel (e.g. using clearcpuid boot option).
> > +                */
> > +               if (is_cpu_avx512_enabled())
> > +                       return 64;
> > +
> > +               dev_err(dev, "revision 2: AVX512 is disabled\n");
> > +               return 0;
> > +       }
> > +
> > +       dev_err(dev, "revision %d is not supported yet\n", revision);
> > +       return 0;
> > +}
> > +
> >  static int fme_mgr_probe(struct platform_device *pdev)
> >  {
> >         struct dfl_fme_mgr_pdata *pdata = dev_get_platdata(&pdev->dev);
> > @@ -302,6 +378,10 @@ static int fme_mgr_probe(struct platform_device *pdev)
> >                         return PTR_ERR(priv->ioaddr);
> >         }
> >
> > +       priv->pr_datawidth = fme_mgr_get_pr_datawidth(dev, priv->ioaddr);
> > +       if (!priv->pr_datawidth)
> > +               return -ENODEV;
> > +
> >         compat_id = devm_kzalloc(dev, sizeof(*compat_id), GFP_KERNEL);
> >         if (!compat_id)
> >                 return -ENOMEM;
> > @@ -342,3 +422,4 @@ static int fme_mgr_remove(struct platform_device *pdev)
> >  MODULE_AUTHOR("Intel Corporation");
> >  MODULE_LICENSE("GPL v2");
> >  MODULE_ALIAS("platform:dfl-fme-mgr");
> > +MODULE_VERSION(DRV_VERSION);
> > diff --git a/drivers/fpga/dfl-fme-pr.c b/drivers/fpga/dfl-fme-pr.c
> > index 3c71dc3..cd94ba8 100644
> > --- a/drivers/fpga/dfl-fme-pr.c
> > +++ b/drivers/fpga/dfl-fme-pr.c
> > @@ -83,7 +83,7 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
> >         if (copy_from_user(&port_pr, argp, minsz))
> >                 return -EFAULT;
> >
> > -       if (port_pr.argsz < minsz || port_pr.flags)
> > +       if (port_pr.argsz < minsz || port_pr.flags || !port_pr.buffer_size)
> >                 return -EINVAL;
> >
> >         /* get fme header region */
> > @@ -101,15 +101,25 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
> >                        port_pr.buffer_size))
> >                 return -EFAULT;
> >
> > +       mutex_lock(&pdata->lock);
> > +       fme = dfl_fpga_pdata_get_private(pdata);
> > +       /* fme device has been unregistered. */
> > +       if (!fme) {
> > +               ret = -EINVAL;
> > +               goto unlock_exit;
> > +       }
> > +
> >         /*
> >          * align PR buffer per PR bandwidth, as HW ignores the extra padding
> >          * data automatically.
> >          */
> > -       length = ALIGN(port_pr.buffer_size, 4);
> > +       length = ALIGN(port_pr.buffer_size, fme->pr_datawidth);
> >
> >         buf = vmalloc(length);
> > -       if (!buf)
> > -               return -ENOMEM;
> > +       if (!buf) {
> > +               ret = -ENOMEM;
> > +               goto unlock_exit;
> > +       }
> >
> >         if (copy_from_user(buf,
> >                            (void __user *)(unsigned long)port_pr.buffer_address,
> > @@ -127,18 +137,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
> >
> >         info->flags |= FPGA_MGR_PARTIAL_RECONFIG;
> >
> > -       mutex_lock(&pdata->lock);
> > -       fme = dfl_fpga_pdata_get_private(pdata);
> > -       /* fme device has been unregistered. */
> > -       if (!fme) {
> > -               ret = -EINVAL;
> > -               goto unlock_exit;
> > -       }
> > -
> >         region = dfl_fme_region_find(fme, port_pr.port_id);
> >         if (!region) {
> >                 ret = -EINVAL;
> > -               goto unlock_exit;
> > +               goto free_exit;
> >         }
> >
> >         fpga_image_info_free(region->info);
> > @@ -159,10 +161,10 @@ static int fme_pr(struct platform_device *pdev, unsigned long arg)
> >                 fpga_bridges_put(&region->bridge_list);
> >
> >         put_device(&region->dev);
> > -unlock_exit:
> > -       mutex_unlock(&pdata->lock);
> >  free_exit:
> >         vfree(buf);
> > +unlock_exit:
> > +       mutex_unlock(&pdata->lock);
> >         return ret;
> >  }
> >
> > @@ -388,6 +390,17 @@ static int pr_mgmt_init(struct platform_device *pdev,
> >         mutex_lock(&pdata->lock);
> >         priv = dfl_fpga_pdata_get_private(pdata);
> >
> > +       /*
> > +        * Initialize PR data width.
> > +        * Only revision 2 supports 512bit datawidth for better performance,
> > +        * other revisions use default 32bit datawidth. This is used for
> > +        * buffer alignment.
> > +        */
> > +       if (dfl_feature_revision(feature->ioaddr) == 2)
> > +               priv->pr_datawidth = 64;
> > +       else
> > +               priv->pr_datawidth = 4;
> > +
> >         /* Initialize the region and bridge sub device list */
> >         INIT_LIST_HEAD(&priv->region_list);
> >         INIT_LIST_HEAD(&priv->bridge_list);
> > diff --git a/drivers/fpga/dfl-fme.h b/drivers/fpga/dfl-fme.h
> > index 5394a21..de20755 100644
> > --- a/drivers/fpga/dfl-fme.h
> > +++ b/drivers/fpga/dfl-fme.h
> > @@ -21,12 +21,14 @@
> >  /**
> >   * struct dfl_fme - dfl fme private data
> >   *
> > + * @pr_datawidth: data width for partial reconfiguration.
> >   * @mgr: FME's FPGA manager platform device.
> >   * @region_list: linked list of FME's FPGA regions.
> >   * @bridge_list: linked list of FME's FPGA bridges.
> >   * @pdata: fme platform device's pdata.
> >   */
> >  struct dfl_fme {
> > +       int pr_datawidth;
> >         struct platform_device *mgr;
> >         struct list_head region_list;
> >         struct list_head bridge_list;
> > diff --git a/drivers/fpga/dfl.h b/drivers/fpga/dfl.h
> > index a8b869e..8851c6c 100644
> > --- a/drivers/fpga/dfl.h
> > +++ b/drivers/fpga/dfl.h
> > @@ -331,6 +331,11 @@ static inline bool dfl_feature_is_port(void __iomem *base)
> >                 (FIELD_GET(DFH_ID, v) == DFH_ID_FIU_PORT);
> >  }
> >
> > +static inline u8 dfl_feature_revision(void __iomem *base)
> > +{
> > +       return (u8)FIELD_GET(DFH_REVISION, readq(base + DFH));
> > +}
> > +
> >  /**
> >   * struct dfl_fpga_enum_info - DFL FPGA enumeration information
> >   *
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces.
  2019-05-16 17:53     ` Alan Tull
@ 2019-05-17  4:11       ` Wu Hao
  2019-05-20 18:21         ` Alan Tull
  0 siblings, 1 reply; 42+ messages in thread
From: Wu Hao @ 2019-05-17  4:11 UTC (permalink / raw)
  To: Alan Tull; +Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Thu, May 16, 2019 at 12:53:00PM -0500, Alan Tull wrote:
> On Thu, May 16, 2019 at 12:36 PM Alan Tull <atull@kernel.org> wrote:
> >
> > On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> Hi Hao,
> 
> Most of this patchset looks ready to go upstream or nearly so with
> pretty straightforward changes .  Patches 17 and 18 need minor changes
> and please change the scnprintf in the other patches.  The patches
> that had nontrivial changes are the power and thermal ones involving
> hwmon.  I'm hoping to send up the patchset minus the hwmon patches in
> the next version if there's no unforseen issues.  If the hwmon patches
> are ready then also, that's great, but otherwise those patches don't
> need to hold up all the rest of the patchset.  How's that sound?

Hi Alan

Thanks for your time for reviewing this patchset.

This sounds good to me. Only thing here is, I need to split the patch which
updates documentation into 2 patches (to remove hwmon description in doc),
but for sure, it should be very easy. :)

Thanks
Hao

> 
> Alan
> 
> > >
> > > This patch adds virtualization support description for DFL based
> > > FPGA devices (based on PCIe SRIOV), and introductions to new
> > > interfaces added by new dfl private feature drivers.
> > >
> > > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > > Signed-off-by: Wu Hao <hao.wu@intel.com>
> >
> > Acked-by: Alan Tull <atull@kernel.org>
> >
> > Thanks,
> > Alan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces.
  2019-05-17  4:11       ` Wu Hao
@ 2019-05-20 18:21         ` Alan Tull
  0 siblings, 0 replies; 42+ messages in thread
From: Alan Tull @ 2019-05-20 18:21 UTC (permalink / raw)
  To: Wu Hao; +Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Xu Yilun

On Thu, May 16, 2019 at 11:27 PM Wu Hao <hao.wu@intel.com> wrote:
>
> On Thu, May 16, 2019 at 12:53:00PM -0500, Alan Tull wrote:
> > On Thu, May 16, 2019 at 12:36 PM Alan Tull <atull@kernel.org> wrote:
> > >
> > > On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:
> >
> > Hi Hao,
> >
> > Most of this patchset looks ready to go upstream or nearly so with
> > pretty straightforward changes .  Patches 17 and 18 need minor changes
> > and please change the scnprintf in the other patches.  The patches
> > that had nontrivial changes are the power and thermal ones involving
> > hwmon.  I'm hoping to send up the patchset minus the hwmon patches in
> > the next version if there's no unforseen issues.  If the hwmon patches
> > are ready then also, that's great, but otherwise those patches don't
> > need to hold up all the rest of the patchset.  How's that sound?
>
> Hi Alan
>
> Thanks for your time for reviewing this patchset.
>
> This sounds good to me. Only thing here is, I need to split the patch which
> updates documentation into 2 patches (to remove hwmon description in doc),
> but for sure, it should be very easy. :)

Yes that sounds good.

Thanks,
Alan


>
> Thanks
> Hao
>
> >
> > Alan
> >
> > > >
> > > > This patch adds virtualization support description for DFL based
> > > > FPGA devices (based on PCIe SRIOV), and introductions to new
> > > > interfaces added by new dfl private feature drivers.
> > > >
> > > > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > > > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > >
> > > Acked-by: Alan Tull <atull@kernel.org>
> > >
> > > Thanks,
> > > Alan

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2019-05-20 18:22 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-29  8:55 [PATCH v2 00/18] add new features for FPGA DFL drivers Wu Hao
2019-04-29  8:55 ` [PATCH v2 01/18] fpga: dfl-fme-mgr: fix FME_PR_INTFC_ID register address Wu Hao
2019-04-29  8:55 ` [PATCH v2 02/18] fpga: dfl: fme: remove copy_to_user() in ioctl for PR Wu Hao
2019-05-07 17:26   ` Moritz Fischer
2019-05-08 17:58     ` Alan Tull
2019-04-29  8:55 ` [PATCH v2 03/18] fpga: dfl: fme: align PR buffer size per PR datawidth Wu Hao
2019-05-07 17:27   ` Moritz Fischer
2019-04-29  8:55 ` [PATCH v2 04/18] fpga: dfl: fme: support 512bit data width PR Wu Hao
2019-05-16 17:35   ` Alan Tull
2019-05-17  3:50     ` Wu Hao
2019-04-29  8:55 ` [PATCH v2 05/18] Documentation: fpga: dfl: add descriptions for virtualization and new interfaces Wu Hao
2019-05-16 17:36   ` Alan Tull
2019-05-16 17:53     ` Alan Tull
2019-05-17  4:11       ` Wu Hao
2019-05-20 18:21         ` Alan Tull
2019-04-29  8:55 ` [PATCH v2 06/18] fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support Wu Hao
2019-05-07 17:33   ` Moritz Fischer
2019-04-29  8:55 ` [PATCH v2 07/18] fpga: dfl: pci: enable SRIOV support Wu Hao
2019-05-07 17:35   ` Moritz Fischer
2019-04-29  8:55 ` [PATCH v2 08/18] fpga: dfl: afu: add AFU state related sysfs interfaces Wu Hao
2019-04-29  8:55 ` [PATCH v2 09/18] fpga: dfl: afu: add userclock " Wu Hao
2019-04-29  8:55 ` [PATCH v2 10/18] fpga: dfl: add id_table for dfl private feature driver Wu Hao
2019-04-29  8:55 ` [PATCH v2 11/18] fpga: dfl: afu: export __port_enable/disable function Wu Hao
2019-04-29  8:55 ` [PATCH v2 12/18] fpga: dfl: afu: add error reporting support Wu Hao
2019-05-09 14:41   ` Alan Tull
2019-04-29  8:55 ` [PATCH v2 13/18] fpga: dfl: afu: add STP (SignalTap) support Wu Hao
2019-04-29  8:55 ` [PATCH v2 14/18] fpga: dfl: fme: add capability sysfs interfaces Wu Hao
2019-04-29  8:55 ` [PATCH v2 15/18] fpga: dfl: fme: add thermal management support Wu Hao
2019-05-07 18:20   ` Alan Tull
2019-05-07 18:35     ` Guenter Roeck
2019-05-08  6:07       ` Wu Hao
2019-05-07 18:30   ` Moritz Fischer
2019-05-08  6:11     ` Wu Hao
2019-04-29  8:55 ` [PATCH v2 16/18] fpga: dfl: fme: add power " Wu Hao
2019-05-07 18:23   ` Alan Tull
2019-05-07 18:36     ` Guenter Roeck
2019-04-29  8:55 ` [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support Wu Hao
2019-05-09 16:27   ` Alan Tull
2019-05-10  2:23     ` Wu Hao
2019-04-29  8:55 ` [PATCH v2 18/18] fpga: dfl: fme: add performance " Wu Hao
2019-05-16 17:28   ` Alan Tull
2019-05-17  3:48     ` Wu Hao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).