linux-nvdimm.lists.01.org archive mirror
* [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics
@ 2020-05-18 11:08 Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 1/4] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP Vaibhav Jain
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Vaibhav Jain @ 2020-05-18 11:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm
  Cc: Vaibhav Jain, Aneesh Kumar K . V, Michael Ellerman

This patch-set proposes adding support for fetching and reporting
performance statistics for PAPR compliant NVDIMMs, as described in the
documentation for the H_SCM_PERFORMANCE_STATS hcall, Ref[1]. It also
implements mechanisms to expose NVDIMM performance stats via sysfs and
via newly introduced PDSMs[2] for libndctl.

Combined with the corresponding ndctl and libndctl changes proposed at
Ref[3], this patch-set should enable a user to fetch performance stats
for PAPR compliant NVDIMMs using the following command:

 # ndctl list -D --stats
[
  {
    "dev":"nmem0",
    "stats":{
      "Controller Reset Count":2,
      "Controller Reset Elapsed Time":603331,
      "Power-on Seconds":603931,
      "Life Remaining":"100%",
      "Critical Resource Utilization":"0%",
      "Host Load Count":5781028,
      "Host Store Count":8966800,
      "Host Load Duration":975895365,
      "Host Store Duration":716230690,
      "Media Read Count":0,
      "Media Write Count":6313,
      "Media Read Duration":0,
      "Media Write Duration":9679615,
      "Cache Read Hit Count":5781028,
      "Cache Write Hit Count":8442479,
      "Fast Write Count":8969912
    }
  }
]

The patch-set depends on the existing patch-set "[PATCH v7 0/5]
powerpc/papr_scm: Add support for reporting nvdimm health", available
at Ref[2], which adds support for reporting the health of PAPR
compliant NVDIMMs in the 'papr_scm' kernel module.

Structure of the patch-set
==========================

Patch-1 implements functionality in the papr_scm module to issue the
H_SCM_PERFORMANCE_STATS hcall, fetch and parse dimm performance stats,
and expose them as a PAPR specific libnvdimm attribute named
'perf_stats'.

Patch-2 introduces a new PDSM named FETCH_PERF_STATS that libndctl can
issue to ask papr_scm to issue the H_SCM_PERFORMANCE_STATS hcall using
the helpers introduced earlier and to store the results in a dimm
specific perf-stats-buffer.

Patch-3 introduces a new PDSM named READ_PERF_STATS that libndctl can
issue to read the perf-stats-buffer incrementally, working around the
256-byte envelope limit of libnvdimm.

Finally, Patch-4 introduces a new PDSM named GET_PERF_STAT that
libndctl can issue to read the value of a specific NVDIMM performance
stat such as "Life Remaining".

References
==========
[1] Documentation/powerpc/papr_hcalls.rst

[2] https://lore.kernel.org/linux-nvdimm/20200508104922.72565-1-vaibhav@linux.ibm.com/

[3] https://github.com/vaibhav92/ndctl/tree/papr_scm_stats_v1

Vaibhav Jain (4):
  powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
  powerpc/papr_scm: Add support for PAPR_SCM_PDSM_FETCH_PERF_STATS
  powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_READ_PERF_STATS
  powerpc/papr_scm: Add support for PDSM GET_PERF_STAT

 Documentation/ABI/testing/sysfs-bus-papr-scm  |  27 ++
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h |  60 +++
 arch/powerpc/platforms/pseries/papr_scm.c     | 391 ++++++++++++++++++
 3 files changed, 478 insertions(+)

-- 
2.26.2
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* [RFC PATCH 1/4] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
  2020-05-18 11:08 [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Vaibhav Jain
@ 2020-05-18 11:08 ` Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 2/4] powerpc/papr_scm: Add support for PAPR_SCM_PDSM_FETCH_PERF_STATS Vaibhav Jain
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Vaibhav Jain @ 2020-05-18 11:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm
  Cc: Vaibhav Jain, Aneesh Kumar K . V, Michael Ellerman

Update papr_scm.c to query dimm performance statistics from PHYP via
the H_SCM_PERFORMANCE_STATS hcall and export them to userspace as a
PAPR specific NVDIMM attribute 'perf_stats' in sysfs. The patch also
provides sysfs ABI documentation for the stats being reported and
their meanings.
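
For readers who want to consume the 'perf_stats' attribute from
userspace, here is a minimal sketch of parsing one line of its output.
It assumes the line format produced by the "%.8s = 0x%016llX\n" format
string in perf_stats_show(); the helper name parse_perf_stat_line() is
hypothetical and not part of this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define STAT_ID_LEN 8

/*
 * Parse one line of the form "<8-char id> = 0x<hex value>".
 * Returns 0 on success and -1 if the line does not match.
 * 'id' must have room for STAT_ID_LEN + 1 bytes.
 */
static int parse_perf_stat_line(const char *line, char *id, uint64_t *val)
{
	static const char sep[] = " = 0x";
	const size_t sep_len = sizeof(sep) - 1;
	const char *digits;
	char *end;

	assert(line && id && val);

	if (strlen(line) < STAT_ID_LEN + sep_len + 1)
		return -1;
	if (strncmp(line + STAT_ID_LEN, sep, sep_len) != 0)
		return -1;

	/* The id is a fixed-width field and may end in a space. */
	memcpy(id, line, STAT_ID_LEN);
	id[STAT_ID_LEN] = '\0';

	digits = line + STAT_ID_LEN + sep_len;
	*val = strtoull(digits, &end, 16);

	/* Require at least one digit and nothing but a newline after it. */
	if (end == digits)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return 0;
}
```

Keeping the id as a fixed 8-byte field preserves the trailing space
that some identifiers such as "PonSecs " carry.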

During NVDIMM probe, papr_scm_nvdimm_init() issues a special variant
of the H_SCM_PERFORMANCE_STATS hcall to check whether collection of
performance statistics is supported. If it is, a per-nvdimm
performance stats buffer of the size returned by PHYP is allocated
and stored, along with its length, in two newly introduced NVDIMM
private struct members, 'perf_stats' and 'len_stat_buffer'.
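
As an illustration of the buffer layout exchanged with PHYP, here is a
userspace sketch that decodes a stats buffer byte by byte. The layout
(8-byte eye-catcher, 32-bit version, 32-bit count, then 16-byte
entries, all big-endian) is taken from 'struct papr_scm_perf_stats' in
this patch. The helper names are hypothetical, and checking the
"SCMSTATS" eye-catcher on the returned buffer is an assumption of the
sketch, not something the driver does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN   16u	/* 8-byte eye-catcher + two 32-bit fields */
#define ENTRY_LEN 16u	/* 8-byte id + 64-bit value */

/* Read big-endian integers byte by byte; works on any host endianness. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t get_be64(const uint8_t *p)
{
	return ((uint64_t)get_be32(p) << 32) | get_be32(p + 4);
}

/*
 * Validate the header and return the number of stats in 'buf',
 * or -1 if the buffer is malformed or too short for that count.
 */
static long stats_count(const uint8_t *buf, size_t len)
{
	uint32_t n;

	assert(buf);
	if (len < HDR_LEN || memcmp(buf, "SCMSTATS", 8) != 0)
		return -1;
	n = get_be32(buf + 12);
	if (n > (len - HDR_LEN) / ENTRY_LEN)
		return -1;
	return (long)n;
}

/* Return the value of entry 'idx'; the caller checks idx < stats_count(). */
static uint64_t stats_value(const uint8_t *buf, uint32_t idx)
{
	return get_be64(buf + HDR_LEN + (size_t)idx * ENTRY_LEN + 8);
}
```

Bounding the entry count by the buffer length guards against a count
field that claims more entries than the buffer can hold.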

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 Documentation/ABI/testing/sysfs-bus-papr-scm |  27 ++++
 arch/powerpc/platforms/pseries/papr_scm.c    | 156 +++++++++++++++++++
 2 files changed, 183 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-scm b/Documentation/ABI/testing/sysfs-bus-papr-scm
index 6143d06072f1..ad06b3e9c315 100644
--- a/Documentation/ABI/testing/sysfs-bus-papr-scm
+++ b/Documentation/ABI/testing/sysfs-bus-papr-scm
@@ -25,3 +25,30 @@ Description:
 				  NVDIMM have been scrubbed.
 		* "locked"	: Indicating that NVDIMM contents cant
 				  be modified until next power cycle.
+
+What:		/sys/bus/nd/devices/nmemX/papr/perf_stats
+Date:		May, 2020
+KernelVersion:	v5.8
+Contact:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org
+Description:
+		(RO) Report various performance stats related to papr-scm NVDIMM
+		device.  Each stat is reported on a new line with each line
+		composed of a stat-identifier followed by its value. Below are
+		currently known dimm performance stats which are reported:
+
+		* "CtlResCt" : Controller Reset Count
+		* "CtlResTm" : Controller Reset Elapsed Time
+		* "PonSecs " : Power-on Seconds
+		* "MemLife " : Life Remaining
+		* "CritRscU" : Critical Resource Utilization
+		* "HostLCnt" : Host Load Count
+		* "HostSCnt" : Host Store Count
+		* "HostSDur" : Host Store Duration
+		* "HostLDur" : Host Load Duration
+		* "MedRCnt " : Media Read Count
+		* "MedWCnt " : Media Write Count
+		* "MedRDur " : Media Read Duration
+		* "MedWDur " : Media Write Duration
+		* "CchRHCnt" : Cache Read Hit Count
+		* "CchWHCnt" : Cache Write Hit Count
+		* "FastWCnt" : Fast Write Count
\ No newline at end of file
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index c59bf17ad054..fd9a12275315 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -62,6 +62,24 @@
 					PAPR_SCM_DIMM_HEALTH_FATAL |	\
 					PAPR_SCM_DIMM_HEALTH_UNHEALTHY)
 
+#define PAPR_SCM_PERF_STATS_EYECATCHER __stringify(SCMSTATS)
+#define PAPR_SCM_PERF_STATS_VERSION 0x1
+
+/* Struct holding a single performance metric */
+struct papr_scm_perf_stat {
+	u8 statistic_id[8];
+	u64 statistic_value;
+};
+
+/* Struct exchanged between kernel and PHYP for fetching drc perf stats */
+struct papr_scm_perf_stats {
+	u8 eye_catcher[8];
+	u32 stats_version;		/* Should be 0x01 */
+	u32 num_statistics;		/* Number of stats following */
+	/* zero or more performance metrics */
+	struct papr_scm_perf_stat scm_statistics[];
+} __packed;
+
 /* private struct associated with each region */
 struct papr_scm_priv {
 	struct platform_device *pdev;
@@ -89,6 +107,12 @@ struct papr_scm_priv {
 
 	/* Health information for the dimm */
 	struct nd_papr_pdsm_health health;
+
+	/* length of the stat buffer as expected by phyp */
+	size_t len_stat_buffer;
+
+	/* Cached version of all performance stats */
+	struct papr_scm_perf_stats *perf_stats;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -194,6 +218,75 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 	return drc_pmem_bind(p);
 }
 
+/*
+ * Query the Dimm performance stats from PHYP and copy them (if returned) to
+ * provided struct papr_scm_perf_stats instance 'stats' of 'size' in bytes.
+ * The value of R4 is copied to 'out' if the pointer is provided.
+ */
+static int drc_pmem_query_stats(struct papr_scm_priv *p,
+				struct papr_scm_perf_stats *buff_stats,
+				size_t size, unsigned int num_stats,
+				uint64_t *out)
+{
+	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	struct papr_scm_perf_stat *stats;
+	s64 rc, i;
+
+	/* Setup the out buffer */
+	if (buff_stats) {
+		memcpy(buff_stats->eye_catcher,
+		       PAPR_SCM_PERF_STATS_EYECATCHER, 8);
+		buff_stats->stats_version =
+			cpu_to_be32(PAPR_SCM_PERF_STATS_VERSION);
+		buff_stats->num_statistics =
+			cpu_to_be32(num_stats);
+	} else {
+		/* In case of no out buffer ignore the size */
+		size = 0;
+	}
+
+	/*
+	 * Do the HCALL asking PHYP for info and if R4 was requested
+	 * return its value in 'out' variable.
+	 */
+	rc = plpar_hcall(H_SCM_PERFORMANCE_STATS, ret, p->drc_index,
+			 buff_stats ? __pa(buff_stats) : 0, size);
+	if (out)
+		*out = ret[0];
+
+	if (rc == H_PARTIAL) {
+		dev_err(&p->pdev->dev,
+			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
+		return -ENOENT;
+	} else if (rc != H_SUCCESS) {
+		dev_err(&p->pdev->dev,
+			"Failed to query performance stats, Err:%lld\n", rc);
+		return -ENXIO;
+	}
+
+	/* Successfully fetched the requested stats from phyp */
+	if (size != 0) {
+		buff_stats->num_statistics =
+			be32_to_cpu(buff_stats->num_statistics);
+
+		/* Transform the stats buffer values from BE to cpu native */
+		for (i = 0, stats = buff_stats->scm_statistics;
+		     i < buff_stats->num_statistics; ++i) {
+			stats[i].statistic_value =
+				be64_to_cpu(stats[i].statistic_value);
+		}
+		dev_dbg(&p->pdev->dev,
+			"Performance stats returned %d stats\n",
+			buff_stats->num_statistics);
+	} else {
+		/* Handle case where stat buffer size was requested */
+		dev_dbg(&p->pdev->dev,
+			"Performance stats size %ld\n", ret[0]);
+	}
+
+	return 0;
+}
+
 /*
  * Issue hcall to retrieve dimm health info and populate papr_scm_priv with the
  * health information.
@@ -563,6 +656,47 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 	return *cmd_rc;
 }
 
+static ssize_t perf_stats_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	int index, rc;
+	struct seq_buf s;
+	struct nvdimm *dimm = to_nvdimm(dev);
+	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
+	struct papr_scm_perf_stat *stats = p->perf_stats->scm_statistics;
+
+	if (!p->len_stat_buffer)
+		return -ENOENT;
+
+	seq_buf_init(&s, buf, PAGE_SIZE);
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		return rc;
+
+	/* Ask phyp to return all dimm perf stats */
+	rc = drc_pmem_query_stats(p, p->perf_stats, p->len_stat_buffer, 0,
+				  NULL);
+	if (!rc) {
+		/*
+		 * Go through the returned output buffer and print stats and
+		 * values. Since statistic_id is essentially a char string of
+		 * 8 bytes, simply use the string format specifier to print it.
+		 */
+		for (index = 0; index < p->perf_stats->num_statistics;
+		     ++index) {
+			seq_buf_printf(&s, "%.8s = 0x%016llX\n",
+				       stats[index].statistic_id,
+				       stats[index].statistic_value);
+		}
+	}
+
+	mutex_unlock(&p->health_mutex);
+	return rc ? rc : seq_buf_used(&s);
+}
+DEVICE_ATTR_RO(perf_stats);
+
 static ssize_t flags_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
@@ -615,6 +749,7 @@ DEVICE_ATTR_RO(flags);
 /* papr_scm specific dimm attributes */
 static struct attribute *papr_scm_nd_attributes[] = {
 	&dev_attr_flags.attr,
+	&dev_attr_perf_stats.attr,
 	NULL,
 };
 
@@ -635,6 +770,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	struct nd_region_desc ndr_desc;
 	unsigned long dimm_flags;
 	int target_nid, online_nid;
+	u64 stat_size;
 
 	p->bus_desc.ndctl = papr_scm_ndctl;
 	p->bus_desc.module = THIS_MODULE;
@@ -698,6 +834,26 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 		dev_info(dev, "Region registered with target node %d and online node %d",
 			 target_nid, online_nid);
 
+	/* Try retrieving the stat buffer size to see if it's supported */
+	if (!drc_pmem_query_stats(p, NULL, 0, 0, &stat_size)) {
+		/* Allocate the buffer for phyp where stats are written */
+		p->perf_stats = kzalloc(stat_size, GFP_KERNEL);
+
+		/* Failed allocation is non fatal and results in limited data */
+		if (!p->perf_stats)
+			dev_dbg(&p->pdev->dev,
+				"Unable to allocate %llu bytes for perf-stats",
+				stat_size);
+		else
+			p->len_stat_buffer = (size_t)stat_size;
+	} else {
+		dev_dbg(&p->pdev->dev, "Unable to retrieve performance stats\n");
+	}
+
+	/* Check if perf-stats buffer was allocated */
+	if (!p->len_stat_buffer)
+		dev_info(&p->pdev->dev, "Limited dimm info available\n");
+
 	return 0;
 
 err:	nvdimm_bus_unregister(p->bus);
-- 
2.26.2

* [RFC PATCH 2/4] powerpc/papr_scm: Add support for PAPR_SCM_PDSM_FETCH_PERF_STATS
  2020-05-18 11:08 [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 1/4] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP Vaibhav Jain
@ 2020-05-18 11:08 ` Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 3/4] powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_READ_PERF_STATS Vaibhav Jain
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Vaibhav Jain @ 2020-05-18 11:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm
  Cc: Vaibhav Jain, Aneesh Kumar K . V, Michael Ellerman

Add support for the PDSM PAPR_SCM_PDSM_FETCH_PERF_STATS, which issues
the H_SCM_PERFORMANCE_STATS hcall to PHYP to fetch all NVDIMM
performance stats and stores them in the per-nvdimm 'struct
papr_scm_priv' member 'perf_stats'. A further PDSM request (introduced
later) is needed to read the contents of this performance stats buffer.

A new uapi struct 'nd_pdsm_fetch_perf_stats' is introduced for
libndctl to retrieve the size of the buffer needed to store all NVDIMM
performance stats.

The patch updates papr_scm_service_pdsm() to route
PAPR_SCM_PDSM_FETCH_PERF_STATS to the newly introduced
papr_scm_fetch_perf_stats(), which issues the hcall and copies the
needed size to the PDSM payload.
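
The size reported back in 'max_stats_size' can be sketched as below.
The two constants follow from the struct definitions in this series
(16-byte header, 16-byte stat entry); the helper names are
hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PERF_STATS_HDR_LEN 16u	/* eye_catcher[8] + version + count */
#define PERF_STAT_LEN      16u	/* 64-bit id + 64-bit value */

/* Payload bytes available for stats once the header is subtracted. */
static uint32_t max_stats_size(size_t len_stat_buffer)
{
	if (len_stat_buffer > PERF_STATS_HDR_LEN)
		return (uint32_t)(len_stat_buffer - PERF_STATS_HDR_LEN);
	return 0;
}

/* Number of whole stat entries a caller should expect to read. */
static uint32_t max_stat_entries(size_t len_stat_buffer)
{
	return max_stats_size(len_stat_buffer) / PERF_STAT_LEN;
}
```

A caller would size its receive buffer from max_stats_size() before
reading the stats back in chunks.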

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 13 ++++
 arch/powerpc/platforms/pseries/papr_scm.c     | 70 +++++++++++++++++++
 2 files changed, 83 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
index db0cf550dabe..40ec55d06f4c 100644
--- a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
@@ -114,6 +114,7 @@ struct nd_pdsm_cmd_pkg {
 enum papr_scm_pdsm {
 	PAPR_SCM_PDSM_MIN = 0x0,
 	PAPR_SCM_PDSM_HEALTH,
+	PAPR_SCM_PDSM_FETCH_PERF_STATS,
 	PAPR_SCM_PDSM_MAX,
 };
 
@@ -170,4 +171,16 @@ struct nd_papr_pdsm_health_v1 {
 /* Current version number for the dimm health struct */
 #define ND_PAPR_PDSM_HEALTH_VERSION 1
 
+/*
+ * Return the maximum buffer size needed to hold all performance stats.
+ * max_stats_size: The buffer size needed to hold all stat entries
+ */
+struct nd_pdsm_fetch_perf_stats_v1 {
+	__u32 max_stats_size;
+	__u8 reserved[4];
+} __packed;
+
+#define nd_pdsm_fetch_perf_stats nd_pdsm_fetch_perf_stats_v1
+#define ND_PDSM_FETCH_PERF_STATS_VERSION 1
+
 #endif /* _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index fd9a12275315..f8b37a830aed 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -525,6 +525,73 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 	return 0;
 }
 
+/* Return the size in bytes for returning all perf stats to libndctl */
+static int papr_scm_fetch_perf_stats(struct papr_scm_priv *p,
+				     struct nd_pdsm_cmd_pkg *pkg)
+{
+	int rc = 0;
+	size_t copysize = sizeof(struct nd_pdsm_fetch_perf_stats);
+	struct nd_pdsm_fetch_perf_stats *sz =
+		(struct nd_pdsm_fetch_perf_stats *)pdsm_cmd_to_payload(pkg);
+
+	/*
+	 * If the requested payload version is greater than one we know
+	 * about, return the payload version we know about and let
+	 * caller/userspace handle.
+	 */
+	if (pkg->payload_version > ND_PDSM_FETCH_PERF_STATS_VERSION)
+		pkg->payload_version = ND_PDSM_FETCH_PERF_STATS_VERSION;
+
+	if (pkg->hdr.nd_size_out < copysize) {
+		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
+			pkg->hdr.nd_size_out, copysize);
+		rc = -ENOSPC;
+		goto out;
+	}
+
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		goto out;
+
+	if (!p->len_stat_buffer) {
+		rc = -ENOENT;
+		goto out_unlock;
+	}
+
+	/* Setup the buffer and request phyp for all dimm perf stats data */
+	rc = drc_pmem_query_stats(p, p->perf_stats, p->len_stat_buffer, 0,
+				  NULL);
+	if (rc)
+		goto out_unlock;
+
+	dev_dbg(&p->pdev->dev, "Copying payload size=%lu version=0x%x\n",
+		copysize, pkg->payload_version);
+
+	/*
+	 * Put the buffer size needed in the payload buffer subtracting the
+	 * perf_stat header size.
+	 */
+	if (p->len_stat_buffer > sizeof(struct papr_scm_perf_stats))
+		sz->max_stats_size = p->len_stat_buffer -
+			sizeof(struct papr_scm_perf_stats);
+	else
+		sz->max_stats_size = 0;
+
+	pkg->hdr.nd_fw_size = copysize;
+
+out_unlock:
+	mutex_unlock(&p->health_mutex);
+out:
+	/*
+	 * Put the error in out package and return success from function
+	 * so that errors if any are propagated back to userspace.
+	 */
+	pkg->cmd_status = rc;
+	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
+
+	return 0;
+}
+
 /* Fetch the DIMM health info and populate it in provided package. */
 static int papr_scm_get_health(struct papr_scm_priv *p,
 			       struct nd_pdsm_cmd_pkg *pkg)
@@ -594,6 +661,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 	case PAPR_SCM_PDSM_HEALTH:
 		return papr_scm_get_health(p, call_pkg);
 
+	case PAPR_SCM_PDSM_FETCH_PERF_STATS:
+		return papr_scm_fetch_perf_stats(p, call_pkg);
+
 	default:
 		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
 			call_pkg->hdr.nd_command);
-- 
2.26.2

* [RFC PATCH 3/4] powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_READ_PERF_STATS
  2020-05-18 11:08 [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 1/4] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 2/4] powerpc/papr_scm: Add support for PAPR_SCM_PDSM_FETCH_PERF_STATS Vaibhav Jain
@ 2020-05-18 11:08 ` Vaibhav Jain
  2020-05-18 11:08 ` [RFC PATCH 4/4] powerpc/papr_scm: Add support for PDSM GET_PERF_STAT Vaibhav Jain
  2020-10-21 16:52 ` [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Michal Suchánek
  4 siblings, 0 replies; 6+ messages in thread
From: Vaibhav Jain @ 2020-05-18 11:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm
  Cc: Vaibhav Jain, Aneesh Kumar K . V, Michael Ellerman

Implement support for the PDSM READ_PERF_STATS, to be used by libndctl
to fetch all NVDIMM performance statistics. The stats are exchanged
via the newly introduced 'struct nd_pdsm_read_perf_stats', which is
allocated and sent by libndctl to papr_scm. The struct contains the
members 'in_offset' and 'in_length' to provide incremental access to
the performance statistics data buffer and work around the libnvdimm
limit of 256 bytes on envelope size.

The patch introduces a new function, papr_scm_read_perf_stats(), to
service this PDSM and copy the requested chunk of performance stats to
the libndctl provided payload buffer for the given offset and length.
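
The incremental access pattern can be sketched in userspace as below.
This is a simulation only: read_chunk() stands in for one
READ_PERF_STATS request and copies from an in-memory buffer instead of
issuing an ioctl, and all function names and the 'max_chunk' parameter
are hypothetical. The validity check is modelled on the 16-byte
offset/length rule described for this PDSM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STAT_ALIGN 16u

/* Offset must be in range; offset and length must be 16-byte multiples. */
static int chunk_valid(uint32_t offset, uint32_t length, uint32_t total)
{
	return offset < total && offset % STAT_ALIGN == 0 &&
	       length % STAT_ALIGN == 0;
}

/*
 * Stand-in for one READ_PERF_STATS request: copy 'length' bytes at 'offset'
 * from 'src' (the stats data, header already skipped) into 'dst'.
 * Returns the number of bytes copied, or -1 for a rejected request.
 */
static long read_chunk(const uint8_t *src, uint32_t total, uint32_t offset,
		       uint32_t length, uint8_t *dst)
{
	if (!chunk_valid(offset, length, total))
		return -1;
	if (length > total - offset)
		length = total - offset;	/* cap at end of buffer */
	memcpy(dst, src + offset, length);
	return (long)length;
}

/*
 * Read 'total' bytes in chunks of at most 'max_chunk' bytes.
 * Returns the number of requests issued, or -1 on failure.
 */
static int read_all_stats(const uint8_t *src, uint32_t total,
			  uint32_t max_chunk, uint8_t *dst)
{
	uint32_t chunk = max_chunk - max_chunk % STAT_ALIGN;
	uint32_t offset = 0;
	int requests = 0;

	assert(src && dst);
	if (chunk == 0 || total % STAT_ALIGN != 0)
		return -1;

	while (offset < total) {
		long got = read_chunk(src, total, offset, chunk, dst + offset);

		if (got <= 0)
			return -1;
		offset += (uint32_t)got;
		requests++;
	}
	return requests;
}
```

Rounding the chunk size down to a multiple of 16 means every request
covers whole stat entries, so no entry is ever split across two reads.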

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 35 +++++++
 arch/powerpc/platforms/pseries/papr_scm.c     | 91 +++++++++++++++++++
 2 files changed, 126 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
index 40ec55d06f4c..2db4ffdff285 100644
--- a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
@@ -115,6 +115,7 @@ enum papr_scm_pdsm {
 	PAPR_SCM_PDSM_MIN = 0x0,
 	PAPR_SCM_PDSM_HEALTH,
 	PAPR_SCM_PDSM_FETCH_PERF_STATS,
+	PAPR_SCM_PDSM_READ_PERF_STATS,
 	PAPR_SCM_PDSM_MAX,
 };
 
@@ -183,4 +184,38 @@ struct nd_pdsm_fetch_perf_stats_v1 {
 #define nd_pdsm_fetch_perf_stats nd_pdsm_fetch_perf_stats_v1
 #define ND_PDSM_FETCH_PERF_STATS_VERSION 1
 
+/*
+ * Holds a single performance stat. papr_scm owns a buffer that holds an array
+ * of all the available stats and their values. Access to the buffer is provided
+ * via the FETCH_PERF_STATS and READ_PERF_STATS PDSMs.
+ * id : id of the performance stat. Usually the ASCII encoded stat name.
+ * val : Non normalized value of the id.
+ */
+
+struct nd_pdsm_perf_stat {
+	__u64 id;
+	__u64 val;
+};
+
+/*
+ * Returns a chunk of performance stats buffer data to libndctl.
+ * This is needed to overcome the 256 byte envelope size limit enforced by
+ * libnvdimm.
+ * in_offset: The starting offset to perf stats data buffer.
+ * in_length: Length of data to be copied to 'stats_data'
+ * stats_data: Holds the chunk of requested perf stats data buffer.
+ *
+ * Note: To prevent races in reading performance stats, in_offset and in_length
+ * should be multiples of 16 bytes. If they are not, papr_scm will return an
+ * -EINVAL error.
+ */
+struct nd_pdsm_read_perf_stats_v1 {
+	__u32 in_offset;
+	__u32 in_length;
+	struct nd_pdsm_perf_stat stats_data[];
+} __packed;
+
+#define nd_pdsm_read_perf_stats nd_pdsm_read_perf_stats_v1
+#define ND_PDSM_READ_PERF_STATS_VERSION 1
+
 #endif /* _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f8b37a830aed..06744d7fe727 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -525,6 +525,94 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 	return 0;
 }
 
+/*
+ * Read the contents of dimm performance statistics buffer at the given
+ * 'in_offset' and copy 'in_length' number of bytes to the pkg payload.
+ * Both 'in_offset' and 'in_length' are expected to be in multiples of
+ * 16-Bytes to prevent a read/write race that may cause malformed values
+ * to be returned as performance statistics buffer content.
+ */
+static int papr_scm_read_perf_stats(struct papr_scm_priv *p,
+				    struct nd_pdsm_cmd_pkg *pkg)
+{
+	int rc;
+	struct nd_pdsm_read_perf_stats *stats =
+		(struct nd_pdsm_read_perf_stats *)pdsm_cmd_to_payload(pkg);
+	const size_t copysize = sizeof(struct nd_pdsm_read_perf_stats);
+	off_t offset;
+
+	/*
+	 * If the requested payload version is greater than one we know
+	 * about, return the payload version we know about and let
+	 * caller/userspace handle.
+	 */
+	if (pkg->payload_version > ND_PDSM_READ_PERF_STATS_VERSION)
+		pkg->payload_version = ND_PDSM_READ_PERF_STATS_VERSION;
+
+	if (pkg->hdr.nd_size_out < copysize) {
+		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
+			pkg->hdr.nd_size_out, copysize);
+		rc = -ENOSPC;
+		goto out;
+	}
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		goto out;
+
+	if (!p->len_stat_buffer) {
+		dev_dbg(&p->pdev->dev, "Perf stats: req for unsupported device");
+		rc = -ENOENT;
+		goto mutex_unlock_out;
+	}
+
+	/* calculate offset skipping the perf_stats buffer header */
+	offset = stats->in_offset + sizeof(*p->perf_stats);
+	/* Cap the copy length to the extent of the stats buffer */
+	stats->in_length = min(stats->in_length,
+			       (__u32)(p->len_stat_buffer - offset));
+
+	/*
+	 * Ensure that offset and length are valid and multiples of 16 bytes.
+	 * PDSM FETCH_PERF_STATS can interleave between PDSM READ_PERF_STATS
+	 * requests. This is a read/write race, so malformed performance stats
+	 * buffer contents may otherwise be returned.
+	 * A 16-byte alignment constraint forces a read granularity equal to
+	 * the size of each performance stat, and each stat is guaranteed to
+	 * remain stable while 'health_mutex' is held.
+	 */
+	if (offset >= p->len_stat_buffer || (offset % 16) ||
+	    (stats->in_length % 16)) {
+		dev_dbg(&p->pdev->dev,
+			"Perf stats: Invalid offset(0x%lx) or length(0x%x)",
+			offset, stats->in_length);
+		rc = -EINVAL;
+		goto mutex_unlock_out;
+	}
+
+	/* Put the stats buffer data in the payload buffer */
+	memcpy(stats->stats_data,
+	       (void *)p->perf_stats + offset, stats->in_length);
+
+	pkg->hdr.nd_fw_size = stats->in_length;
+
+	dev_dbg(&p->pdev->dev, "Copying payload size=%u version=0x%x\n",
+		stats->in_length, pkg->payload_version);
+
+mutex_unlock_out:
+	mutex_unlock(&p->health_mutex);
+out:
+	/*
+	 * Put the error in out package and return success from function
+	 * so that errors if any are propagated back to userspace.
+	 */
+	pkg->cmd_status = rc;
+	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
+
+	return 0;
+}
+
 /* Return the size in bytes for returning all perf stats to libndctl */
 static int papr_scm_fetch_perf_stats(struct papr_scm_priv *p,
 				     struct nd_pdsm_cmd_pkg *pkg)
@@ -664,6 +752,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 	case PAPR_SCM_PDSM_FETCH_PERF_STATS:
 		return papr_scm_fetch_perf_stats(p, call_pkg);
 
+	case PAPR_SCM_PDSM_READ_PERF_STATS:
+		return papr_scm_read_perf_stats(p, call_pkg);
+
 	default:
 		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
 			call_pkg->hdr.nd_command);
-- 
2.26.2

* [RFC PATCH 4/4] powerpc/papr_scm: Add support for PDSM GET_PERF_STAT
  2020-05-18 11:08 [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Vaibhav Jain
                   ` (2 preceding siblings ...)
  2020-05-18 11:08 ` [RFC PATCH 3/4] powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_READ_PERF_STATS Vaibhav Jain
@ 2020-05-18 11:08 ` Vaibhav Jain
  2020-10-21 16:52 ` [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Michal Suchánek
  4 siblings, 0 replies; 6+ messages in thread
From: Vaibhav Jain @ 2020-05-18 11:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm
  Cc: Vaibhav Jain, Aneesh Kumar K . V, Michael Ellerman

This patch adds support for retrieving a single NVDIMM performance
stat from PHYP via the PDSM GET_PERF_STAT. A new uapi 'struct
nd_pdsm_get_perf_stat' is introduced that holds a single performance
stat and is populated by the newly introduced papr_scm_get_perf_stat(),
which issues an H_SCM_PERFORMANCE_STATS hcall to PHYP.
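
Since 'stat.id' is a 64-bit field that the driver copies verbatim into
the 8-byte 'statistic_id', a caller has to pack the ASCII stat name
into it byte for byte. Here is a minimal sketch of that packing. The
helper make_stat_id() is hypothetical, and padding short names with
trailing spaces is an assumption based on identifiers such as
"MemLife " in the sysfs ABI documentation from patch 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STAT_ID_LEN 8

/*
 * Build the 64-bit 'id' for a stat name of up to 8 characters.
 * Shorter names are padded with trailing spaces. Returns 0 on success,
 * -1 if the name is empty or too long.
 */
static int make_stat_id(const char *name, uint64_t *id)
{
	char raw[STAT_ID_LEN];
	size_t n;

	assert(name && id);
	n = strlen(name);
	if (n == 0 || n > STAT_ID_LEN)
		return -1;

	memset(raw, ' ', sizeof(raw));
	memcpy(raw, name, n);
	/* Copy bytes verbatim so the in-memory order matches the ASCII name. */
	memcpy(id, raw, sizeof(raw));
	return 0;
}
```

Using memcpy() rather than integer shifts keeps the byte order of the
id independent of host endianness.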

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 12 +++
 arch/powerpc/platforms/pseries/papr_scm.c     | 74 +++++++++++++++++++
 2 files changed, 86 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
index 2db4ffdff285..473c4bbddb2f 100644
--- a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
@@ -116,6 +116,7 @@ enum papr_scm_pdsm {
 	PAPR_SCM_PDSM_HEALTH,
 	PAPR_SCM_PDSM_FETCH_PERF_STATS,
 	PAPR_SCM_PDSM_READ_PERF_STATS,
+	PAPR_SCM_PDSM_GET_PERF_STAT,
 	PAPR_SCM_PDSM_MAX,
 };
 
@@ -218,4 +219,15 @@ struct nd_pdsm_read_perf_stats_v1 {
 #define nd_pdsm_read_perf_stats nd_pdsm_read_perf_stats_v1
 #define ND_PDSM_READ_PERF_STATS_VERSION 1
 
+/*
+ * Fetch the value of a single nvdimm performance stat, the id of which
+ * is stored in 'stat.id'.
+ */
+struct nd_pdsm_get_perf_stat_v1 {
+	struct nd_pdsm_perf_stat stat;
+} __packed;
+
+#define nd_pdsm_get_perf_stat nd_pdsm_get_perf_stat_v1
+#define ND_PDSM_GET_PERF_STAT_VERSION 1
+
 #endif /* _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 06744d7fe727..284d04f0a094 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -525,6 +525,77 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 	return 0;
 }
 
+static int papr_scm_get_perf_stat(struct papr_scm_priv *p,
+				  struct nd_pdsm_cmd_pkg *pkg)
+{
+	int rc;
+	struct nd_pdsm_get_perf_stat *stat =
+		(struct nd_pdsm_get_perf_stat *)pdsm_cmd_to_payload(pkg);
+	const size_t copysize = sizeof(struct nd_pdsm_get_perf_stat);
+	struct papr_scm_perf_stats *stats_req;
+	ssize_t stat_size;
+
+	/*
+	 * If the requested payload version is greater than one we know
+	 * about, return the payload version we know about and let
+	 * caller/userspace handle.
+	 */
+	if (pkg->payload_version > ND_PDSM_GET_PERF_STAT_VERSION)
+		pkg->payload_version = ND_PDSM_GET_PERF_STAT_VERSION;
+
+	if (pkg->hdr.nd_size_out < copysize) {
+		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
+			pkg->hdr.nd_size_out, copysize);
+		rc = -ENOSPC;
+		goto out;
+	}
+
+	if (!READ_ONCE(p->len_stat_buffer)) {
+		dev_dbg(&p->pdev->dev, "Perf stat: req for unsupported device");
+		rc = -ENOENT;
+		goto out;
+	}
+
+	/* Allocate and setup a PERFORMANCE_STATS request buffer */
+	stat_size = sizeof(struct papr_scm_perf_stats) +
+		sizeof(struct papr_scm_perf_stat);
+	stats_req = kzalloc(stat_size, GFP_KERNEL);
+	if (!stats_req) {
+		dev_err(&p->pdev->dev, "Perf stat: Unable to allocate memory\n");
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	/* Copy the single request statistic_id into the request buffer */
+	memcpy(&stats_req->scm_statistics[0].statistic_id, &stat->stat.id,
+	       sizeof(stats_req->scm_statistics[0].statistic_id));
+
+	/* Fetch the stat from PHYP */
+	rc = drc_pmem_query_stats(p, stats_req, stat_size, 1, NULL);
+	if (rc)
+		goto out_free;
+
+	/* Copy the value of stat to the return payload */
+	memcpy(&stat->stat.id, &stats_req->scm_statistics[0].statistic_id,
+	       sizeof(stat->stat.id));
+	stat->stat.val = stats_req->scm_statistics[0].statistic_value;
+
+	pkg->hdr.nd_fw_size = copysize;
+	dev_dbg(&p->pdev->dev, "Copying payload size=%u version=0x%x\n",
+		pkg->hdr.nd_fw_size, pkg->payload_version);
+out_free:
+	kfree(stats_req);
+out:
+	/*
+	 * Put the error in out package and return success from function
+	 * so that errors if any are propagated back to userspace.
+	 */
+	pkg->cmd_status = rc;
+	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
+
+	return 0;
+}
+
 /*
  * Read the contents of dimm performance statistics buffer at the given
  * 'in_offset' and copy 'in_length' number of bytes to the pkg payload.
@@ -755,6 +826,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 	case PAPR_SCM_PDSM_READ_PERF_STATS:
 		return papr_scm_read_perf_stats(p, call_pkg);
 
+	case PAPR_SCM_PDSM_GET_PERF_STAT:
+		return papr_scm_get_perf_stat(p, call_pkg);
+
 	default:
 		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
 			call_pkg->hdr.nd_command);
-- 
2.26.2
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
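
A userspace sketch of the single-statistic request path in the hunk above,
for readers following along without the full tree. The struct layouts, the
"LifeRem " statistic id and mock_query_stats() are assumptions made for
illustration; they stand in for struct papr_scm_perf_stats, the real
statistic ids and drc_pmem_query_stats(), and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; real layouts may differ. */
struct perf_stat {
	uint8_t  statistic_id[8];
	uint64_t statistic_value;
};

struct perf_stats {
	uint8_t  eye_catcher[8];
	uint32_t stats_version;
	uint32_t num_statistics;
	struct perf_stat scm_statistics[];	/* flexible array member */
};

/* Hypothetical reply slot: the status travels inside the package. */
struct stat_reply {
	int32_t  cmd_status;
	uint8_t  id[8];
	uint64_t val;
};

/* Mock backend (assumption): knows only the made-up id "LifeRem ". */
static int mock_query_stats(struct perf_stats *req, size_t size, uint32_t num)
{
	/* The caller must have sized the buffer for header plus 'num' entries. */
	assert(size >= sizeof(*req) + num * sizeof(req->scm_statistics[0]));

	for (uint32_t i = 0; i < num; i++) {
		if (memcmp(req->scm_statistics[i].statistic_id, "LifeRem ", 8))
			return -ENOENT;
		req->scm_statistics[i].statistic_value = 100;
	}
	req->num_statistics = num;
	return 0;
}

/* Always returns 0; the real outcome is reported in reply->cmd_status. */
int get_perf_stat(struct stat_reply *reply)
{
	struct perf_stats *req;
	/* Header plus exactly one statistic entry. */
	size_t size = sizeof(*req) + sizeof(req->scm_statistics[0]);
	int rc;

	req = calloc(1, size);
	if (!req) {
		rc = -ENOMEM;
		goto out;
	}

	/* Copy the requested id into the first (and only) request slot. */
	memcpy(req->scm_statistics[0].statistic_id, reply->id,
	       sizeof(reply->id));

	rc = mock_query_stats(req, size, 1);
	if (rc)
		goto out;

	reply->val = req->scm_statistics[0].statistic_value;
out:
	free(req);		/* free(NULL) is a no-op */
	reply->cmd_status = rc;
	return 0;
}
```

A caller fills reply->id, calls get_perf_stat() and then checks
reply->cmd_status rather than the return value. The sketch releases the
request buffer at the common exit label; the hunk above does not show a
matching kfree() for stats_req.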


* Re: [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics
  2020-05-18 11:08 [RFC PATCH 0/4] powerpc/papr_scm: Add support for reporting NVDIMM performance statistics Vaibhav Jain
                   ` (3 preceding siblings ...)
  2020-05-18 11:08 ` [RFC PATCH 4/4] powerpc/papr_scm: Add support for PDSM GET_PERF_STAT Vaibhav Jain
@ 2020-10-21 16:52 ` Michal Suchánek
  4 siblings, 0 replies; 6+ messages in thread
From: Michal Suchánek @ 2020-10-21 16:52 UTC (permalink / raw)
  To: Vaibhav Jain; +Cc: linuxppc-dev, linux-nvdimm, Aneesh Kumar K . V

Hello,

apparently this has not received any (public) comments.

Maybe resend without the RFC status?

Clearly the kernel interface must be defined first, and then ndctl can
follow and make use of it.

Thanks

Michal

On Mon, May 18, 2020 at 04:38:10PM +0530, Vaibhav Jain wrote:
> The patch-set proposes to add support for fetching and reporting
> performance statistics for PAPR compliant NVDIMMs as described in
> documentation for H_SCM_PERFORMANCE_STATS hcall Ref[1]. The patch-set
> also implements mechanisms to expose NVDIMM performance stats via
> sysfs and newly introduced PDSMs[2] for libndctl.
> 
> This patch-set, combined with the corresponding ndctl and libndctl changes
> proposed at Ref[3], should enable users to fetch performance stats for PAPR
> compliant NVDIMMs using the following command:
> 
>  # ndctl list -D --stats
> [
>   {
>     "dev":"nmem0",
>     "stats":{
>       "Controller Reset Count":2,
>       "Controller Reset Elapsed Time":603331,
>       "Power-on Seconds":603931,
>       "Life Remaining":"100%",
>       "Critical Resource Utilization":"0%",
>       "Host Load Count":5781028,
>       "Host Store Count":8966800,
>       "Host Load Duration":975895365,
>       "Host Store Duration":716230690,
>       "Media Read Count":0,
>       "Media Write Count":6313,
>       "Media Read Duration":0,
>       "Media Write Duration":9679615,
>       "Cache Read Hit Count":5781028,
>       "Cache Write Hit Count":8442479,
>       "Fast Write Count":8969912
>     }
>   }
> ]
> 
> The patchset is dependent on existing patch-set "[PATCH v7 0/5]
> powerpc/papr_scm: Add support for reporting nvdimm health" available
> at Ref[2] that adds support for reporting PAPR compliant NVDIMMs in
> 'papr_scm' kernel module.
> 
> Structure of the patch-set
> ==========================
> 
> The patch-set starts by implementing functionality in the papr_scm
> module to issue the H_SCM_PERFORMANCE_STATS hcall, fetch and parse dimm
> performance stats, and expose them as a PAPR specific libnvdimm
> attribute named 'perf_stats'.
> 
> Patch-2 introduces a new PDSM named FETCH_PERF_STATS that can be
> issued by libndctl asking papr_scm to issue the
> H_SCM_PERFORMANCE_STATS hcall using helpers introduced earlier and
> storing the results in a dimm specific perf-stats-buffer.
> 
> Patch-3 introduces a new PDSM named READ_PERF_STATS that can be
> issued by libndctl to read the perf-stats-buffer in an incremental
> manner to work around the 256-byte envelope limitation of libnvdimm.
> 
> Finally Patch-4 introduces a new PDSM named GET_PERF_STAT that can be
> issued by libndctl to read values of a specific NVDIMM performance
> stat like "Life Remaining".
> 
> References
> ==========
> [1] Documentation/powerpc/papr_hcalls.rst
> 
> [2] https://lore.kernel.org/linux-nvdimm/20200508104922.72565-1-vaibhav@linux.ibm.com/
> 
> [3] https://github.com/vaibhav92/ndctl/tree/papr_scm_stats_v1
> 
> Vaibhav Jain (4):
>   powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
>   powerpc/papr_scm: Add support for PAPR_SCM_PDSM_FETCH_PERF_STATS
>   powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_READ_PERF_STATS
>   powerpc/papr_scm: Add support for PDSM GET_PERF_STAT
> 
>  Documentation/ABI/testing/sysfs-bus-papr-scm  |  27 ++
>  arch/powerpc/include/uapi/asm/papr_scm_pdsm.h |  60 +++
>  arch/powerpc/platforms/pseries/papr_scm.c     | 391 ++++++++++++++++++
>  3 files changed, 478 insertions(+)
> 
> -- 
> 2.26.2
> 
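
For reference, the incremental read that Patch-3 describes in the quoted
cover letter can be sketched in userspace as below. This is only an
illustration: ENVELOPE_MAX, read_chunk() and read_all() are hypothetical
names, and the usable payload per request in libnvdimm may well be smaller
than the full 256 bytes assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENVELOPE_MAX 256	/* limit quoted in the cover letter */

/*
 * Hypothetical per-request read: copy up to 'in_length' bytes starting at
 * 'in_offset' from the stats buffer into 'dst'. Returns the bytes copied.
 */
static size_t read_chunk(const unsigned char *src, size_t src_len,
			 size_t in_offset, size_t in_length,
			 unsigned char *dst)
{
	/* A single request can never carry more than one envelope. */
	assert(in_length <= ENVELOPE_MAX);

	if (in_offset >= src_len)
		return 0;
	if (in_length > src_len - in_offset)
		in_length = src_len - in_offset;

	memcpy(dst, src + in_offset, in_length);
	return in_length;
}

/* Drain the stats buffer one envelope at a time; returns total bytes read. */
size_t read_all(const unsigned char *src, size_t src_len,
		unsigned char *out, size_t out_len)
{
	unsigned char envelope[ENVELOPE_MAX];
	size_t off = 0;

	while (off < src_len && off < out_len) {
		size_t want = out_len - off;
		size_t got;

		if (want > ENVELOPE_MAX)
			want = ENVELOPE_MAX;

		got = read_chunk(src, src_len, off, want, envelope);
		if (!got)
			break;

		memcpy(out + off, envelope, got);
		off += got;
	}
	return off;
}
```

The caller advances the offset by whatever each request returned, so a
short final chunk ends the loop naturally.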

