From: Vaibhav Jain <vaibhav@linux.ibm.com>
To: qemu-devel@nongnu.org, kvm-ppc@vger.kernel.org,
qemu-ppc@nongnu.org, david@gibson.dropbear.id.au, mst@redhat.com,
imammedo@redhat.com, xiaoguangrong.eric@gmail.com
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>,
groug@kaod.org, shivaprasadbhat@gmail.com,
aneesh.kumar@linux.ibm.com, bharata@linux.vnet.ibm.com
Subject: [RFC-PATCH] ppc/spapr: Add support for H_SCM_PERFORMANCE_STATS hcall
Date: Thu, 15 Apr 2021 13:23:43 +0530
Message-ID: <20210415075343.482855-1-vaibhav@linux.ibm.com>
Add support for the H_SCM_PERFORMANCE_STATS hcall described at [1] for
spapr nvdimms. This enables the guest to fetch performance stats[2],
such as the expected remaining life of an nvdimm ('MemLife '), and
display them to the user. Linux kernel support for fetching these
performance stats and exposing them to user-space was added via [3].
The hcall semantics mandate that each nvdimm performance stat is
uniquely identified by an 8-byte ASCII string (e.g. 'MemLife ') and
that its value be an 8-byte integer. These performance stats are
exchanged with the guest via a guest-allocated buffer called
'requestAndResultBuffer', or rr-buffer for short. This buffer contains
a header described by 'struct papr_scm_perf_stats' followed by an
array of performance stats described by 'struct papr_scm_perf_stat'.
The hypervisor is expected to validate the rr-buffer header and then,
based on the request, copy the needed performance stats into the array
of 'struct papr_scm_perf_stat' entries following the header.
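To illustrate the layout, here is a minimal guest-side sketch that
builds an rr-buffer requesting a single stat and issues the hcall.
This is hypothetical code, not part of this patch: it assumes the
Linux plpar_hcall() wrapper, the H_SCM_PERFORMANCE_STATS opcode from
the guest's hvcall.h, and guest-side copies of the two structs
introduced by this patch.

    #include <linux/types.h>
    #include <linux/slab.h>
    #include <linux/string.h>
    #include <asm/hvcall.h>   /* plpar_hcall(), H_SCM_PERFORMANCE_STATS */

    /* Fetch one stat (e.g. "MemLife ") for the nvdimm at 'drc_index' */
    static long fetch_stat(u32 drc_index, const char id[8], u64 *val)
    {
        unsigned long ret[PLPAR_HCALL_BUFSIZE];
        size_t size = sizeof(struct papr_scm_perf_stats) +
                      sizeof(struct papr_scm_perf_stat);
        struct papr_scm_perf_stats *buf = kzalloc(size, GFP_KERNEL);
        long rc;

        if (!buf)
            return -ENOMEM;

        /* Fill the rr-buffer header: eyecatcher, version, stat count */
        memcpy(buf->eye_catcher, "SCMSTATS", 8);
        buf->stats_version = cpu_to_be32(0x1);
        buf->num_statistics = cpu_to_be32(1);
        memcpy(buf->scm_statistics[0].statistic_id, id, 8);

        rc = plpar_hcall(H_SCM_PERFORMANCE_STATS, ret, drc_index,
                         __pa(buf), size);
        if (rc == H_SUCCESS)
            *val = be64_to_cpu(buf->scm_statistics[0].statistic_value);
        kfree(buf);
        return rc;
    }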
The patch proposes a new function h_scm_performance_stats() that
services the H_SCM_PERFORMANCE_STATS hcall. After verifying the
validity of the rr-buffer header via scm_perf_check_rr_buffer(), it
proceeds to fill the rr-buffer with the requested performance stats.
The value of each stat is retrieved from a per-stat accessor function,
indexed in the array 'nvdimm_perf_stats'.
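Support for more stats can be added later by implementing an accessor
and listing it in 'nvdimm_perf_stats'. For example, a hypothetical
'DimmTemp' stat (shown here only to illustrate the pattern, not part
of this patch) would look like:

    /* Hypothetical accessor returning a fixed placeholder value */
    static int perf_stat_dimmtemp(SpaprDrc *drc, uint8_t unused[8],
                                  uint64_t *val)
    {
        *val = 40; /* made-up reading; a real accessor would query state */
        return H_SUCCESS;
    }

    /* ...plus a matching entry in the nvdimm_perf_stats[] table: */
        { "DimmTemp", perf_stat_dimmtemp},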
References:
[1] "Hypercall Op-codes (hcalls)"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/powerpc/papr_hcalls.rst#n269
[2] Sysfs attribute documentation for papr_scm
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-bus-papr-pmem#n36
[3] "powerpc/papr_scm: Fetch nvdimm performance stats from PHYP"
https://lore.kernel.org/r/20200731064153.182203-2-vaibhav@linux.ibm.com
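As an aside, a guest can discover the maximum rr-buffer size up front
by passing 'size' == 0 in R5, in which case the hypervisor returns the
size in R4 instead of filling any stats. A sketch, under the same
assumptions as the guest-side example above:

    /* Hypothetical probe for the maximum rr-buffer size (returned in R4) */
    static long probe_max_rr_size(u32 drc_index, size_t *max_size)
    {
        unsigned long ret[PLPAR_HCALL_BUFSIZE];
        long rc;

        rc = plpar_hcall(H_SCM_PERFORMANCE_STATS, ret, drc_index, 0, 0);
        if (rc == H_SUCCESS)
            *max_size = ret[0];
        return rc;
    }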
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
hw/ppc/spapr_nvdimm.c | 243 +++++++++++++++++++++++++++++++++++++++++
include/hw/ppc/spapr.h | 19 +++-
2 files changed, 261 insertions(+), 1 deletion(-)
diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
index 252204e25f..4830eae4a4 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -35,6 +35,11 @@
/* SCM device is unable to persist memory contents */
#define PAPR_PMEM_UNARMED PPC_BIT(0)
+/* Maximum output buffer size needed to return all nvdimm_perf_stats */
+#define SCM_STATS_MAX_OUTPUT_BUFFER (sizeof(struct papr_scm_perf_stats) + \
+ sizeof(struct papr_scm_perf_stat) * \
+ ARRAY_SIZE(nvdimm_perf_stats))
+
bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
uint64_t size, Error **errp)
{
@@ -502,6 +507,243 @@ static target_ulong h_scm_health(PowerPCCPU *cpu, SpaprMachineState *spapr,
return H_SUCCESS;
}
+static int perf_stat_noop(SpaprDrc *drc, uint8_t unused[8], uint64_t *val)
+{
+ *val = 0;
+ return H_SUCCESS;
+}
+
+static int perf_stat_memlife(SpaprDrc *drc, uint8_t unused[8], uint64_t *val)
+{
+ /* Assume the NVDIMM has its full remaining life for now */
+ *val = 100;
+ return H_SUCCESS;
+}
+
+/*
+ * Holds all supported performance stat accessors. Each performance statistic
+ * is uniquely identified by an 8-byte ASCII string, for example 'MemLife ',
+ * which indicates, as a percentage, how much usable life of an nvdimm is
+ * remaining. 'NoopStat' is primarily used to test support for retrieving
+ * performance stats and to replace unknown stats present in the rr-buffer.
+ */
+static const struct {
+ char stat_id[8];
+ int (*stat_getval)(SpaprDrc *drc, uint8_t id[8], uint64_t *val);
+} nvdimm_perf_stats[] = {
+ { "NoopStat", perf_stat_noop},
+ { "MemLife ", perf_stat_memlife},
+};
+
+/*
+ * Given an nvdimm drc and a stat-name, return its value. If the given
+ * stat-name isn't supported, return H_PARTIAL.
+ */
+static int nvdimm_stat_getval(SpaprDrc *drc, uint8_t id[8], uint64_t *val)
+{
+ int index;
+
+ /* Lookup the stats-id in the nvdimm_perf_stats table */
+ for (index = 0; index < ARRAY_SIZE(nvdimm_perf_stats); ++index) {
+
+ if (memcmp(nvdimm_perf_stats[index].stat_id, &id[0], 8) == 0 &&
+ nvdimm_perf_stats[index].stat_getval) {
+
+ return nvdimm_perf_stats[index].stat_getval(drc, id, val);
+ }
+ }
+
+ return H_PARTIAL;
+}
+
+/*
+ * Given a request-and-result buffer header, verify its contents. Also verify
+ * that the buffer and buffer-size provided by the guest are accessible and
+ * large enough to hold the requested stats. The size of the buffer needed to
+ * hold the requested 'num_stats' stats is returned in 'size'.
+ */
+static int scm_perf_check_rr_buffer(struct papr_scm_perf_stats *header,
+ hwaddr addr, size_t *size,
+ uint32_t *num_stats)
+{
+ size_t expected_buffsize;
+
+ /* Verify the header eyecatcher and version */
+ if (memcmp(&header->eye_catcher, SCM_STATS_EYECATCHER,
+ sizeof(header->eye_catcher))) {
+ return H_BAD_DATA;
+ }
+ if (be32_to_cpu(header->stats_version) != 0x1) {
+ return H_NOT_AVAILABLE;
+ }
+
+ /* Verify that the rr-buffer has enough space */
+ *num_stats = be32_to_cpu(header->num_statistics);
+ if (*num_stats == 0) { /* Return all stats */
+ expected_buffsize = SCM_STATS_MAX_OUTPUT_BUFFER;
+ } else { /* Return a subset of stats */
+ expected_buffsize = sizeof(struct papr_scm_perf_stats) +
+ (*num_stats) * sizeof(struct papr_scm_perf_stat);
+
+ }
+
+ if (*size < expected_buffsize) {
+ return H_P3;
+ }
+ *size = expected_buffsize;
+
+ /* Verify that the rr-buffer is writable */
+ if (!address_space_access_valid(&address_space_memory, addr, *size,
+ true, MEMTXATTRS_UNSPECIFIED)) {
+ return H_PRIVILEGE;
+ }
+
+ return H_SUCCESS;
+}
+
+/*
+ * For a given DRC index (R3), return one or more performance stats of an
+ * nvdimm device in a guest-allocated request-and-result buffer (rr-buffer)
+ * (R4) of the given 'size' (R5). The rr-buffer consists of a header described
+ * by 'struct papr_scm_perf_stats' that embeds the 'stats_version' and
+ * 'num_statistics' fields, followed by an array of
+ * 'struct papr_scm_perf_stat'. Based on the request type, the hcall writes
+ * the requested performance stats into the array of
+ * 'struct papr_scm_perf_stat' entries embedded inside the rr-buffer provided
+ * by the guest.
+ * Special cases handled are:
+ * 'size' == 0 : Return the maximum possible size of the rr-buffer.
+ * 'size' != 0 && 'num_statistics' == 0 : Return all supported performance
+ * stats.
+ *
+ * If there is an error fetching a specific stat (e.g. stat-id unknown or any
+ * other error), return the first such stat-id in R4 and replace its stat
+ * entry in the rr-buffer with 'NoopStat'.
+ */
+static target_ulong h_scm_performance_stats(PowerPCCPU *cpu,
+ SpaprMachineState *spapr,
+ target_ulong opcode,
+ target_ulong *args)
+{
+ const uint32_t drc_index = args[0];
+ const hwaddr addr = args[1];
+ size_t size = args[2];
+ int index;
+ MemTxResult result;
+ uint32_t num_stats;
+ uint8_t stat_id[8];
+ unsigned long rc;
+ uint64_t stat_val, invalid_stat = 0;
+ struct papr_scm_perf_stats perfstats;
+ struct papr_scm_perf_stat *stats, *stat;
+ SpaprDrc *drc = spapr_drc_by_index(drc_index);
+
+ /* Ensure that the drc is valid, is a PMEM dimm, and is plugged in */
+ if (!drc || !drc->dev ||
+ spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PMEM) {
+ return H_PARAMETER;
+ }
+
+ /* Guest requested max buffer size for output buffer */
+ if (size == 0) {
+ args[0] = SCM_STATS_MAX_OUTPUT_BUFFER;
+ return H_SUCCESS;
+ }
+
+ /* Read and verify rr-buffer header */
+ result = address_space_read(&address_space_memory, addr,
+ MEMTXATTRS_UNSPECIFIED, &perfstats,
+ sizeof(perfstats));
+ if (result != MEMTX_OK) {
+ return H_PRIVILEGE;
+ }
+
+ /* Verify that the rr-buffer is valid */
+ rc = scm_perf_check_rr_buffer(&perfstats, addr, &size, &num_stats);
+ if (rc != H_SUCCESS) {
+ return rc;
+ }
+
+ /* Allocate enough local buffer space to hold all requested stats */
+ stats = g_malloc0(size - sizeof(struct papr_scm_perf_stats));
+
+ if (num_stats == 0) { /* Return all supported stats */
+ /* Start at index 1 to skip the 'NoopStat' entry */
+ for (index = 1; index < ARRAY_SIZE(nvdimm_perf_stats); ++index) {
+ stat = &stats[index - 1];
+ memcpy(stat_id, &nvdimm_perf_stats[index].stat_id, 8);
+ rc = nvdimm_stat_getval(drc, stat_id, &stat_val);
+
+ /* On error add noop stat to rr-buffer & save first inval stat-id */
+ if (rc != H_SUCCESS) {
+ if (!invalid_stat) {
+ memcpy(&invalid_stat, &stat_id[0], 8);
+ }
+ memcpy(&stat_id, nvdimm_perf_stats[0].stat_id, 8);
+ stat_val = 0;
+ }
+
+ memcpy(&stat->statistic_id, stat_id, 8);
+ stat->statistic_value = cpu_to_be64(stat_val);
+ }
+ /* Number of stats returned == perf_stats array size - noop-stat */
+ num_stats = ARRAY_SIZE(nvdimm_perf_stats) - 1;
+
+ } else { /* Return a subset of requested stats */
+
+ /* Copy the array of requested stats from guest memory */
+ result = address_space_read(&address_space_memory,
+ addr + sizeof(struct papr_scm_perf_stats),
+ MEMTXATTRS_UNSPECIFIED, stats,
+ size - sizeof(struct papr_scm_perf_stats));
+
+ if (result != MEMTX_OK) {
+ g_free(stats);
+ return H_PRIVILEGE;
+ }
+
+ for (index = 0; index < num_stats; ++index) {
+ stat = &stats[index];
+ memcpy(&stat_id, &stat->statistic_id, 8);
+ rc = nvdimm_stat_getval(drc, stat_id, &stat_val);
+
+ /* On error add noop stat to rr-buffer & save first inval stat-id */
+ if (rc != H_SUCCESS) {
+ if (!invalid_stat) {
+ memcpy(&invalid_stat, &stat_id[0], 8);
+ }
+ memcpy(&stat_id, nvdimm_perf_stats[0].stat_id, 8);
+ stat_val = 0;
+ }
+
+ memcpy(&stat->statistic_id, stat_id, 8);
+ stat->statistic_value = cpu_to_be64(stat_val);
+ }
+ }
+
+ /* Update and copy the local rr-buffer header and stats back to guest */
+ perfstats.num_statistics = cpu_to_be32(num_stats);
+ result = address_space_write(&address_space_memory, addr,
+ MEMTXATTRS_UNSPECIFIED, &perfstats,
+ sizeof(struct papr_scm_perf_stats));
+ if (result == MEMTX_OK) {
+ result = address_space_write(&address_space_memory,
+ addr + sizeof(struct papr_scm_perf_stats),
+ MEMTXATTRS_UNSPECIFIED, stats,
+ size - sizeof(struct papr_scm_perf_stats));
+ }
+
+ /* Clean up the stats buffer */
+ g_free(stats);
+
+ if (result != MEMTX_OK) {
+ return H_PRIVILEGE;
+ }
+
+ /* Return the first invalid stat-id, if any, in R4 */
+ args[0] = invalid_stat;
+ return invalid_stat ? H_PARTIAL : H_SUCCESS;
+}
+
static void spapr_scm_register_types(void)
{
/* qemu/scm specific hcalls */
@@ -511,6 +753,7 @@ static void spapr_scm_register_types(void)
spapr_register_hypercall(H_SCM_UNBIND_MEM, h_scm_unbind_mem);
spapr_register_hypercall(H_SCM_UNBIND_ALL, h_scm_unbind_all);
spapr_register_hypercall(H_SCM_HEALTH, h_scm_health);
+ spapr_register_hypercall(H_SCM_PERFORMANCE_STATS, h_scm_performance_stats);
}
type_init(spapr_scm_register_types)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index d2b5a9bdf9..4b71b58e00 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -326,6 +326,7 @@ struct SpaprMachineState {
#define H_P8 -61
#define H_P9 -62
#define H_OVERLAP -68
+#define H_BAD_DATA -70
#define H_UNSUPPORTED_FLAG -256
#define H_MULTI_THREADS_ACTIVE -9005
@@ -539,8 +540,9 @@ struct SpaprMachineState {
#define H_SCM_UNBIND_MEM 0x3F0
#define H_SCM_UNBIND_ALL 0x3FC
#define H_SCM_HEALTH 0x400
+#define H_SCM_PERFORMANCE_STATS 0x418
-#define MAX_HCALL_OPCODE H_SCM_HEALTH
+#define MAX_HCALL_OPCODE H_SCM_PERFORMANCE_STATS
/* The hcalls above are standardized in PAPR and implemented by pHyp
* as well.
@@ -787,6 +789,21 @@ OBJECT_DECLARE_SIMPLE_TYPE(SpaprTceTable, SPAPR_TCE_TABLE)
DECLARE_INSTANCE_CHECKER(IOMMUMemoryRegion, SPAPR_IOMMU_MEMORY_REGION,
TYPE_SPAPR_IOMMU_MEMORY_REGION)
+/* Defs and structs exchanged with guest when reporting drc perf stats */
+#define SCM_STATS_EYECATCHER "SCMSTATS"
+
+struct QEMU_PACKED papr_scm_perf_stat {
+ uint8_t statistic_id[8];
+ uint64_t statistic_value;
+};
+
+struct QEMU_PACKED papr_scm_perf_stats {
+ uint8_t eye_catcher[8]; /* Should be "SCMSTATS" */
+ uint32_t stats_version; /* Should be 0x01 */
+ uint32_t num_statistics; /* Number of stats following */
+ struct papr_scm_perf_stat scm_statistics[]; /* Performance metrics */
+};
+
struct SpaprTceTable {
DeviceState parent;
uint32_t liobn;
--
2.30.2