NVDIMM Device and Persistent Memory development
 help / color / Atom feed
* [ndctl PATCH 0/2] papr: Implement initial support for injecting smart errors
@ 2021-07-12 17:31 Vaibhav Jain
  2021-07-12 17:31 ` [ndctl PATCH 1/2] libndctl, intel: Indicate supported smart-inject types Vaibhav Jain
  2021-07-12 17:31 ` [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart Vaibhav Jain
  0 siblings, 2 replies; 5+ messages in thread
From: Vaibhav Jain @ 2021-07-12 17:31 UTC (permalink / raw)
  To: nvdimm
  Cc: Vaibhav Jain, Dan Williams, Vishal Verma, Aneesh Kumar K . V,
	Ira Weiny, Shivaprasad G Bhat

The patch series implements limited support for injecting smart errors for PAPR
NVDIMMs via ndctl-inject-smart(1) command. SMART errors are emulating in
papr_scm module as presently PAPR doesn't support injecting smart errors on an
NVDIMM. Currently support for injecting 'fatal' health state and 'dirty'
shutdown state is implemented. With the proposed ndctl patched and with
corresponding kernel patch [1] following command flow is expected:

$ sudo ndctl list -DH -d nmem0
...
      "health_state":"ok",
      "shutdown_state":"clean",
...
 # inject unsafe shutdown and fatal health error
$ sudo ndctl inject-smart nmem0 -Uf
...
      "health_state":"fatal",
      "shutdown_state":"dirty",
...
 # uninject all errors
$ sudo ndctl inject-smart nmem0 -N
...
      "health_state":"ok",
      "shutdown_state":"clean",
...

Structure of the patch series
=============================

* First patch updates 'inject-smart' code to not always assume support for
  injecting all smart-errors. It also updates 'intel.c' to explicitly indicate
  the type of smart-inject errors supported.

* Update 'papr.c' to add support for injecting smart 'fatal' health and
  'dirty-shutdown' errors.

Vaibhav Jain (2):
  libndctl, intel: Indicate supported smart-inject types
  libndctl/papr: Add limited support for inject-smart

 ndctl/inject-smart.c  | 33 ++++++++++++++++++-----
 ndctl/lib/intel.c     |  7 ++++-
 ndctl/lib/papr.c      | 61 +++++++++++++++++++++++++++++++++++++++++++
 ndctl/lib/papr_pdsm.h | 17 ++++++++++++
 ndctl/libndctl.h      |  8 ++++++
 5 files changed, 118 insertions(+), 8 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [ndctl PATCH 1/2] libndctl, intel: Indicate supported smart-inject types
  2021-07-12 17:31 [ndctl PATCH 0/2] papr: Implement initial support for injecting smart errors Vaibhav Jain
@ 2021-07-12 17:31 ` Vaibhav Jain
  2021-07-12 17:31 ` [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart Vaibhav Jain
  1 sibling, 0 replies; 5+ messages in thread
From: Vaibhav Jain @ 2021-07-12 17:31 UTC (permalink / raw)
  To: nvdimm
  Cc: Vaibhav Jain, Dan Williams, Vishal Verma, Aneesh Kumar K . V,
	Ira Weiny, Shivaprasad G Bhat

Presently the inject-smart code assumes support injecting all
smart-errors namely media-temperature, controller-temperature,
spares-remaining, fatal-health and unsafe-shutdown. This assumption
may break in case of other non-Intel NVDIMM types namely PAPR NVDIMMs
which presently only have support for injecting unsafe-shutdown and
fatal health events.

Trying to inject-smart errors on PAPR NVDIMMs causes problems as
smart_inject() prematurely exits when trying to inject
media-temperature smart-error errors out.

To fix this issue the patch proposes extending the definition of
dimm_op 'smart_inject_supported' to return bitmap of flags indicating
the type of smart-error injections supported by an NVDIMM. These types
are indicated by the newly introduced defines ND_SMART_INJECT_* . A
dimm-ops provide can return an bitmap composed of these flags back
from its implementation of 'smart_inject_supported' to indicate to
dimm_inject_smart() which type of smart-error injection it
supports. In case of an error the dimm-op is still expected to return
a negative error code back to the caller.

The patch updates intel_dimm_smart_inject_supported() to return a
bitmap composed of all ND_SMART_INJECT_* flags to indicate support for
all smart-error types.

Finally the patch also updates smart_inject() to test for specific
ND_START_INJECT_* flags before sending a smart-inject command via
dimm-provider.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 ndctl/inject-smart.c | 33 ++++++++++++++++++++++++++-------
 ndctl/lib/intel.c    |  7 ++++++-
 ndctl/libndctl.h     |  8 ++++++++
 3 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/ndctl/inject-smart.c b/ndctl/inject-smart.c
index 9077bca256e4..ef0620f55531 100644
--- a/ndctl/inject-smart.c
+++ b/ndctl/inject-smart.c
@@ -393,18 +393,26 @@ out:
 	} \
 }
 
-static int smart_inject(struct ndctl_dimm *dimm)
+static int smart_inject(struct ndctl_dimm *dimm, unsigned int inject_types)
 {
 	const char *name = ndctl_dimm_get_devname(dimm);
 	struct ndctl_cmd *si_cmd = NULL;
 	int rc = -EOPNOTSUPP;
 
-	send_inject_val(media_temperature)
-	send_inject_val(ctrl_temperature)
-	send_inject_val(spares)
-	send_inject_bool(fatal)
-	send_inject_bool(unsafe_shutdown)
+	if (inject_types & ND_SMART_INJECT_MEDIA_TEMPERATURE)
+		send_inject_val(media_temperature);
 
+	if (inject_types & ND_SMART_INJECT_CTRL_TEMPERATURE)
+		send_inject_val(ctrl_temperature);
+
+	if (inject_types & ND_SMART_INJECT_SPARES_REMAINING)
+		send_inject_val(spares);
+
+	if (inject_types & ND_SMART_INJECT_HEALTH_STATE)
+		send_inject_bool(fatal);
+
+	if (inject_types & ND_SMART_INJECT_UNCLEAN_SHUTDOWN)
+		send_inject_bool(unsafe_shutdown);
 out:
 	ndctl_cmd_unref(si_cmd);
 	return rc;
@@ -415,8 +423,10 @@ static int dimm_inject_smart(struct ndctl_dimm *dimm)
 	struct json_object *jhealth;
 	struct json_object *jdimms;
 	struct json_object *jdimm;
+	unsigned int supported_types;
 	int rc;
 
+	/* Get supported smart injection types */
 	rc = ndctl_dimm_smart_inject_supported(dimm);
 	switch (rc) {
 	case -ENOTTY:
@@ -431,6 +441,15 @@ static int dimm_inject_smart(struct ndctl_dimm *dimm)
 		error("%s: smart injection not supported by either platform firmware or the kernel.",
 			ndctl_dimm_get_devname(dimm));
 		return rc;
+	default:
+		if (rc < 0) {
+			error("%s: Unknown error %d while checking for smart injection support",
+			      ndctl_dimm_get_devname(dimm), rc);
+			return rc;
+		}
+		/* Assigning to an unsigned type since rc < 0 */
+		supported_types = rc;
+		break;
 	}
 
 	if (sctx.op_mask & (1 << OP_SET)) {
@@ -439,7 +458,7 @@ static int dimm_inject_smart(struct ndctl_dimm *dimm)
 			goto out;
 	}
 	if (sctx.op_mask & (1 << OP_INJECT)) {
-		rc = smart_inject(dimm);
+		rc = smart_inject(dimm, supported_types);
 		if (rc)
 			goto out;
 	}
diff --git a/ndctl/lib/intel.c b/ndctl/lib/intel.c
index a3df26e6bc58..1314854553d5 100644
--- a/ndctl/lib/intel.c
+++ b/ndctl/lib/intel.c
@@ -455,7 +455,12 @@ static int intel_dimm_smart_inject_supported(struct ndctl_dimm *dimm)
 		return -EIO;
 	}
 
-	return 0;
+	/* Indicate all smart injection types are supported */
+	return ND_SMART_INJECT_SPARES_REMAINING |
+		ND_SMART_INJECT_MEDIA_TEMPERATURE |
+		ND_SMART_INJECT_CTRL_TEMPERATURE |
+		ND_SMART_INJECT_HEALTH_STATE |
+		ND_SMART_INJECT_UNCLEAN_SHUTDOWN;
 }
 
 static const char *intel_cmd_desc(int fn)
diff --git a/ndctl/libndctl.h b/ndctl/libndctl.h
index 87d07b74ef8b..3a5013007038 100644
--- a/ndctl/libndctl.h
+++ b/ndctl/libndctl.h
@@ -69,6 +69,13 @@ extern "C" {
 #define ND_EVENT_HEALTH_STATE		(1 << 3)
 #define ND_EVENT_UNCLEAN_SHUTDOWN	(1 << 4)
 
+/* Flags indicating support for various smart injection types */
+#define ND_SMART_INJECT_SPARES_REMAINING	(1 << 0)
+#define ND_SMART_INJECT_MEDIA_TEMPERATURE	(1 << 1)
+#define ND_SMART_INJECT_CTRL_TEMPERATURE	(1 << 2)
+#define ND_SMART_INJECT_HEALTH_STATE		(1 << 3)
+#define ND_SMART_INJECT_UNCLEAN_SHUTDOWN	(1 << 4)
+
 size_t ndctl_min_namespace_size(void);
 size_t ndctl_sizeof_namespace_index(void);
 size_t ndctl_sizeof_namespace_label(void);
@@ -309,6 +316,7 @@ int ndctl_cmd_smart_inject_spares(struct ndctl_cmd *cmd, bool enable,
 		unsigned int spares);
 int ndctl_cmd_smart_inject_fatal(struct ndctl_cmd *cmd, bool enable);
 int ndctl_cmd_smart_inject_unsafe_shutdown(struct ndctl_cmd *cmd, bool enable);
+/* Returns a bitmap of ND_SMART_INJECT_* supported */
 int ndctl_dimm_smart_inject_supported(struct ndctl_dimm *dimm);
 
 struct ndctl_cmd *ndctl_dimm_cmd_new_vendor_specific(struct ndctl_dimm *dimm,
-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart
  2021-07-12 17:31 [ndctl PATCH 0/2] papr: Implement initial support for injecting smart errors Vaibhav Jain
  2021-07-12 17:31 ` [ndctl PATCH 1/2] libndctl, intel: Indicate supported smart-inject types Vaibhav Jain
@ 2021-07-12 17:31 ` Vaibhav Jain
  2021-07-13  4:43   ` Aneesh Kumar K.V
  1 sibling, 1 reply; 5+ messages in thread
From: Vaibhav Jain @ 2021-07-12 17:31 UTC (permalink / raw)
  To: nvdimm
  Cc: Vaibhav Jain, Dan Williams, Vishal Verma, Aneesh Kumar K . V,
	Ira Weiny, Shivaprasad G Bhat

Implements support for ndctl inject-smart command by providing an
implementation of 'smart_inject*' dimm-ops callbacks. Presently only
support for injecting unsafe-shutdown and fatal-health states is
available.

The patch also introduce various PAPR PDSM structures that are used to
communicate the inject-smart errors to the papr_scm kernel
module. This is done via SMART_INJECT PDSM which sends a payload of
type 'struct nd_papr_pdsm_smart_inject'.

The patch depends on the kernel PAPR PDSM implementation for
PDSM_SMART_INJECT posted at [1].

[1] : https://lore.kernel.org/nvdimm/20210712084819.1150350-1-vaibhav@linux.ibm.com
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 ndctl/lib/papr.c      | 61 +++++++++++++++++++++++++++++++++++++++++++
 ndctl/lib/papr_pdsm.h | 17 ++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/ndctl/lib/papr.c b/ndctl/lib/papr.c
index 42ff200dc588..b797e1e5fe8b 100644
--- a/ndctl/lib/papr.c
+++ b/ndctl/lib/papr.c
@@ -221,6 +221,41 @@ static unsigned int papr_smart_get_shutdown_state(struct ndctl_cmd *cmd)
 	return health.dimm_bad_shutdown;
 }
 
+static int papr_smart_inject_supported(struct ndctl_dimm *dimm)
+{
+	if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_CALL))
+		return -EOPNOTSUPP;
+
+	if (!test_dimm_dsm(dimm, PAPR_PDSM_SMART_INJECT))
+		return -EIO;
+
+	return ND_SMART_INJECT_HEALTH_STATE | ND_SMART_INJECT_UNCLEAN_SHUTDOWN;
+}
+
+static int papr_smart_inject_valid(struct ndctl_cmd *cmd)
+{
+	if (cmd->type != ND_CMD_CALL ||
+	    to_pdsm(cmd)->cmd_status != 0 ||
+	    to_pdsm_cmd(cmd) != PAPR_PDSM_SMART_INJECT)
+		return -EINVAL;
+
+	return 0;
+}
+
+static struct ndctl_cmd *papr_new_smart_inject(struct ndctl_dimm *dimm)
+{
+	struct ndctl_cmd *cmd;
+
+	cmd = allocate_cmd(dimm, PAPR_PDSM_SMART_INJECT,
+			sizeof(struct nd_papr_pdsm_smart_inject));
+	if (!cmd)
+		return NULL;
+	/* Set the input payload size */
+	to_ndcmd(cmd)->nd_size_in = ND_PDSM_HDR_SIZE +
+		sizeof(struct nd_papr_pdsm_smart_inject);
+	return cmd;
+}
+
 static unsigned int papr_smart_get_life_used(struct ndctl_cmd *cmd)
 {
 	struct nd_papr_pdsm_health health;
@@ -255,11 +290,37 @@ static unsigned int papr_smart_get_shutdown_count(struct ndctl_cmd *cmd)
 
 	return (health.extension_flags & PDSM_DIMM_DSC_VALID) ?
 		(health.dimm_dsc) : 0;
+}
+
+static int papr_cmd_smart_inject_fatal(struct ndctl_cmd *cmd, bool enable)
+{
+	if (papr_smart_inject_valid(cmd) < 0)
+		return -EINVAL;
+
+	to_payload(cmd)->inject.flags |= PDSM_SMART_INJECT_HEALTH_FATAL;
+	to_payload(cmd)->inject.fatal_enable = enable;
 
+	return 0;
+}
+
+static int papr_cmd_smart_inject_unsafe_shutdown(struct ndctl_cmd *cmd,
+						 bool enable)
+{
+	if (papr_smart_inject_valid(cmd) < 0)
+		return -EINVAL;
+
+	to_payload(cmd)->inject.flags |= PDSM_SMART_INJECT_BAD_SHUTDOWN;
+	to_payload(cmd)->inject.unsafe_shutdown_enable = enable;
+
+	return 0;
 }
 
 struct ndctl_dimm_ops * const papr_dimm_ops = &(struct ndctl_dimm_ops) {
 	.cmd_is_supported = papr_cmd_is_supported,
+	.new_smart_inject = papr_new_smart_inject,
+	.smart_inject_supported = papr_smart_inject_supported,
+	.smart_inject_fatal = papr_cmd_smart_inject_fatal,
+	.smart_inject_unsafe_shutdown = papr_cmd_smart_inject_unsafe_shutdown,
 	.smart_get_flags = papr_smart_get_flags,
 	.get_firmware_status =  papr_get_firmware_status,
 	.xlat_firmware_status = papr_xlat_firmware_status,
diff --git a/ndctl/lib/papr_pdsm.h b/ndctl/lib/papr_pdsm.h
index f45b1e40c075..20ac20f89acd 100644
--- a/ndctl/lib/papr_pdsm.h
+++ b/ndctl/lib/papr_pdsm.h
@@ -121,12 +121,29 @@ struct nd_papr_pdsm_health {
 enum papr_pdsm {
 	PAPR_PDSM_MIN = 0x0,
 	PAPR_PDSM_HEALTH,
+	PAPR_PDSM_SMART_INJECT,
 	PAPR_PDSM_MAX,
 };
+/* Flags for injecting specific smart errors */
+#define PDSM_SMART_INJECT_HEALTH_FATAL		(1 << 0)
+#define PDSM_SMART_INJECT_BAD_SHUTDOWN		(1 << 1)
+
+struct nd_papr_pdsm_smart_inject {
+	union {
+		struct {
+			/* One or more of PDSM_SMART_INJECT_ */
+			__u32 flags;
+			__u8 fatal_enable;
+			__u8 unsafe_shutdown_enable;
+		};
+		__u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
+	};
+};
 
 /* Maximal union that can hold all possible payload types */
 union nd_pdsm_payload {
 	struct nd_papr_pdsm_health health;
+	struct nd_papr_pdsm_smart_inject inject;
 	__u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
 } __attribute__((packed));
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart
  2021-07-12 17:31 ` [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart Vaibhav Jain
@ 2021-07-13  4:43   ` Aneesh Kumar K.V
  2021-07-13  7:51     ` Vaibhav Jain
  0 siblings, 1 reply; 5+ messages in thread
From: Aneesh Kumar K.V @ 2021-07-13  4:43 UTC (permalink / raw)
  To: Vaibhav Jain, nvdimm
  Cc: Dan Williams, Vishal Verma, Ira Weiny, Shivaprasad G Bhat

On 7/12/21 11:01 PM, Vaibhav Jain wrote:
> Implements support for ndctl inject-smart command by providing an
> implementation of 'smart_inject*' dimm-ops callbacks. Presently only
> support for injecting unsafe-shutdown and fatal-health states is
> available.
> 
> The patch also introduce various PAPR PDSM structures that are used to
> communicate the inject-smart errors to the papr_scm kernel
> module. This is done via SMART_INJECT PDSM which sends a payload of
> type 'struct nd_papr_pdsm_smart_inject'.
> 
> The patch depends on the kernel PAPR PDSM implementation for
> PDSM_SMART_INJECT posted at [1].
> 
> [1] : https://lore.kernel.org/nvdimm/20210712084819.1150350-1-vaibhav@linux.ibm.com
> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>   ndctl/lib/papr.c      | 61 +++++++++++++++++++++++++++++++++++++++++++
>   ndctl/lib/papr_pdsm.h | 17 ++++++++++++
>   2 files changed, 78 insertions(+)
> 
> diff --git a/ndctl/lib/papr.c b/ndctl/lib/papr.c
> index 42ff200dc588..b797e1e5fe8b 100644
> --- a/ndctl/lib/papr.c
> +++ b/ndctl/lib/papr.c
> @@ -221,6 +221,41 @@ static unsigned int papr_smart_get_shutdown_state(struct ndctl_cmd *cmd)
>   	return health.dimm_bad_shutdown;
>   }
>   
> +static int papr_smart_inject_supported(struct ndctl_dimm *dimm)
> +{
> +	if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_CALL))
> +		return -EOPNOTSUPP;
> +
> +	if (!test_dimm_dsm(dimm, PAPR_PDSM_SMART_INJECT))
> +		return -EIO;
> +
> +	return ND_SMART_INJECT_HEALTH_STATE | ND_SMART_INJECT_UNCLEAN_SHUTDOWN;
> +}
> +

with ndtest PAPR_SCM_FAMILY driver, should we test more inject types? if 
so should the supported inject types be fetched from the driver?

> +static int papr_smart_inject_valid(struct ndctl_cmd *cmd)
> +{
> +	if (cmd->type != ND_CMD_CALL ||
> +	    to_pdsm(cmd)->cmd_status != 0 ||
> +	    to_pdsm_cmd(cmd) != PAPR_PDSM_SMART_INJECT)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +
> +static struct ndctl_cmd *papr_new_smart_inject(struct ndctl_dimm *dimm)
> +{
> +	struct ndctl_cmd *cmd;
> +
> +	cmd = allocate_cmd(dimm, PAPR_PDSM_SMART_INJECT,
> +			sizeof(struct nd_papr_pdsm_smart_inject));
> +	if (!cmd)
> +		return NULL;
> +	/* Set the input payload size */
> +	to_ndcmd(cmd)->nd_size_in = ND_PDSM_HDR_SIZE +
> +		sizeof(struct nd_papr_pdsm_smart_inject);
> +	return cmd;
> +}
> +
>   static unsigned int papr_smart_get_life_used(struct ndctl_cmd *cmd)
>   {
>   	struct nd_papr_pdsm_health health;
> @@ -255,11 +290,37 @@ static unsigned int papr_smart_get_shutdown_count(struct ndctl_cmd *cmd)
>   
>   	return (health.extension_flags & PDSM_DIMM_DSC_VALID) ?
>   		(health.dimm_dsc) : 0;
> +}
> +
> +static int papr_cmd_smart_inject_fatal(struct ndctl_cmd *cmd, bool enable)
> +{
> +	if (papr_smart_inject_valid(cmd) < 0)
> +		return -EINVAL;
> +
> +	to_payload(cmd)->inject.flags |= PDSM_SMART_INJECT_HEALTH_FATAL;
> +	to_payload(cmd)->inject.fatal_enable = enable;
>   
> +	return 0;
> +}
> +
> +static int papr_cmd_smart_inject_unsafe_shutdown(struct ndctl_cmd *cmd,
> +						 bool enable)
> +{
> +	if (papr_smart_inject_valid(cmd) < 0)
> +		return -EINVAL;
> +
> +	to_payload(cmd)->inject.flags |= PDSM_SMART_INJECT_BAD_SHUTDOWN;
> +	to_payload(cmd)->inject.unsafe_shutdown_enable = enable;
> +
> +	return 0;
>   }
>   
>   struct ndctl_dimm_ops * const papr_dimm_ops = &(struct ndctl_dimm_ops) {
>   	.cmd_is_supported = papr_cmd_is_supported,
> +	.new_smart_inject = papr_new_smart_inject,
> +	.smart_inject_supported = papr_smart_inject_supported,
> +	.smart_inject_fatal = papr_cmd_smart_inject_fatal,
> +	.smart_inject_unsafe_shutdown = papr_cmd_smart_inject_unsafe_shutdown,
>   	.smart_get_flags = papr_smart_get_flags,
>   	.get_firmware_status =  papr_get_firmware_status,
>   	.xlat_firmware_status = papr_xlat_firmware_status,
> diff --git a/ndctl/lib/papr_pdsm.h b/ndctl/lib/papr_pdsm.h
> index f45b1e40c075..20ac20f89acd 100644
> --- a/ndctl/lib/papr_pdsm.h
> +++ b/ndctl/lib/papr_pdsm.h
> @@ -121,12 +121,29 @@ struct nd_papr_pdsm_health {
>   enum papr_pdsm {
>   	PAPR_PDSM_MIN = 0x0,
>   	PAPR_PDSM_HEALTH,
> +	PAPR_PDSM_SMART_INJECT,
>   	PAPR_PDSM_MAX,
>   };
> +/* Flags for injecting specific smart errors */
> +#define PDSM_SMART_INJECT_HEALTH_FATAL		(1 << 0)
> +#define PDSM_SMART_INJECT_BAD_SHUTDOWN		(1 << 1)
> +
> +struct nd_papr_pdsm_smart_inject {
> +	union {
> +		struct {
> +			/* One or more of PDSM_SMART_INJECT_ */
> +			__u32 flags;
> +			__u8 fatal_enable;
> +			__u8 unsafe_shutdown_enable;
> +		};
> +		__u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
> +	};
> +};
>   
>   /* Maximal union that can hold all possible payload types */
>   union nd_pdsm_payload {
>   	struct nd_papr_pdsm_health health;
> +	struct nd_papr_pdsm_smart_inject inject;
>   	__u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
>   } __attribute__((packed));
>   
> 


-aneesh


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart
  2021-07-13  4:43   ` Aneesh Kumar K.V
@ 2021-07-13  7:51     ` Vaibhav Jain
  0 siblings, 0 replies; 5+ messages in thread
From: Vaibhav Jain @ 2021-07-13  7:51 UTC (permalink / raw)
  To: Aneesh Kumar K.V, nvdimm
  Cc: Dan Williams, Vishal Verma, Ira Weiny, Shivaprasad G Bhat

Thanks Aneesh for looking into this patch

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:

> On 7/12/21 11:01 PM, Vaibhav Jain wrote:
>> Implements support for ndctl inject-smart command by providing an
>> implementation of 'smart_inject*' dimm-ops callbacks. Presently only
>> support for injecting unsafe-shutdown and fatal-health states is
>> available.
>> 
>> The patch also introduce various PAPR PDSM structures that are used to
>> communicate the inject-smart errors to the papr_scm kernel
>> module. This is done via SMART_INJECT PDSM which sends a payload of
>> type 'struct nd_papr_pdsm_smart_inject'.
>> 
>> The patch depends on the kernel PAPR PDSM implementation for
>> PDSM_SMART_INJECT posted at [1].
>> 
>> [1] : https://lore.kernel.org/nvdimm/20210712084819.1150350-1-vaibhav@linux.ibm.com
>> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>>   ndctl/lib/papr.c      | 61 +++++++++++++++++++++++++++++++++++++++++++
>>   ndctl/lib/papr_pdsm.h | 17 ++++++++++++
>>   2 files changed, 78 insertions(+)
>> 
>> diff --git a/ndctl/lib/papr.c b/ndctl/lib/papr.c
>> index 42ff200dc588..b797e1e5fe8b 100644
>> --- a/ndctl/lib/papr.c
>> +++ b/ndctl/lib/papr.c
>> @@ -221,6 +221,41 @@ static unsigned int papr_smart_get_shutdown_state(struct ndctl_cmd *cmd)
>>   	return health.dimm_bad_shutdown;
>>   }
>>   
>> +static int papr_smart_inject_supported(struct ndctl_dimm *dimm)
>> +{
>> +	if (!ndctl_dimm_is_cmd_supported(dimm, ND_CMD_CALL))
>> +		return -EOPNOTSUPP;
>> +
>> +	if (!test_dimm_dsm(dimm, PAPR_PDSM_SMART_INJECT))
>> +		return -EIO;
>> +
>> +	return ND_SMART_INJECT_HEALTH_STATE | ND_SMART_INJECT_UNCLEAN_SHUTDOWN;
>> +}
>> +
>
> with ndtest PAPR_SCM_FAMILY driver, should we test more inject types?

Presently a commmon PDSM structure 'struct nd_papr_pdsm_smart_inject'
used between ndtest and papr_scm. If we want to add support for more
inject types in ndtest then that structure would need to be extended.

However even with that, libndctl still shares common dimm-ops
callback for papr_scm & ndtest which only supports injecting smart fatal
health and dirty-shutdown at the moment. So with only ndtest supporting an inject
type for example temprature-threshold, not sure which libndctl code
patch we will be testing.

> if 
> so should the supported inject types be fetched from the driver?
>
Good suggestion.
Surely that can be a implemented in future once papr_scm and ndtest
starts supporting more smart inject types.

>> +static int papr_smart_inject_valid(struct ndctl_cmd *cmd)
>> +{
>> +	if (cmd->type != ND_CMD_CALL ||
>> +	    to_pdsm(cmd)->cmd_status != 0 ||
>> +	    to_pdsm_cmd(cmd) != PAPR_PDSM_SMART_INJECT)
>> +		return -EINVAL;
>> +
>> +	return 0;
>> +}
>> +
>> +static struct ndctl_cmd *papr_new_smart_inject(struct ndctl_dimm *dimm)
>> +{
>> +	struct ndctl_cmd *cmd;
>> +
>> +	cmd = allocate_cmd(dimm, PAPR_PDSM_SMART_INJECT,
>> +			sizeof(struct nd_papr_pdsm_smart_inject));
>> +	if (!cmd)
>> +		return NULL;
>> +	/* Set the input payload size */
>> +	to_ndcmd(cmd)->nd_size_in = ND_PDSM_HDR_SIZE +
>> +		sizeof(struct nd_papr_pdsm_smart_inject);
>> +	return cmd;
>> +}
>> +
>>   static unsigned int papr_smart_get_life_used(struct ndctl_cmd *cmd)
>>   {
>>   	struct nd_papr_pdsm_health health;
>> @@ -255,11 +290,37 @@ static unsigned int papr_smart_get_shutdown_count(struct ndctl_cmd *cmd)
>>   
>>   	return (health.extension_flags & PDSM_DIMM_DSC_VALID) ?
>>   		(health.dimm_dsc) : 0;
>> +}
>> +
>> +static int papr_cmd_smart_inject_fatal(struct ndctl_cmd *cmd, bool enable)
>> +{
>> +	if (papr_smart_inject_valid(cmd) < 0)
>> +		return -EINVAL;
>> +
>> +	to_payload(cmd)->inject.flags |= PDSM_SMART_INJECT_HEALTH_FATAL;
>> +	to_payload(cmd)->inject.fatal_enable = enable;
>>   
>> +	return 0;
>> +}
>> +
>> +static int papr_cmd_smart_inject_unsafe_shutdown(struct ndctl_cmd *cmd,
>> +						 bool enable)
>> +{
>> +	if (papr_smart_inject_valid(cmd) < 0)
>> +		return -EINVAL;
>> +
>> +	to_payload(cmd)->inject.flags |= PDSM_SMART_INJECT_BAD_SHUTDOWN;
>> +	to_payload(cmd)->inject.unsafe_shutdown_enable = enable;
>> +
>> +	return 0;
>>   }
>>   
>>   struct ndctl_dimm_ops * const papr_dimm_ops = &(struct ndctl_dimm_ops) {
>>   	.cmd_is_supported = papr_cmd_is_supported,
>> +	.new_smart_inject = papr_new_smart_inject,
>> +	.smart_inject_supported = papr_smart_inject_supported,
>> +	.smart_inject_fatal = papr_cmd_smart_inject_fatal,
>> +	.smart_inject_unsafe_shutdown = papr_cmd_smart_inject_unsafe_shutdown,
>>   	.smart_get_flags = papr_smart_get_flags,
>>   	.get_firmware_status =  papr_get_firmware_status,
>>   	.xlat_firmware_status = papr_xlat_firmware_status,
>> diff --git a/ndctl/lib/papr_pdsm.h b/ndctl/lib/papr_pdsm.h
>> index f45b1e40c075..20ac20f89acd 100644
>> --- a/ndctl/lib/papr_pdsm.h
>> +++ b/ndctl/lib/papr_pdsm.h
>> @@ -121,12 +121,29 @@ struct nd_papr_pdsm_health {
>>   enum papr_pdsm {
>>   	PAPR_PDSM_MIN = 0x0,
>>   	PAPR_PDSM_HEALTH,
>> +	PAPR_PDSM_SMART_INJECT,
>>   	PAPR_PDSM_MAX,
>>   };
>> +/* Flags for injecting specific smart errors */
>> +#define PDSM_SMART_INJECT_HEALTH_FATAL		(1 << 0)
>> +#define PDSM_SMART_INJECT_BAD_SHUTDOWN		(1 << 1)
>> +
>> +struct nd_papr_pdsm_smart_inject {
>> +	union {
>> +		struct {
>> +			/* One or more of PDSM_SMART_INJECT_ */
>> +			__u32 flags;
>> +			__u8 fatal_enable;
>> +			__u8 unsafe_shutdown_enable;
>> +		};
>> +		__u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
>> +	};
>> +};
>>   
>>   /* Maximal union that can hold all possible payload types */
>>   union nd_pdsm_payload {
>>   	struct nd_papr_pdsm_health health;
>> +	struct nd_papr_pdsm_smart_inject inject;
>>   	__u8 buf[ND_PDSM_PAYLOAD_MAX_SIZE];
>>   } __attribute__((packed));
>>   
>> 
>
>
> -aneesh
>
>

-- 
Cheers
~ Vaibhav

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12 17:31 [ndctl PATCH 0/2] papr: Implement initial support for injecting smart errors Vaibhav Jain
2021-07-12 17:31 ` [ndctl PATCH 1/2] libndctl, intel: Indicate supported smart-inject types Vaibhav Jain
2021-07-12 17:31 ` [ndctl PATCH 2/2] libndctl/papr: Add limited support for inject-smart Vaibhav Jain
2021-07-13  4:43   ` Aneesh Kumar K.V
2021-07-13  7:51     ` Vaibhav Jain

NVDIMM Device and Persistent Memory development

Archives are clonable:
	git clone --mirror https://lore.kernel.org/nvdimm/0 nvdimm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 nvdimm nvdimm/ https://lore.kernel.org/nvdimm \
		nvdimm@lists.linux.dev
	public-inbox-index nvdimm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/dev.linux.lists.nvdimm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git