linux-nvdimm.lists.01.org archive mirror
 help / color / mirror / Atom feed
* [ndctl PATCH] papr: Check for command type in papr_xlat_firmware_status()
@ 2020-07-21 11:43 Vaibhav Jain
  2020-07-21 18:27 ` Vishal Verma
  0 siblings, 1 reply; 4+ messages in thread
From: Vaibhav Jain @ 2020-07-21 11:43 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Vaibhav Jain, Aneesh Kumar K . V

We recently discovered intermittent failures while reading label-area
of PAPR-NVDimms and the command 'read-labels' would in such a case
generated empty output like below:

$ sudo ndctl read-labels -j nmem0
[
]
read 0 nmem

Upon investigation we found that this was caused by a spurious error
code returned from ndctl_cmd_submit_xlat() when its called from
ndctl_dimm_read_label_extent() while trying to read the label-area
contents of the NVDIMM.

Digging further it was relieved that ndctl_cmd_submit_xlat() would
always call papr_xlat_firmware_status() via pointer
'papr_dimm_ops->xlat_firmware_status' to retrieve translated firmware
status for all ndctl_cmds even though they arent really PAPR PDSM
commands.

In this case ndctl_cmd->type == ND_CMD_GET_CONFIG_DATA and was
represented by type 'struct nd_cmd_get_config_data_hdr' and
papr_xlat_firmware_status() incorrectly assumed it to be of type
'struct nd_pkg_pdsm' and wrongly dereferenced it returning an invalid
value.

A proper fix for this would probably need introducing a new ndctl_cmd
callback like 'ndctl_cmd.get_xlat_firmware_status' similar to one
introduced in [1]. However such a change could be disruptive, hence
the patch introduces a small workaround in papr_xlat_firmware_status()
that checks if the 'struct ndctl_cmd *' provided if of correct type
CMD_CALL and if not then it ignores it and return '0'

References:
[1]: commit fa754dd8acdb ("ndctl/dimm: Rework dimm command status
reporting")

Fixes: 151d2876c49e ("papr: Add scaffolding to issue and handle PDSM requests")
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 ndctl/lib/papr.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/ndctl/lib/papr.c b/ndctl/lib/papr.c
index d9ce253369b3..63561f8f9797 100644
--- a/ndctl/lib/papr.c
+++ b/ndctl/lib/papr.c
@@ -56,9 +56,7 @@ static u32 papr_get_firmware_status(struct ndctl_cmd *cmd)
 
 static int papr_xlat_firmware_status(struct ndctl_cmd *cmd)
 {
-	const struct nd_pkg_pdsm *pcmd = to_pdsm(cmd);
-
-	return pcmd->cmd_status;
+	return (cmd->type == ND_CMD_CALL) ? to_pdsm(cmd)->cmd_status : 0;
 }
 
 /* Verify if the given command is supported and valid */
-- 
2.26.2
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [ndctl PATCH] papr: Check for command type in papr_xlat_firmware_status()
  2020-07-21 11:43 [ndctl PATCH] papr: Check for command type in papr_xlat_firmware_status() Vaibhav Jain
@ 2020-07-21 18:27 ` Vishal Verma
  2020-07-22  1:59   ` Vaibhav Jain
  0 siblings, 1 reply; 4+ messages in thread
From: Vishal Verma @ 2020-07-21 18:27 UTC (permalink / raw)
  To: Vaibhav Jain, linux-nvdimm; +Cc: Aneesh Kumar K . V

On Tue, 2020-07-21 at 17:13 +0530, Vaibhav Jain wrote:
> We recently discovered intermittent failures while reading label-area
> of PAPR-NVDimms and the command 'read-labels' would in such a case
> generated empty output like below:
> 
> $ sudo ndctl read-labels -j nmem0
> [
> ]
> read 0 nmem
> 
> Upon investigation we found that this was caused by a spurious error
> code returned from ndctl_cmd_submit_xlat() when its called from
> ndctl_dimm_read_label_extent() while trying to read the label-area
> contents of the NVDIMM.
> 
> Digging further it was relieved that ndctl_cmd_submit_xlat() would
> always call papr_xlat_firmware_status() via pointer
> 'papr_dimm_ops->xlat_firmware_status' to retrieve translated firmware
> status for all ndctl_cmds even though they arent really PAPR PDSM
> commands.
> 
> In this case ndctl_cmd->type == ND_CMD_GET_CONFIG_DATA and was
> represented by type 'struct nd_cmd_get_config_data_hdr' and
> papr_xlat_firmware_status() incorrectly assumed it to be of type
> 'struct nd_pkg_pdsm' and wrongly dereferenced it returning an invalid
> value.
> 
> A proper fix for this would probably need introducing a new ndctl_cmd
> callback like 'ndctl_cmd.get_xlat_firmware_status' similar to one
> introduced in [1]. However such a change could be disruptive, hence
> the patch introduces a small workaround in papr_xlat_firmware_status()
> that checks if the 'struct ndctl_cmd *' provided if of correct type
> CMD_CALL and if not then it ignores it and return '0'
> 
> References:
> [1]: commit fa754dd8acdb ("ndctl/dimm: Rework dimm command status
> reporting")
> 
> Fixes: 151d2876c49e ("papr: Add scaffolding to issue and handle PDSM requests")
> Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>  ndctl/lib/papr.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/ndctl/lib/papr.c b/ndctl/lib/papr.c
> index d9ce253369b3..63561f8f9797 100644
> --- a/ndctl/lib/papr.c
> +++ b/ndctl/lib/papr.c
> @@ -56,9 +56,7 @@ static u32 papr_get_firmware_status(struct ndctl_cmd *cmd)
>  
>  static int papr_xlat_firmware_status(struct ndctl_cmd *cmd)
>  {
> -	const struct nd_pkg_pdsm *pcmd = to_pdsm(cmd);
> -
> -	return pcmd->cmd_status;
> +	return (cmd->type == ND_CMD_CALL) ? to_pdsm(cmd)->cmd_status : 0;

Is this correct? -- for non ND_CMD_CALL commands this will always return a 0,
and it seems like you will lose any error state for commands
like ND_CMD_SET_CONFIG_DATA.

>  }
>  
>  /* Verify if the given command is supported and valid */

_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [ndctl PATCH] papr: Check for command type in papr_xlat_firmware_status()
  2020-07-21 18:27 ` Vishal Verma
@ 2020-07-22  1:59   ` Vaibhav Jain
  2020-07-22  4:45     ` Verma, Vishal L
  0 siblings, 1 reply; 4+ messages in thread
From: Vaibhav Jain @ 2020-07-22  1:59 UTC (permalink / raw)
  To: Vishal Verma, linux-nvdimm; +Cc: Aneesh Kumar K . V

Vishal Verma <vishal.l.verma@intel.com> writes:

<snip>
>>  static int papr_xlat_firmware_status(struct ndctl_cmd *cmd)
>>  {
>> -	const struct nd_pkg_pdsm *pcmd = to_pdsm(cmd);
>> -
>> -	return pcmd->cmd_status;
>> +	return (cmd->type == ND_CMD_CALL) ? to_pdsm(cmd)->cmd_status : 0;
>
> Is this correct? -- for non ND_CMD_CALL commands this will always return a 0,
> and it seems like you will lose any error state for commands
> like ND_CMD_SET_CONFIG_DATA.
This behaviour is similar to what ndctl_cmd_xlat_firmware_status() does
when corresponding dimm-ops are missing the 'xlat_firmware_status'
callback.

Also ndctl_cmd_submit_xlat() returns the rc from ndctl_cmd_submit()
in case ndctl_cmd_xlat_firmware_status() returns '0', which corresponds
to 'ndctl_cmd.status' field. So any error codes
returned from ndctl_cmd_submit() are still returned back to the caller
even though papr_xlat_firmware_status() returned '0'. 
>
>>  }
>>  
>>  /* Verify if the given command is supported and valid */
>
>

-- 
Cheers
~ Vaibhav
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [ndctl PATCH] papr: Check for command type in papr_xlat_firmware_status()
  2020-07-22  1:59   ` Vaibhav Jain
@ 2020-07-22  4:45     ` Verma, Vishal L
  0 siblings, 0 replies; 4+ messages in thread
From: Verma, Vishal L @ 2020-07-22  4:45 UTC (permalink / raw)
  To: linux-nvdimm, vaibhav; +Cc: aneesh.kumar

On Wed, 2020-07-22 at 07:29 +0530, Vaibhav Jain wrote:
> Vishal Verma <vishal.l.verma@intel.com> writes:
> 
> <snip>
> > >  static int papr_xlat_firmware_status(struct ndctl_cmd *cmd)
> > >  {
> > > -	const struct nd_pkg_pdsm *pcmd = to_pdsm(cmd);
> > > -
> > > -	return pcmd->cmd_status;
> > > +	return (cmd->type == ND_CMD_CALL) ? to_pdsm(cmd)->cmd_status :
> > > 0;
> > 
> > Is this correct? -- for non ND_CMD_CALL commands this will always
> > return a 0,
> > and it seems like you will lose any error state for commands
> > like ND_CMD_SET_CONFIG_DATA.
> This behaviour is similar to what ndctl_cmd_xlat_firmware_status()
> does
> when corresponding dimm-ops are missing the 'xlat_firmware_status'
> callback.
> 
> Also ndctl_cmd_submit_xlat() returns the rc from ndctl_cmd_submit()
> in case ndctl_cmd_xlat_firmware_status() returns '0', which
> corresponds
> to 'ndctl_cmd.status' field. So any error codes
> returned from ndctl_cmd_submit() are still returned back to the caller
> even though papr_xlat_firmware_status() returned '0'. 

Ah yes you're right. I'll queue this up for v69 and we can do the more
involved fix in the next cycle.

Thanks!
-Vishal
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-07-22  4:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-21 11:43 [ndctl PATCH] papr: Check for command type in papr_xlat_firmware_status() Vaibhav Jain
2020-07-21 18:27 ` Vishal Verma
2020-07-22  1:59   ` Vaibhav Jain
2020-07-22  4:45     ` Verma, Vishal L

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).