All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code
@ 2022-03-17 23:40 Davidlohr Bueso
  2022-03-17 23:40 ` [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment Davidlohr Bueso
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-17 23:40 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ben.widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, dave

Hello,

These are some patchlets that came up while going through the code.

Currently the return_code from a completed mbox command is handled as
either successful or not. This series teaches the driver to better deal
with the different returns from the hardware, allowing better debugging
and mapping to proper kernel errno semantics (which are left unchanged
for now) as well as more ad-hoc handling.

Patches 1 and 2 are small nits.
Patch 3, 4 and 5 implement and use the new calls.

Applies on top of linux-cxl's next branch but have not really been
very tested as I'm not sure how to actually send mbox commands but
thought I'd post regardless.

Davidlohr Bueso (5):
  cxl/mbox: Move mbox_mutex usage comment
  cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code
  cxl/mbox: Improve handling of mbox_cmd return codes
  cxl/mbox: Use new return_code handling
  cxl/mbox: Retry sending mbox command upon RETRY return_code

 drivers/cxl/core/mbox.c | 23 +++++++----
 drivers/cxl/cxlmem.h    | 90 ++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/pci.c       |  7 ++--
 3 files changed, 108 insertions(+), 12 deletions(-)

--
2.26.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment
  2022-03-17 23:40 [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code Davidlohr Bueso
@ 2022-03-17 23:40 ` Davidlohr Bueso
       [not found]   ` <CGME20220322192819uscas1p295d53f4302435e98c58de1226fb2a1db@uscas1p2.samsung.com>
  2022-03-29 22:40   ` Dan Williams
  2022-03-17 23:40 ` [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code Davidlohr Bueso
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-17 23:40 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ben.widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, dave

... this is better served in the callback that actually
grabs the lock.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/cxl/core/mbox.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index be61a0d8016b..778b04a0fb0a 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -136,7 +136,7 @@ static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
  * @out: Caller allocated buffer for the output.
  * @out_size: Expected size of output.
  *
- * Context: Any context. Will acquire and release mbox_mutex.
+ * Context: Any context.
  * Return:
  *  * %>=0	- Number of bytes returned in @out.
  *  * %-E2BIG	- Payload is too large for hardware.
@@ -165,6 +165,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
 	if (out_size > cxlds->payload_size)
 		return -E2BIG;
 
+	/* acquire and releases the mbox_mutex */
 	rc = cxlds->mbox_send(cxlds, &mbox_cmd);
 	if (rc)
 		return rc;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code
  2022-03-17 23:40 [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code Davidlohr Bueso
  2022-03-17 23:40 ` [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment Davidlohr Bueso
@ 2022-03-17 23:40 ` Davidlohr Bueso
       [not found]   ` <CGME20220322203027uscas1p1b141b932ca4659a7b2cee1faf9dddf50@uscas1p1.samsung.com>
  2022-03-29 23:42   ` Dan Williams
  2022-03-17 23:40 ` [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes Davidlohr Bueso
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-17 23:40 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ben.widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, dave

Also mention the need for the caller to check against any
errors from the hardware in return_code.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/cxl/pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 8a7267d116b7..c77e06aff8dc 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -177,9 +177,9 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
 	mbox_cmd->return_code =
 		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
 
-	if (mbox_cmd->return_code != 0) {
+	if (mbox_cmd->return_code != CXL_MBOX_SUCCESS) {
 		dev_dbg(dev, "Mailbox operation had an error\n");
-		return 0;
+		return 0;  /* completed but caller must check return_code */
 	}
 
 	/* #7 */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes
  2022-03-17 23:40 [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code Davidlohr Bueso
  2022-03-17 23:40 ` [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment Davidlohr Bueso
  2022-03-17 23:40 ` [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code Davidlohr Bueso
@ 2022-03-17 23:40 ` Davidlohr Bueso
       [not found]   ` <CGME20220322205612uscas1p17d7f32b9b23dbadc106ae0daf69477f9@uscas1p1.samsung.com>
  2022-03-30  0:18   ` Dan Williams
  2022-03-17 23:40 ` [PATCH 4/5] cxl/mbox: Use new return_code handling Davidlohr Bueso
  2022-03-17 23:40 ` [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code Davidlohr Bueso
  4 siblings, 2 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-17 23:40 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ben.widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, dave

Upon a completed command the caller is still expected to check
the actual return_code register for to ensure it succeed. This
adds, per the spec, the potential command return codes. It maps
the hardware return code with the kernel's errno style, and by
default continues to use -ENXIO (Command completed, but device
reported an error).

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/cxl/core/mbox.c |  2 +-
 drivers/cxl/cxlmem.h    | 90 ++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/pci.c       |  2 +-
 3 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 778b04a0fb0a..d4d4a16820b7 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -171,7 +171,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
 		return rc;
 
 	/* TODO: Map return code to proper kernel style errno */
-	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
+	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
 		return -ENXIO;
 
 	/*
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 5d33ce24fe09..268597d1ff33 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -85,9 +85,97 @@ struct cxl_mbox_cmd {
 	size_t size_in;
 	size_t size_out;
 	u16 return_code;
-#define CXL_MBOX_SUCCESS 0
 };
 
+/*
+ * Per CXL 2.0 Section 8.2.8.4.5.1
+ */
+enum {
+	CXL_MBOX_CMD_RC_SUCCESS = 0,
+	CXL_MBOX_CMD_RC_BACKGROUND,
+	CXL_MBOX_CMD_RC_INVINPUT,
+	CXL_MBOX_CMD_RC_UNSUPPORTED,
+	CXL_MBOX_CMD_RC_INTERNAL,
+	CXL_MBOX_CMD_RC_RETRY,
+	CXL_MBOX_CMD_RC_BUSY,
+	CXL_MBOX_CMD_RC_MEDIADISABLED,
+	CXL_MBOX_CMD_RC_FWINPROGRESS,
+	CXL_MBOX_CMD_RC_FWOOO,
+	CXL_MBOX_CMD_RC_FWAUTH,
+	CXL_MBOX_CMD_RC_INVSLOT,
+	CXL_MBOX_CMD_RC_FWROLLBACK,
+	CXL_MBOX_CMD_RC_FWRESET,
+	CXL_MBOX_CMD_RC_INVHANDLE,
+	CXL_MBOX_CMD_RC_INVPA,
+	CXL_MBOX_CMD_RC_INJPOISON,
+	CXL_MBOX_CMD_RC_MEDIAFAILURE,
+	CXL_MBOX_CMD_RC_ABORT,
+	CXL_MBOX_CMD_RC_INVSECURITY,
+	CXL_MBOX_CMD_RC_PASSPHRASE,
+	CXL_MBOX_CMD_RC_MBOXUNSUPPORTED,
+	CXL_MBOX_CMD_RC_INVPAYLOAD,
+};
+
+#define CXL_CMD_RC(_value, errno, str)		\
+	[CXL_MBOX_CMD_RC_##_value] = {		\
+		.err = errno,			\
+		.desc =	str,			\
+	}
+
+/*
+ * mbox cmd hardware return_code handling
+ */
+struct cxl_mbox_cmd_rc {
+	int err; /* proper kernel style errno */
+	const char *desc;
+};
+
+static const struct cxl_mbox_cmd_rc cxl_mbox_cmd_rctable[] = {
+	CXL_CMD_RC(SUCCESS, 0, NULL),
+	CXL_CMD_RC(BACKGROUND, ENXIO, "background cmd started successfully"),
+	/* errors here on out */
+	CXL_CMD_RC(INVINPUT, ENXIO, "cmd input was invalid"),
+	CXL_CMD_RC(UNSUPPORTED, ENXIO, "cmd is not supported"),
+	CXL_CMD_RC(INTERNAL, ENXIO, "internal device error"),
+	CXL_CMD_RC(RETRY, ENXIO, "temporary error, retry once"),
+	CXL_CMD_RC(BUSY, ENXIO, "ongoing background operation"),
+	CXL_CMD_RC(MEDIADISABLED, ENXIO, "media access is disabled"),
+	CXL_CMD_RC(FWINPROGRESS, ENXIO,
+			"only one FW package can be transferred at a time"),
+	CXL_CMD_RC(FWOOO, ENXIO,
+			"FW package content was transferred out of order"),
+	CXL_CMD_RC(FWAUTH, ENXIO, "FW package authentication failed"),
+	CXL_CMD_RC(INVSLOT, ENXIO,
+			"FW slot specified is not supported for requested op"),
+	CXL_CMD_RC(FWROLLBACK, ENXIO,
+			"rolled back to the previous active FW"),
+	CXL_CMD_RC(FWRESET, ENXIO, "FW failed to activate, needs cold reset"),
+	CXL_CMD_RC(INVHANDLE, ENXIO,
+			"one or more Event Record Handles were invalid"),
+	CXL_CMD_RC(INVPA, ENXIO, "physical address specified is invalid"),
+	CXL_CMD_RC(INJPOISON, ENXIO,
+			"allowed poison injection has been reached"),
+	CXL_CMD_RC(MEDIAFAILURE, ENXIO, "permanent issue with the media"),
+	CXL_CMD_RC(ABORT, ENXIO, "background cmd was aborted by device"),
+	CXL_CMD_RC(INVSECURITY, ENXIO,
+		   "not valid in the current security state"),
+	CXL_CMD_RC(PASSPHRASE, ENXIO,
+			"passphrase doesn't match current passphrase"),
+	CXL_CMD_RC(MBOXUNSUPPORTED, ENXIO,
+			"not supported on the mailbox it was issued on"),
+	CXL_CMD_RC(INVPAYLOAD, ENXIO, "invalid payload length"),
+};
+
+static inline const char *cxl_mbox_cmd_rc2str(struct cxl_mbox_cmd *mbox_cmd)
+{
+	return cxl_mbox_cmd_rctable[mbox_cmd->return_code].desc;
+}
+
+static inline int cxl_mbox_cmd_rc2errno(struct cxl_mbox_cmd *mbox_cmd)
+{
+	return cxl_mbox_cmd_rctable[mbox_cmd->return_code].err;
+}
+
 /*
  * CXL 2.0 - Memory capacity multiplier
  * See Section 8.2.9.5
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index c77e06aff8dc..0c36d111232b 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -177,7 +177,7 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
 	mbox_cmd->return_code =
 		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
 
-	if (mbox_cmd->return_code != CXL_MBOX_SUCCESS) {
+	if (mbox_cmd->return_code != CXL_MBOX_CMD_RC_SUCCESS) {
 		dev_dbg(dev, "Mailbox operation had an error\n");
 		return 0;  /* completed but caller must check return_code */
 	}
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] cxl/mbox: Use new return_code handling
  2022-03-17 23:40 [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code Davidlohr Bueso
                   ` (2 preceding siblings ...)
  2022-03-17 23:40 ` [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes Davidlohr Bueso
@ 2022-03-17 23:40 ` Davidlohr Bueso
       [not found]   ` <CGME20220322210159uscas1p2b7bfb832c275068e1a159cc1183057c2@uscas1p2.samsung.com>
  2022-03-30  0:24   ` Dan Williams
  2022-03-17 23:40 ` [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code Davidlohr Bueso
  4 siblings, 2 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-17 23:40 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ben.widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, dave

Use the global cxl_mbox_cmd_rc table to improve debug messaging
in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd()
to map to proper kernel style errno codes - this patch
continues to use -ENXIO only so no change in semantics.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/cxl/core/mbox.c | 7 ++++---
 drivers/cxl/pci.c       | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d4d4a16820b7..fa9e7043f158 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -170,9 +170,10 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
 	if (rc)
 		return rc;
 
-	/* TODO: Map return code to proper kernel style errno */
-	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
-		return -ENXIO;
+	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
+		int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
+		return -err;
+	}
 
 	/*
 	 * Variable sized commands can't be validated and so it's up to the
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 0c36d111232b..67ec0220826f 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -178,7 +178,8 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
 		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
 
 	if (mbox_cmd->return_code != CXL_MBOX_CMD_RC_SUCCESS) {
-		dev_dbg(dev, "Mailbox operation had an error\n");
+		dev_dbg(dev, "Mailbox operation had an error: %s\n",
+			cxl_mbox_cmd_rc2str(mbox_cmd));
 		return 0;  /* completed but caller must check return_code */
 	}
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code
  2022-03-17 23:40 [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code Davidlohr Bueso
                   ` (3 preceding siblings ...)
  2022-03-17 23:40 ` [PATCH 4/5] cxl/mbox: Use new return_code handling Davidlohr Bueso
@ 2022-03-17 23:40 ` Davidlohr Bueso
       [not found]   ` <CGME20220322211013uscas1p2fe8d81180e00417559e3971ddf5bc2fc@uscas1p2.samsung.com>
  2022-03-30  0:42   ` Dan Williams
  4 siblings, 2 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-17 23:40 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ben.widawsky, ira.weiny, vishal.l.verma,
	alison.schofield, dave

As mentioned in CXL 2.0 specs: the hardware can set a Retry
Required if the the command was not completed due to a
temporary error. An optional single retry may resolve the
issue. This is the only retry case as in general they are
not recommended for commands that return an error.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/cxl/core/mbox.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index fa9e7043f158..567987052b68 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -160,20 +160,25 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
 		.size_out = out_size,
 		.payload_out = out,
 	};
-	int rc;
+	int rc, retry = 1;
 
 	if (out_size > cxlds->payload_size)
 		return -E2BIG;
 
-	/* acquire and releases the mbox_mutex */
-	rc = cxlds->mbox_send(cxlds, &mbox_cmd);
-	if (rc)
-		return rc;
+	do {
+		/* acquire and releases the mbox_mutex */
+		rc = cxlds->mbox_send(cxlds, &mbox_cmd);
+		if (rc)
+			return rc;
 
-	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
-		int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
-		return -err;
-	}
+		if (mbox_cmd.return_code == CXL_MBOX_CMD_RC_SUCCESS)
+			break;
+
+		if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_RETRY || !retry) {
+			int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
+			return -err;
+		}
+	} while (retry--);
 
 	/*
 	 * Variable sized commands can't be validated and so it's up to the
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment
       [not found]   ` <CGME20220322192819uscas1p295d53f4302435e98c58de1226fb2a1db@uscas1p2.samsung.com>
@ 2022-03-22 19:28     ` Adam Manzanares
  0 siblings, 0 replies; 18+ messages in thread
From: Adam Manzanares @ 2022-03-22 19:28 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, dan.j.williams, ben.widawsky, ira.weiny,
	vishal.l.verma, alison.schofield

On Thu, Mar 17, 2022 at 04:40:45PM -0700, Davidlohr Bueso wrote:
> ... this is better served in the callback that actually
> grabs the lock.
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index be61a0d8016b..778b04a0fb0a 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -136,7 +136,7 @@ static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
>   * @out: Caller allocated buffer for the output.
>   * @out_size: Expected size of output.
>   *
> - * Context: Any context. Will acquire and release mbox_mutex.
> + * Context: Any context.
>   * Return:
>   *  * %>=0	- Number of bytes returned in @out.
>   *  * %-E2BIG	- Payload is too large for hardware.
> @@ -165,6 +165,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>  	if (out_size > cxlds->payload_size)
>  		return -E2BIG;
>  
> +	/* acquire and releases the mbox_mutex */
>  	rc = cxlds->mbox_send(cxlds, &mbox_cmd);
>  	if (rc)
>  		return rc;
> 
> -- 
> 2.26.2
> 

Reviewed by: Adam Manzanares <a.manzanares@samsung.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code
       [not found]   ` <CGME20220322203027uscas1p1b141b932ca4659a7b2cee1faf9dddf50@uscas1p1.samsung.com>
@ 2022-03-22 20:30     ` Adam Manzanares
  0 siblings, 0 replies; 18+ messages in thread
From: Adam Manzanares @ 2022-03-22 20:30 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, dan.j.williams, ben.widawsky, ira.weiny,
	vishal.l.verma, alison.schofield

On Thu, Mar 17, 2022 at 04:40:46PM -0700, Davidlohr Bueso wrote:
> Also mention the need for the caller to check against any
> errors from the hardware in return_code.
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/pci.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8a7267d116b7..c77e06aff8dc 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -177,9 +177,9 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>  	mbox_cmd->return_code =
>  		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>  
> -	if (mbox_cmd->return_code != 0) {
> +	if (mbox_cmd->return_code != CXL_MBOX_SUCCESS) {
>  		dev_dbg(dev, "Mailbox operation had an error\n");
> -		return 0;
> +		return 0;  /* completed but caller must check return_code */
>  	}
>  
>  	/* #7 */
> 
> -- 
> 2.26.2
> 

Reviewed by: Adam Manzanares <a.manzanares@samsung.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes
       [not found]   ` <CGME20220322205612uscas1p17d7f32b9b23dbadc106ae0daf69477f9@uscas1p1.samsung.com>
@ 2022-03-22 20:56     ` Adam Manzanares
  2022-03-22 21:50       ` Davidlohr Bueso
  0 siblings, 1 reply; 18+ messages in thread
From: Adam Manzanares @ 2022-03-22 20:56 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, dan.j.williams, ben.widawsky, ira.weiny,
	vishal.l.verma, alison.schofield

On Thu, Mar 17, 2022 at 04:40:47PM -0700, Davidlohr Bueso wrote:
> Upon a completed command the caller is still expected to check
> the actual return_code register for to ensure it succeed. This

Is  MB Status register == return_code register here. s/for to/to
> adds, per the spec, the potential command return codes. It maps
> the hardware return code with the kernel's errno style, and by
> default continues to use -ENXIO (Command completed, but device
> reported an error).
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c |  2 +-
>  drivers/cxl/cxlmem.h    | 90 ++++++++++++++++++++++++++++++++++++++++-
>  drivers/cxl/pci.c       |  2 +-
>  3 files changed, 91 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 778b04a0fb0a..d4d4a16820b7 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -171,7 +171,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>  		return rc;
>  
>  	/* TODO: Map return code to proper kernel style errno */
> -	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
> +	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>  		return -ENXIO;
>  
>  	/*
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 5d33ce24fe09..268597d1ff33 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -85,9 +85,97 @@ struct cxl_mbox_cmd {
>  	size_t size_in;
>  	size_t size_out;
>  	u16 return_code;
> -#define CXL_MBOX_SUCCESS 0
>  };
>  
> +/*
> + * Per CXL 2.0 Section 8.2.8.4.5.1
> + */
> +enum {
> +	CXL_MBOX_CMD_RC_SUCCESS = 0,
> +	CXL_MBOX_CMD_RC_BACKGROUND,
> +	CXL_MBOX_CMD_RC_INVINPUT,
> +	CXL_MBOX_CMD_RC_UNSUPPORTED,
> +	CXL_MBOX_CMD_RC_INTERNAL,
> +	CXL_MBOX_CMD_RC_RETRY,
> +	CXL_MBOX_CMD_RC_BUSY,
> +	CXL_MBOX_CMD_RC_MEDIADISABLED,
> +	CXL_MBOX_CMD_RC_FWINPROGRESS,
> +	CXL_MBOX_CMD_RC_FWOOO,
> +	CXL_MBOX_CMD_RC_FWAUTH,
> +	CXL_MBOX_CMD_RC_INVSLOT,

Should we keep the FW prefix for INVSLOT, since this is still related to a 
FW operation?

> +	CXL_MBOX_CMD_RC_FWROLLBACK,
> +	CXL_MBOX_CMD_RC_FWRESET,
> +	CXL_MBOX_CMD_RC_INVHANDLE,
> +	CXL_MBOX_CMD_RC_INVPA,
> +	CXL_MBOX_CMD_RC_INJPOISON,

We should indicate that we hit the poison limit, something like the 
following POISONLMT

> +	CXL_MBOX_CMD_RC_MEDIAFAILURE,
> +	CXL_MBOX_CMD_RC_ABORT,
> +	CXL_MBOX_CMD_RC_INVSECURITY,
> +	CXL_MBOX_CMD_RC_PASSPHRASE,
> +	CXL_MBOX_CMD_RC_MBOXUNSUPPORTED,
> +	CXL_MBOX_CMD_RC_INVPAYLOAD,

Might as well be more descriptive here PAYLOADLEN, looking at this more now
maybe we drop all of the INV prefixes since most of the return codes are error
codes. 

> +};
> +
> +#define CXL_CMD_RC(_value, errno, str)		\
> +	[CXL_MBOX_CMD_RC_##_value] = {		\
> +		.err = errno,			\
> +		.desc =	str,			\
> +	}
> +
> +/*
> + * mbox cmd hardware return_code handling
> + */
> +struct cxl_mbox_cmd_rc {
> +	int err; /* proper kernel style errno */
> +	const char *desc;
> +};
> +
> +static const struct cxl_mbox_cmd_rc cxl_mbox_cmd_rctable[] = {
> +	CXL_CMD_RC(SUCCESS, 0, NULL),
> +	CXL_CMD_RC(BACKGROUND, ENXIO, "background cmd started successfully"),
> +	/* errors here on out */
> +	CXL_CMD_RC(INVINPUT, ENXIO, "cmd input was invalid"),
> +	CXL_CMD_RC(UNSUPPORTED, ENXIO, "cmd is not supported"),
> +	CXL_CMD_RC(INTERNAL, ENXIO, "internal device error"),
> +	CXL_CMD_RC(RETRY, ENXIO, "temporary error, retry once"),
> +	CXL_CMD_RC(BUSY, ENXIO, "ongoing background operation"),
> +	CXL_CMD_RC(MEDIADISABLED, ENXIO, "media access is disabled"),
> +	CXL_CMD_RC(FWINPROGRESS, ENXIO,
> +			"only one FW package can be transferred at a time"),
> +	CXL_CMD_RC(FWOOO, ENXIO,
> +			"FW package content was transferred out of order"),
> +	CXL_CMD_RC(FWAUTH, ENXIO, "FW package authentication failed"),
> +	CXL_CMD_RC(INVSLOT, ENXIO,
> +			"FW slot specified is not supported for requested op"),
> +	CXL_CMD_RC(FWROLLBACK, ENXIO,
> +			"rolled back to the previous active FW"),
> +	CXL_CMD_RC(FWRESET, ENXIO, "FW failed to activate, needs cold reset"),
> +	CXL_CMD_RC(INVHANDLE, ENXIO,
> +			"one or more Event Record Handles were invalid"),
> +	CXL_CMD_RC(INVPA, ENXIO, "physical address specified is invalid"),
> +	CXL_CMD_RC(INJPOISON, ENXIO,
> +			"allowed poison injection has been reached"),

injection limit

> +	CXL_CMD_RC(MEDIAFAILURE, ENXIO, "permanent issue with the media"),
> +	CXL_CMD_RC(ABORT, ENXIO, "background cmd was aborted by device"),
> +	CXL_CMD_RC(INVSECURITY, ENXIO,
> +		   "not valid in the current security state"),
> +	CXL_CMD_RC(PASSPHRASE, ENXIO,
> +			"passphrase doesn't match current passphrase"),

current set passphrase

> +	CXL_CMD_RC(MBOXUNSUPPORTED, ENXIO,
> +			"not supported on the mailbox it was issued on"),
> +	CXL_CMD_RC(INVPAYLOAD, ENXIO, "invalid payload length"),
> +};
> +
> +static inline const char *cxl_mbox_cmd_rc2str(struct cxl_mbox_cmd *mbox_cmd)
> +{
> +	return cxl_mbox_cmd_rctable[mbox_cmd->return_code].desc;
> +}
> +
> +static inline int cxl_mbox_cmd_rc2errno(struct cxl_mbox_cmd *mbox_cmd)
> +{
> +	return cxl_mbox_cmd_rctable[mbox_cmd->return_code].err;
> +}
> +
>  /*
>   * CXL 2.0 - Memory capacity multiplier
>   * See Section 8.2.9.5
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index c77e06aff8dc..0c36d111232b 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -177,7 +177,7 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>  	mbox_cmd->return_code =
>  		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>  
> -	if (mbox_cmd->return_code != CXL_MBOX_SUCCESS) {
> +	if (mbox_cmd->return_code != CXL_MBOX_CMD_RC_SUCCESS) {
>  		dev_dbg(dev, "Mailbox operation had an error\n");
>  		return 0;  /* completed but caller must check return_code */
>  	}
> 
> -- 
> 2.26.2
>

Other than the minor nits. 
Looks good.

Reviewed by: Adam Manzanares <a.manzanares@samsung.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] cxl/mbox: Use new return_code handling
       [not found]   ` <CGME20220322210159uscas1p2b7bfb832c275068e1a159cc1183057c2@uscas1p2.samsung.com>
@ 2022-03-22 21:01     ` Adam Manzanares
  0 siblings, 0 replies; 18+ messages in thread
From: Adam Manzanares @ 2022-03-22 21:01 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, dan.j.williams, ben.widawsky, ira.weiny,
	vishal.l.verma, alison.schofield

On Thu, Mar 17, 2022 at 04:40:48PM -0700, Davidlohr Bueso wrote:
> Use the global cxl_mbox_cmd_rc table to improve debug messaging
> in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd()
> to map to proper kernel style errno codes - this patch
> continues to use -ENXIO only so no change in semantics.
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c | 7 ++++---
>  drivers/cxl/pci.c       | 3 ++-
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d4d4a16820b7..fa9e7043f158 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -170,9 +170,10 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>  	if (rc)
>  		return rc;
>  
> -	/* TODO: Map return code to proper kernel style errno */
> -	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
> -		return -ENXIO;
> +	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> +		int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
> +		return -err;
> +	}
>  
>  	/*
>  	 * Variable sized commands can't be validated and so it's up to the
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 0c36d111232b..67ec0220826f 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -178,7 +178,8 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>  		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>  
>  	if (mbox_cmd->return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> -		dev_dbg(dev, "Mailbox operation had an error\n");
> +		dev_dbg(dev, "Mailbox operation had an error: %s\n",
> +			cxl_mbox_cmd_rc2str(mbox_cmd));
>  		return 0;  /* completed but caller must check return_code */
>  	}
>  
> 
> -- 
> 2.26.2
> 

Looks good.

Reviewed by: Adam Manzanares <a.manzanares@samsung.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code
       [not found]   ` <CGME20220322211013uscas1p2fe8d81180e00417559e3971ddf5bc2fc@uscas1p2.samsung.com>
@ 2022-03-22 21:10     ` Adam Manzanares
  0 siblings, 0 replies; 18+ messages in thread
From: Adam Manzanares @ 2022-03-22 21:10 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, dan.j.williams, ben.widawsky, ira.weiny,
	vishal.l.verma, alison.schofield

On Thu, Mar 17, 2022 at 04:40:49PM -0700, Davidlohr Bueso wrote:
> As mentioned in CXL 2.0 specs: the hardware can set a Retry
> Required if the the command was not completed due to a
> temporary error. An optional single retry may resolve the
> issue. This is the only retry case as in general they are
> not recommended for commands that return an error.
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index fa9e7043f158..567987052b68 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -160,20 +160,25 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,

Can you update the documentation cxl_mbox_send_cmd that the function now 
retries commands if the RC indicates it should be done?

>  		.size_out = out_size,
>  		.payload_out = out,
>  	};
> -	int rc;
> +	int rc, retry = 1;
>  
>  	if (out_size > cxlds->payload_size)
>  		return -E2BIG;
>  
> -	/* acquire and releases the mbox_mutex */
> -	rc = cxlds->mbox_send(cxlds, &mbox_cmd);
> -	if (rc)
> -		return rc;
> +	do {
> +		/* acquire and releases the mbox_mutex */
> +		rc = cxlds->mbox_send(cxlds, &mbox_cmd);
> +		if (rc)
> +			return rc;
>  
> -	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> -		int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
> -		return -err;
> -	}
> +		if (mbox_cmd.return_code == CXL_MBOX_CMD_RC_SUCCESS)
> +			break;
> +
> +		if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_RETRY || !retry) {
> +			int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
> +			return -err;
> +		}
> +	} while (retry--);
>  
>  	/*
>  	 * Variable sized commands can't be validated and so it's up to the
> 
> -- 
> 2.26.2
>

Other than the doc change for the cxl_mbox_send_cmd function.
LGTM

Reviewed by: Adam Manzanares <a.manzanares@samsung.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes
  2022-03-22 20:56     ` Adam Manzanares
@ 2022-03-22 21:50       ` Davidlohr Bueso
  0 siblings, 0 replies; 18+ messages in thread
From: Davidlohr Bueso @ 2022-03-22 21:50 UTC (permalink / raw)
  To: Adam Manzanares
  Cc: linux-cxl, dan.j.williams, ben.widawsky, ira.weiny,
	vishal.l.verma, alison.schofield

On Tue, 22 Mar 2022, Adam Manzanares wrote:

>On Thu, Mar 17, 2022 at 04:40:47PM -0700, Davidlohr Bueso wrote:
>> Upon a completed command the caller is still expected to check
>> the actual return_code register for to ensure it succeed. This
>
>Is  MB Status register == return_code register here. s/for to/to
>> adds, per the spec, the potential command return codes. It maps
>> the hardware return code with the kernel's errno style, and by
>> default continues to use -ENXIO (Command completed, but device
>> reported an error).
>>
>> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
>> ---
>>  drivers/cxl/core/mbox.c |  2 +-
>>  drivers/cxl/cxlmem.h    | 90 ++++++++++++++++++++++++++++++++++++++++-
>>  drivers/cxl/pci.c       |  2 +-
>>  3 files changed, 91 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index 778b04a0fb0a..d4d4a16820b7 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -171,7 +171,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>>		return rc;
>>
>>	/* TODO: Map return code to proper kernel style errno */
>> -	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
>> +	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>>		return -ENXIO;
>>
>>	/*
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 5d33ce24fe09..268597d1ff33 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -85,9 +85,97 @@ struct cxl_mbox_cmd {
>>	size_t size_in;
>>	size_t size_out;
>>	u16 return_code;
>> -#define CXL_MBOX_SUCCESS 0
>>  };
>>
>> +/*
>> + * Per CXL 2.0 Section 8.2.8.4.5.1
>> + */
>> +enum {
>> +	CXL_MBOX_CMD_RC_SUCCESS = 0,
>> +	CXL_MBOX_CMD_RC_BACKGROUND,
>> +	CXL_MBOX_CMD_RC_INVINPUT,
>> +	CXL_MBOX_CMD_RC_UNSUPPORTED,
>> +	CXL_MBOX_CMD_RC_INTERNAL,
>> +	CXL_MBOX_CMD_RC_RETRY,
>> +	CXL_MBOX_CMD_RC_BUSY,
>> +	CXL_MBOX_CMD_RC_MEDIADISABLED,
>> +	CXL_MBOX_CMD_RC_FWINPROGRESS,
>> +	CXL_MBOX_CMD_RC_FWOOO,
>> +	CXL_MBOX_CMD_RC_FWAUTH,
>> +	CXL_MBOX_CMD_RC_INVSLOT,
>
>Should we keep the FW prefix for INVSLOT, since this is still related to a
>FW operation?
>
>> +	CXL_MBOX_CMD_RC_FWROLLBACK,
>> +	CXL_MBOX_CMD_RC_FWRESET,
>> +	CXL_MBOX_CMD_RC_INVHANDLE,
>> +	CXL_MBOX_CMD_RC_INVPA,
>> +	CXL_MBOX_CMD_RC_INJPOISON,
>
>We should indicate that we hit the poison limit, something like the
>following POISONLMT
>
>> +	CXL_MBOX_CMD_RC_MEDIAFAILURE,
>> +	CXL_MBOX_CMD_RC_ABORT,
>> +	CXL_MBOX_CMD_RC_INVSECURITY,
>> +	CXL_MBOX_CMD_RC_PASSPHRASE,
>> +	CXL_MBOX_CMD_RC_MBOXUNSUPPORTED,
>> +	CXL_MBOX_CMD_RC_INVPAYLOAD,
>
>Might as well be more descriptive here PAYLOADLEN, looking at this more now
>maybe we drop all of the INV prefixes since most of the return codes are error
>codes.

Yes to all the above.

>
>> +};
>> +
>> +#define CXL_CMD_RC(_value, errno, str)		\
>> +	[CXL_MBOX_CMD_RC_##_value] = {		\
>> +		.err = errno,			\
>> +		.desc =	str,			\
>> +	}
>> +
>> +/*
>> + * mbox cmd hardware return_code handling
>> + */
>> +struct cxl_mbox_cmd_rc {
>> +	int err; /* proper kernel style errno */
>> +	const char *desc;
>> +};
>> +
>> +static const struct cxl_mbox_cmd_rc cxl_mbox_cmd_rctable[] = {
>> +	CXL_CMD_RC(SUCCESS, 0, NULL),
>> +	CXL_CMD_RC(BACKGROUND, ENXIO, "background cmd started successfully"),
>> +	/* errors here on out */
>> +	CXL_CMD_RC(INVINPUT, ENXIO, "cmd input was invalid"),
>> +	CXL_CMD_RC(UNSUPPORTED, ENXIO, "cmd is not supported"),
>> +	CXL_CMD_RC(INTERNAL, ENXIO, "internal device error"),
>> +	CXL_CMD_RC(RETRY, ENXIO, "temporary error, retry once"),
>> +	CXL_CMD_RC(BUSY, ENXIO, "ongoing background operation"),
>> +	CXL_CMD_RC(MEDIADISABLED, ENXIO, "media access is disabled"),
>> +	CXL_CMD_RC(FWINPROGRESS, ENXIO,
>> +			"only one FW package can be transferred at a time"),
>> +	CXL_CMD_RC(FWOOO, ENXIO,
>> +			"FW package content was transferred out of order"),
>> +	CXL_CMD_RC(FWAUTH, ENXIO, "FW package authentication failed"),
>> +	CXL_CMD_RC(INVSLOT, ENXIO,
>> +			"FW slot specified is not supported for requested op"),
>> +	CXL_CMD_RC(FWROLLBACK, ENXIO,
>> +			"rolled back to the previous active FW"),
>> +	CXL_CMD_RC(FWRESET, ENXIO, "FW failed to activate, needs cold reset"),
>> +	CXL_CMD_RC(INVHANDLE, ENXIO,
>> +			"one or more Event Record Handles were invalid"),
>> +	CXL_CMD_RC(INVPA, ENXIO, "physical address specified is invalid"),
>> +	CXL_CMD_RC(INJPOISON, ENXIO,
>> +			"allowed poison injection has been reached"),
>
>injection limit
>
>> +	CXL_CMD_RC(MEDIAFAILURE, ENXIO, "permanent issue with the media"),
>> +	CXL_CMD_RC(ABORT, ENXIO, "background cmd was aborted by device"),
>> +	CXL_CMD_RC(INVSECURITY, ENXIO,
>> +		   "not valid in the current security state"),
>> +	CXL_CMD_RC(PASSPHRASE, ENXIO,
>> +			"passphrase doesn't match current passphrase"),
>
>current set passphrase
>
>> +	CXL_CMD_RC(MBOXUNSUPPORTED, ENXIO,
>> +			"not supported on the mailbox it was issued on"),
>> +	CXL_CMD_RC(INVPAYLOAD, ENXIO, "invalid payload length"),
>> +};
>
>Other than the minor nits.
>Looks good.

Thanks for reviewing. I'll add the changes in a v2 but will wait a bit
for any further feedback.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment
  2022-03-17 23:40 ` [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment Davidlohr Bueso
       [not found]   ` <CGME20220322192819uscas1p295d53f4302435e98c58de1226fb2a1db@uscas1p2.samsung.com>
@ 2022-03-29 22:40   ` Dan Williams
  1 sibling, 0 replies; 18+ messages in thread
From: Dan Williams @ 2022-03-29 22:40 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, Ben Widawsky, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Thu, Mar 17, 2022 at 4:41 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> ... this is better served in the callback that actually
> grabs the lock.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index be61a0d8016b..778b04a0fb0a 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -136,7 +136,7 @@ static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
>   * @out: Caller allocated buffer for the output.
>   * @out_size: Expected size of output.
>   *
> - * Context: Any context. Will acquire and release mbox_mutex.
> + * Context: Any context.
>   * Return:
>   *  * %>=0     - Number of bytes returned in @out.
>   *  * %-E2BIG  - Payload is too large for hardware.
> @@ -165,6 +165,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>         if (out_size > cxlds->payload_size)
>                 return -E2BIG;
>
> +       /* acquire and releases the mbox_mutex */

I'm not even sure this comment is necessary, because that's what
lockdep is for, and it's also not true for cxl_mock_mbox_send().
Nothing outside of the ->mbox_send() implementation should be assuming
that ->mbox_send() is enforcing serialization. So let's just drop this
second hunk.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code
  2022-03-17 23:40 ` [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code Davidlohr Bueso
       [not found]   ` <CGME20220322203027uscas1p1b141b932ca4659a7b2cee1faf9dddf50@uscas1p1.samsung.com>
@ 2022-03-29 23:42   ` Dan Williams
  2022-03-30  0:21     ` Dan Williams
  1 sibling, 1 reply; 18+ messages in thread
From: Dan Williams @ 2022-03-29 23:42 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, Ben Widawsky, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Thu, Mar 17, 2022 at 4:41 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> Also mention the need for the caller to check against any
> errors from the hardware in return_code.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/pci.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8a7267d116b7..c77e06aff8dc 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -177,9 +177,9 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>         mbox_cmd->return_code =
>                 FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>
> -       if (mbox_cmd->return_code != 0) {
> +       if (mbox_cmd->return_code != CXL_MBOX_SUCCESS) {
>                 dev_dbg(dev, "Mailbox operation had an error\n");
> -               return 0;
> +               return 0;  /* completed but caller must check return_code */

Looks like a good clarification to me.

Applied.

...but note it won't appear anywhere until after v5.18-rc1 is out.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes
  2022-03-17 23:40 ` [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes Davidlohr Bueso
       [not found]   ` <CGME20220322205612uscas1p17d7f32b9b23dbadc106ae0daf69477f9@uscas1p1.samsung.com>
@ 2022-03-30  0:18   ` Dan Williams
  1 sibling, 0 replies; 18+ messages in thread
From: Dan Williams @ 2022-03-30  0:18 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, Ben Widawsky, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Thu, Mar 17, 2022 at 4:41 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> Upon a completed command the caller is still expected to check
> the actual return_code register for to ensure it succeed. This
> adds, per the spec, the potential command return codes. It maps
> the hardware return code with the kernel's errno style, and by
> default continues to use -ENXIO (Command completed, but device
> reported an error).
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c |  2 +-
>  drivers/cxl/cxlmem.h    | 90 ++++++++++++++++++++++++++++++++++++++++-
>  drivers/cxl/pci.c       |  2 +-
>  3 files changed, 91 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 778b04a0fb0a..d4d4a16820b7 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -171,7 +171,7 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>                 return rc;
>
>         /* TODO: Map return code to proper kernel style errno */
> -       if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
> +       if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>                 return -ENXIO;
>
>         /*
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 5d33ce24fe09..268597d1ff33 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -85,9 +85,97 @@ struct cxl_mbox_cmd {
>         size_t size_in;
>         size_t size_out;
>         u16 return_code;
> -#define CXL_MBOX_SUCCESS 0
>  };
>
> +/*
> + * Per CXL 2.0 Section 8.2.8.4.5.1
> + */
> +enum {
> +       CXL_MBOX_CMD_RC_SUCCESS = 0,
> +       CXL_MBOX_CMD_RC_BACKGROUND,
> +       CXL_MBOX_CMD_RC_INVINPUT,
> +       CXL_MBOX_CMD_RC_UNSUPPORTED,
> +       CXL_MBOX_CMD_RC_INTERNAL,
> +       CXL_MBOX_CMD_RC_RETRY,
> +       CXL_MBOX_CMD_RC_BUSY,
> +       CXL_MBOX_CMD_RC_MEDIADISABLED,
> +       CXL_MBOX_CMD_RC_FWINPROGRESS,
> +       CXL_MBOX_CMD_RC_FWOOO,
> +       CXL_MBOX_CMD_RC_FWAUTH,
> +       CXL_MBOX_CMD_RC_INVSLOT,
> +       CXL_MBOX_CMD_RC_FWROLLBACK,
> +       CXL_MBOX_CMD_RC_FWRESET,
> +       CXL_MBOX_CMD_RC_INVHANDLE,
> +       CXL_MBOX_CMD_RC_INVPA,
> +       CXL_MBOX_CMD_RC_INJPOISON,
> +       CXL_MBOX_CMD_RC_MEDIAFAILURE,
> +       CXL_MBOX_CMD_RC_ABORT,
> +       CXL_MBOX_CMD_RC_INVSECURITY,
> +       CXL_MBOX_CMD_RC_PASSPHRASE,
> +       CXL_MBOX_CMD_RC_MBOXUNSUPPORTED,
> +       CXL_MBOX_CMD_RC_INVPAYLOAD,
> +};
> +
> +#define CXL_CMD_RC(_value, errno, str)         \
> +       [CXL_MBOX_CMD_RC_##_value] = {          \
> +               .err = errno,                   \
> +               .desc = str,                    \
> +       }
> +
> +/*
> + * mbox cmd hardware return_code handling
> + */
> +struct cxl_mbox_cmd_rc {
> +       int err; /* proper kernel style errno */
> +       const char *desc;
> +};
> +
> +static const struct cxl_mbox_cmd_rc cxl_mbox_cmd_rctable[] = {
> +       CXL_CMD_RC(SUCCESS, 0, NULL),
> +       CXL_CMD_RC(BACKGROUND, ENXIO, "background cmd started successfully"),
> +       /* errors here on out */
> +       CXL_CMD_RC(INVINPUT, ENXIO, "cmd input was invalid"),
> +       CXL_CMD_RC(UNSUPPORTED, ENXIO, "cmd is not supported"),
> +       CXL_CMD_RC(INTERNAL, ENXIO, "internal device error"),
> +       CXL_CMD_RC(RETRY, ENXIO, "temporary error, retry once"),
> +       CXL_CMD_RC(BUSY, ENXIO, "ongoing background operation"),
> +       CXL_CMD_RC(MEDIADISABLED, ENXIO, "media access is disabled"),
> +       CXL_CMD_RC(FWINPROGRESS, ENXIO,
> +                       "only one FW package can be transferred at a time"),
> +       CXL_CMD_RC(FWOOO, ENXIO,
> +                       "FW package content was transferred out of order"),
> +       CXL_CMD_RC(FWAUTH, ENXIO, "FW package authentication failed"),
> +       CXL_CMD_RC(INVSLOT, ENXIO,
> +                       "FW slot specified is not supported for requested op"),
> +       CXL_CMD_RC(FWROLLBACK, ENXIO,
> +                       "rolled back to the previous active FW"),
> +       CXL_CMD_RC(FWRESET, ENXIO, "FW failed to activate, needs cold reset"),
> +       CXL_CMD_RC(INVHANDLE, ENXIO,
> +                       "one or more Event Record Handles were invalid"),
> +       CXL_CMD_RC(INVPA, ENXIO, "physical address specified is invalid"),
> +       CXL_CMD_RC(INJPOISON, ENXIO,
> +                       "allowed poison injection has been reached"),
> +       CXL_CMD_RC(MEDIAFAILURE, ENXIO, "permanent issue with the media"),
> +       CXL_CMD_RC(ABORT, ENXIO, "background cmd was aborted by device"),
> +       CXL_CMD_RC(INVSECURITY, ENXIO,
> +                  "not valid in the current security state"),
> +       CXL_CMD_RC(PASSPHRASE, ENXIO,
> +                       "passphrase doesn't match current passphrase"),
> +       CXL_CMD_RC(MBOXUNSUPPORTED, ENXIO,
> +                       "not supported on the mailbox it was issued on"),
> +       CXL_CMD_RC(INVPAYLOAD, ENXIO, "invalid payload length"),
> +};

You can define the table and the enums together with Steven's trick [1]

#define CMD_CMD_RC_TABLE                                                    \
        C(SUCCESS,      0,      NULL),                                   \
        C(BACKGROUND,   -ENXIO, "background cmd started successfully"),  \
                ...
        C(PAYLOADLEN,   -ENXIO, "invalid payload length")
#undef C
#define C(a, b, c) CXL_CMD_RC_##a
enum  { CMD_CMD_RC_TABLE };
#undef C
#define C(a, b, c) { b, c }
static struct {
        int err;
        const char *desc;
} cxl_mbox_cmd_rctable[] = { OPS };
#undef C

Ideally you could align the columns so it's easy to read which
hardware code relates to which errno.

[1]: See "cpp tricks and treats": https://lwn.net/Articles/383362/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code
  2022-03-29 23:42   ` Dan Williams
@ 2022-03-30  0:21     ` Dan Williams
  0 siblings, 0 replies; 18+ messages in thread
From: Dan Williams @ 2022-03-30  0:21 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, Ben Widawsky, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Tue, Mar 29, 2022 at 4:42 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Mar 17, 2022 at 4:41 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
> >
> > Also mention the need for the caller to check against any
> > errors from the hardware in return_code.
> >
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > ---
> >  drivers/cxl/pci.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 8a7267d116b7..c77e06aff8dc 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -177,9 +177,9 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
> >         mbox_cmd->return_code =
> >                 FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
> >
> > -       if (mbox_cmd->return_code != 0) {
> > +       if (mbox_cmd->return_code != CXL_MBOX_SUCCESS) {
> >                 dev_dbg(dev, "Mailbox operation had an error\n");
> > -               return 0;
> > +               return 0;  /* completed but caller must check return_code */
>
> Looks like a good clarification to me.
>
> Applied.
>
> ...but note it won't appear anywhere until after v5.18-rc1 is out.

...although given there are already feedback comments on patch 1 and 3
I also don't mind if you resend the full series for v2.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] cxl/mbox: Use new return_code handling
  2022-03-17 23:40 ` [PATCH 4/5] cxl/mbox: Use new return_code handling Davidlohr Bueso
       [not found]   ` <CGME20220322210159uscas1p2b7bfb832c275068e1a159cc1183057c2@uscas1p2.samsung.com>
@ 2022-03-30  0:24   ` Dan Williams
  1 sibling, 0 replies; 18+ messages in thread
From: Dan Williams @ 2022-03-30  0:24 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, Ben Widawsky, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Thu, Mar 17, 2022 at 4:41 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> Use the global cxl_mbox_cmd_rc table to improve debug messaging
> in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd()
> to map to proper kernel style errno codes - this patch
> continues to use -ENXIO only so no change in semantics.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
>  drivers/cxl/core/mbox.c | 7 ++++---
>  drivers/cxl/pci.c       | 3 ++-
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d4d4a16820b7..fa9e7043f158 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -170,9 +170,10 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>         if (rc)
>                 return rc;
>
> -       /* TODO: Map return code to proper kernel style errno */
> -       if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
> -               return -ENXIO;
> +       if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> +               int err = cxl_mbox_cmd_rc2errno(&mbox_cmd);
> +               return -err;

I don't like the need for a local variable and adding the negation.
Just store the negative value directly and call the helper
cxl_mbox_cmd_rc2err() so that this can just do "return
cxl_mbox_cmd_rc2err(&mbox_cmd);".

> +       }
>
>         /*
>          * Variable sized commands can't be validated and so it's up to the
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 0c36d111232b..67ec0220826f 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -178,7 +178,8 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>                 FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>
>         if (mbox_cmd->return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> -               dev_dbg(dev, "Mailbox operation had an error\n");
> +               dev_dbg(dev, "Mailbox operation had an error: %s\n",
> +                       cxl_mbox_cmd_rc2str(mbox_cmd));

Looks good.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code
  2022-03-17 23:40 ` [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code Davidlohr Bueso
       [not found]   ` <CGME20220322211013uscas1p2fe8d81180e00417559e3971ddf5bc2fc@uscas1p2.samsung.com>
@ 2022-03-30  0:42   ` Dan Williams
  1 sibling, 0 replies; 18+ messages in thread
From: Dan Williams @ 2022-03-30  0:42 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-cxl, Ben Widawsky, Weiny, Ira, Vishal L Verma, Schofield, Alison

On Thu, Mar 17, 2022 at 4:41 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> As mentioned in CXL 2.0 specs: the hardware can set a Retry
> Required if the the command was not completed due to a
> temporary error. An optional single retry may resolve the
> issue. This is the only retry case as in general they are
> not recommended for commands that return an error.

So I disagree that the driver should be deploying this behaviour by
default. This was also tried for the ACPI NVDIMM commands and it was
helpful for Linux to not play along and get the firmware implementers
to get their act together and detail when a retry is appropriate. So,
this policy is up to the submitter. If that agent has information that
the kernel does not have about a retry being effective, it is free to
attempt the retry after the failure. Otherwise, the spec is overly
hopeful in expecting an implementation to attempt a speculative /
optional retry. Hopefully, a default Linux policy to fail commands
that end in a retry encourages fixes to devices / firmwares that want
to use optional retry as a crutch... or it encourages the discipline
to come back to the specification working group with explicit
per-command updates to define the scenarios when a retry is mandatory
and not optional.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-03-30  0:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-17 23:40 [PATCH 0/5] cxl/mbox: Robustify handling of mbox_cmd.return_code Davidlohr Bueso
2022-03-17 23:40 ` [PATCH 1/5] cxl/mbox: Move mbox_mutex usage comment Davidlohr Bueso
     [not found]   ` <CGME20220322192819uscas1p295d53f4302435e98c58de1226fb2a1db@uscas1p2.samsung.com>
2022-03-22 19:28     ` Adam Manzanares
2022-03-29 22:40   ` Dan Williams
2022-03-17 23:40 ` [PATCH 2/5] cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code Davidlohr Bueso
     [not found]   ` <CGME20220322203027uscas1p1b141b932ca4659a7b2cee1faf9dddf50@uscas1p1.samsung.com>
2022-03-22 20:30     ` Adam Manzanares
2022-03-29 23:42   ` Dan Williams
2022-03-30  0:21     ` Dan Williams
2022-03-17 23:40 ` [PATCH 3/5] cxl/mbox: Improve handling of mbox_cmd return codes Davidlohr Bueso
     [not found]   ` <CGME20220322205612uscas1p17d7f32b9b23dbadc106ae0daf69477f9@uscas1p1.samsung.com>
2022-03-22 20:56     ` Adam Manzanares
2022-03-22 21:50       ` Davidlohr Bueso
2022-03-30  0:18   ` Dan Williams
2022-03-17 23:40 ` [PATCH 4/5] cxl/mbox: Use new return_code handling Davidlohr Bueso
     [not found]   ` <CGME20220322210159uscas1p2b7bfb832c275068e1a159cc1183057c2@uscas1p2.samsung.com>
2022-03-22 21:01     ` Adam Manzanares
2022-03-30  0:24   ` Dan Williams
2022-03-17 23:40 ` [PATCH 5/5] cxl/mbox: Retry sending mbox command upon RETRY return_code Davidlohr Bueso
     [not found]   ` <CGME20220322211013uscas1p2fe8d81180e00417559e3971ddf5bc2fc@uscas1p2.samsung.com>
2022-03-22 21:10     ` Adam Manzanares
2022-03-30  0:42   ` Dan Williams

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.