linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] scsi: lpfc: use unsigned type for num_sge
@ 2023-12-20 16:26 Daniel Wagner
  2023-12-22 18:04 ` Dick Kennedy
  2024-02-06  2:08 ` Martin K. Petersen
  0 siblings, 2 replies; 5+ messages in thread
From: Daniel Wagner @ 2023-12-20 16:26 UTC (permalink / raw)
  To: linux-scsi
  Cc: linux-kernel, James Smart, Dick Kennedy, Hannes Reinecke, Daniel Wagner

From: Hannes Reinecke <hare@suse.de>

LUNs going into “failed ready running” state observed on >1T and on
even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when
DIF is enabled at the host.

The kernel logs:

  Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0

The host lpfc driver is failing to setup scatter/gather list
(protection data) for the IO's.

The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes
the compiler to remove the most significant bit. Use an unsigned type
instead.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[dwagner: added commit message]
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index d26941b131fd..bf879d81846b 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -1918,7 +1918,7 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
  *
  * Returns the number of SGEs added to the SGL.
  **/
-static int
+static uint32_t
 lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 		struct sli4_sge *sgl, int datasegcnt,
 		struct lpfc_io_buf *lpfc_cmd)
@@ -1926,8 +1926,8 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 	struct scatterlist *sgde = NULL; /* s/g data entry */
 	struct sli4_sge_diseed *diseed = NULL;
 	dma_addr_t physaddr;
-	int i = 0, num_sge = 0, status;
-	uint32_t reftag;
+	int i = 0, status;
+	uint32_t reftag, num_sge = 0;
 	uint8_t txop, rxop;
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
 	uint32_t rc;
@@ -2099,7 +2099,7 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
  *
  * Returns the number of SGEs added to the SGL.
  **/
-static int
+static uint32_t
 lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 		struct sli4_sge *sgl, int datacnt, int protcnt,
 		struct lpfc_io_buf *lpfc_cmd)
@@ -2123,8 +2123,8 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 	uint32_t rc;
 #endif
 	uint32_t checking = 1;
-	uint32_t dma_offset = 0;
-	int num_sge = 0, j = 2;
+	uint32_t dma_offset = 0, num_sge = 0;
+	int j = 2;
 	struct sli4_hybrid_sgl *sgl_xtra = NULL;
 
 	sgpe = scsi_prot_sglist(sc);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
  2023-12-20 16:26 [PATCH] scsi: lpfc: use unsigned type for num_sge Daniel Wagner
@ 2023-12-22 18:04 ` Dick Kennedy
  2024-01-17 10:56   ` Daniel Wagner
  2024-02-06  2:08 ` Martin K. Petersen
  1 sibling, 1 reply; 5+ messages in thread
From: Dick Kennedy @ 2023-12-22 18:04 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-scsi, linux-kernel, James Smart, Hannes Reinecke


[-- Attachment #1.1: Type: text/plain, Size: 4013 bytes --]

The change is good, however, I  don't think this was really the problem.

We would  like to know if this patch really solved an issue they observed.

 A good data point to know is what adapter they're using.
 If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we
 would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff.

 But even then, this patch implies that dma_map_sg() returned a crazy
 huge amount with the MSB set.

On Wed, Dec 20, 2023 at 8:29 AM Daniel Wagner <dwagner@suse.de> wrote:

> From: Hannes Reinecke <hare@suse.de>
>
> LUNs going into “failed ready running” state observed on >1T and on
> even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when
> DIF is enabled at the host.
>
> The kernel logs:
>
>   Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0
>
> The host lpfc driver is failing to setup scatter/gather list
> (protection data) for the IO's.
>
> The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes
> the compiler to remove the most significant bit. Use an unsigned type
> instead.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> [dwagner: added commit message]
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  drivers/scsi/lpfc/lpfc_scsi.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
> index d26941b131fd..bf879d81846b 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -1918,7 +1918,7 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>   *
>   * Returns the number of SGEs added to the SGL.
>   **/
> -static int
> +static uint32_t
>  lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
>                 struct sli4_sge *sgl, int datasegcnt,
>                 struct lpfc_io_buf *lpfc_cmd)
> @@ -1926,8 +1926,8 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>         struct scatterlist *sgde = NULL; /* s/g data entry */
>         struct sli4_sge_diseed *diseed = NULL;
>         dma_addr_t physaddr;
> -       int i = 0, num_sge = 0, status;
> -       uint32_t reftag;
> +       int i = 0, status;
> +       uint32_t reftag, num_sge = 0;
>         uint8_t txop, rxop;
>  #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
>         uint32_t rc;
> @@ -2099,7 +2099,7 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>   *
>   * Returns the number of SGEs added to the SGL.
>   **/
> -static int
> +static uint32_t
>  lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
>                 struct sli4_sge *sgl, int datacnt, int protcnt,
>                 struct lpfc_io_buf *lpfc_cmd)
> @@ -2123,8 +2123,8 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>         uint32_t rc;
>  #endif
>         uint32_t checking = 1;
> -       uint32_t dma_offset = 0;
> -       int num_sge = 0, j = 2;
> +       uint32_t dma_offset = 0, num_sge = 0;
> +       int j = 2;
>         struct sli4_hybrid_sgl *sgl_xtra = NULL;
>
>         sgpe = scsi_prot_sglist(sc);
> --
> 2.43.0
>
>

-- 
This electronic communication and the information and any files transmitted 
with it, or attached to it, are confidential and are intended solely for 
the use of the individual or entity to whom it is addressed and may contain 
information that is confidential, legally privileged, protected by privacy 
laws, or otherwise restricted from disclosure to anyone else. If you are 
not the intended recipient or the person responsible for delivering the 
e-mail to the intended recipient, you are hereby notified that any use, 
copying, distributing, dissemination, forwarding, printing, or copying of 
this e-mail is strictly prohibited. If you received this e-mail in error, 
please return the e-mail to the sender, delete it from your computer, and 
destroy any printed copy of it.

[-- Attachment #1.2: Type: text/html, Size: 4845 bytes --]

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
  2023-12-22 18:04 ` Dick Kennedy
@ 2024-01-17 10:56   ` Daniel Wagner
  2024-01-31 14:41     ` Daniel Wagner
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Wagner @ 2024-01-17 10:56 UTC (permalink / raw)
  To: Dick Kennedy; +Cc: linux-scsi, linux-kernel, James Smart, Hannes Reinecke

Hi Dick,

On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> The change is good, however, I  don't think this was really the
> problem.

I tried to write the commit message based on the bug report we got. So
yes, it's possible the it is not correct as I was not really involved
and might missinterpret it.

> We would  like to know if this patch really solved an issue they
> observed.

Yes, it fixes the reported problem.

>  A good data point to know is what adapter they're using.

A bunch of different HPE cards which show this log entry: SN1700E,
SN1610E and SN1200E.

>  If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we
>  would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff.
> 
>  But even then, this patch implies that dma_map_sg() returned a crazy
>  huge amount with the MSB set.

Sure, though this seems to be the case.

One noteworthy information is that DIF needs to be enabled to trigger
it:

# cat /sys/module/lpfc/parameters/lpfc_enable_bg
1
# cat /sys/module/lpfc/parameters/lpfc_prot_guard
2

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Re: Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
  2024-01-17 10:56   ` Daniel Wagner
@ 2024-01-31 14:41     ` Daniel Wagner
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Wagner @ 2024-01-31 14:41 UTC (permalink / raw)
  To: Dick Kennedy; +Cc: linux-scsi, linux-kernel, James Smart, Hannes Reinecke

On Wed, Jan 17, 2024 at 11:56:27AM +0100, Daniel Wagner wrote:
> Hi Dick,
> 
> On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> > The change is good, however, I  don't think this was really the
> > problem.
> 
> I tried to write the commit message based on the bug report we got. So
> yes, it's possible the it is not correct as I was not really involved
> and might missinterpret it.

Any chance to get this moving forward?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] scsi: lpfc: use unsigned type for num_sge
  2023-12-20 16:26 [PATCH] scsi: lpfc: use unsigned type for num_sge Daniel Wagner
  2023-12-22 18:04 ` Dick Kennedy
@ 2024-02-06  2:08 ` Martin K. Petersen
  1 sibling, 0 replies; 5+ messages in thread
From: Martin K. Petersen @ 2024-02-06  2:08 UTC (permalink / raw)
  To: linux-scsi, Daniel Wagner
  Cc: Martin K . Petersen, linux-kernel, James Smart, Dick Kennedy,
	Hannes Reinecke

On Wed, 20 Dec 2023 17:26:58 +0100, Daniel Wagner wrote:

> LUNs going into “failed ready running” state observed on >1T and on
> even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when
> DIF is enabled at the host.
> 
> The kernel logs:
> 
>   Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0
> 
> [...]

Applied to 6.8/scsi-fixes, thanks!

[1/1] scsi: lpfc: use unsigned type for num_sge
      https://git.kernel.org/mkp/scsi/c/d6c1b19153f9

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-02-06  2:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-20 16:26 [PATCH] scsi: lpfc: use unsigned type for num_sge Daniel Wagner
2023-12-22 18:04 ` Dick Kennedy
2024-01-17 10:56   ` Daniel Wagner
2024-01-31 14:41     ` Daniel Wagner
2024-02-06  2:08 ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).