All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/20] libsas and pm8001 fixes
@ 2022-02-10 11:41 Damien Le Moal
  2022-02-10 11:41 ` [PATCH 01/20] scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands Damien Le Moal
                   ` (20 more replies)
  0 siblings, 21 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:41 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

The first 3 patches fix a problem with libsas handling of NCQ NON DATA
commands and simplify libsas code in a couple of places.
The remaining patches are fixes for the pm8001 driver:
* All sparse warnings are addressed, fixing along the way many le32
  handling bugs for big-endian architectures
* Fix handling of NCQ NON DATA commands
* Fix NCQ error recovery (abort all task execution) that was causing a
  crash
* Simplify the code in many places

With these fixes, libzbc test suite passes all test case. This test
suite was used with an SMR drive for testing because it generates many
NCQ NON DATA commands (for zone management commands) and also generates
many NCQ command errors to check ASC/ASCQ returned by the device. With
the test suite, the error recovery path was extensively exercised.

Note that without these patches, libzbc test suite result in the
controller hanging, or in kernel crashes.

Damien Le Moal (20):
  scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands
  scsi: libsas: simplify sas_ata_qc_issue() detection of NCQ commands
  scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue()
  scsi: pm8001: fix __iomem pointer use in pm8001_phy_control()
  scsi: pm8001: Remove local variable in pm8001_pci_resume()
  scsi: pm8001: Fix pm8001_update_flash() local variable type
  scsi: pm8001: Fix command initialization in pm80XX_send_read_log()
  scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy()
  scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req()
  scsi: pm8001: fix payload initialization in
    pm80xx_set_thermal_config()
  scsi: pm8001: fix le32 values handling in
    pm80xx_set_sas_protocol_timer_config()
  scsi: pm8001: fix payload initialization in pm80xx_encrypt_update()
  scsi: pm8001: fix le32 values handling in pm80xx_chip_ssp_io_req()
  scsi: pm8001: fix le32 values handling in pm80xx_chip_sata_req()
  scsi: pm8001: fix use of struct set_phy_profile_req fields
  scsi: pm8001: simplify pm8001_get_ncq_tag()
  scsi: pm8001: fix NCQ NON DATA command task initialization
  scsi: pm8001: fix NCQ NON DATA command completion handling
  scsi: pm8001: cleanup pm8001_queue_command()
  scsi: pm8001: fix abort all task initialization

 drivers/scsi/libsas/sas_ata.c     |  12 +-
 drivers/scsi/pm8001/pm8001_ctl.c  |   5 +-
 drivers/scsi/pm8001/pm8001_hwi.c  |  23 +--
 drivers/scsi/pm8001/pm8001_init.c |   8 +-
 drivers/scsi/pm8001/pm8001_sas.c  |  48 +++----
 drivers/scsi/pm8001/pm80xx_hwi.c  | 226 ++++++++++++++++--------------
 drivers/scsi/pm8001/pm80xx_hwi.h  |   2 +-
 7 files changed, 169 insertions(+), 155 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/20] scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
@ 2022-02-10 11:41 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 02/20] scsi: libsas: simplify sas_ata_qc_issue() detection of NCQ commands Damien Le Moal
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:41 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

To detect for the DMA_NONE (no data transfer) DMA direction,
sas_ata_qc_issue() tests if the command protocol is ATA_PROT_NODATA.
This test does not include the ATA_CMD_NCQ_NON_DATA command as this
command protocol is defined as ATA_PROT_NCQ_NODATA (equal to
ATA_PROT_FLAG_NCQ) and not as ATA_PROT_NODATA.

To include both NCQ and non-NCQ commands when testing for the DMA_NONE
DMA direction, use "!ata_is_data()".

Fixes: 176ddd89171d ("scsi: libsas: Reset num_scatter if libata marks qc as NODATA")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/libsas/sas_ata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index a315715b3622..7e0cde710fc3 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -197,7 +197,7 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 		task->total_xfer_len = qc->nbytes;
 		task->num_scatter = qc->n_elem;
 		task->data_dir = qc->dma_dir;
-	} else if (qc->tf.protocol == ATA_PROT_NODATA) {
+	} else if (!ata_is_data(qc->tf.protocol)) {
 		task->data_dir = DMA_NONE;
 	} else {
 		for_each_sg(qc->sg, sg, qc->n_elem, si)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/20] scsi: libsas: simplify sas_ata_qc_issue() detection of NCQ commands
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
  2022-02-10 11:41 ` [PATCH 01/20] scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 03/20] scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue() Damien Le Moal
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

To detect if a command is NCQ, there is no need to test all possible NCQ
command codes. Instead, use ata_is_ncq() to test the command protocol.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/libsas/sas_ata.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 7e0cde710fc3..99549862c9c7 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -181,14 +181,9 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 	task->task_proto = SAS_PROTOCOL_STP;
 	task->task_done = sas_ata_task_done;
 
-	if (qc->tf.command == ATA_CMD_FPDMA_WRITE ||
-	    qc->tf.command == ATA_CMD_FPDMA_READ ||
-	    qc->tf.command == ATA_CMD_FPDMA_RECV ||
-	    qc->tf.command == ATA_CMD_FPDMA_SEND ||
-	    qc->tf.command == ATA_CMD_NCQ_NON_DATA) {
-		/* Need to zero out the tag libata assigned us */
+	/* For NCQ commands, zero out the tag libata assigned us */
+	if (ata_is_ncq(qc->tf.protocol))
 		qc->tf.nsect = 0;
-	}
 
 	ata_tf_to_fis(&qc->tf, qc->dev->link->pmp, 1, (u8 *)&task->ata_task.fis);
 	task->uldd_task = qc;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/20] scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
  2022-02-10 11:41 ` [PATCH 01/20] scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands Damien Le Moal
  2022-02-10 11:42 ` [PATCH 02/20] scsi: libsas: simplify sas_ata_qc_issue() detection of NCQ commands Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:48   ` John Garry
  2022-02-10 11:42 ` [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control() Damien Le Moal
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

sas_task_alloc() sets the state flag SAS_TASK_STATE_PENDING for any
new task allocated. So there is no need to set this flag again in
sas_ata_qc_issue() after the task allocation.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/libsas/sas_ata.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 99549862c9c7..a03a841921db 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -204,7 +204,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 	}
 	task->scatter = qc->sg;
 	task->ata_task.retry_count = 1;
-	task->task_state_flags = SAS_TASK_STATE_PENDING;
 	qc->lldd_task = task;
 
 	task->ata_task.use_ncq = ata_is_ncq(qc->tf.protocol);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (2 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 03/20] scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-11  6:11   ` Christoph Hellwig
  2022-02-10 11:42 ` [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume() Damien Le Moal
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Avoid the sparse warning "warning: cast removes address space '__iomem'
of expression" by using a local unsigned long variable to store an
iomem address.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_sas.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index 32edda3e55c6..6805c7f43e41 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -234,9 +234,10 @@ int pm8001_phy_control(struct asd_sas_phy *sas_phy, enum phy_func func,
 		}
 		{
 			struct sas_phy *phy = sas_phy->phy;
-			uint32_t *qp = (uint32_t *)(((char *)
-				pm8001_ha->io_mem[2].memvirtaddr)
-				+ 0x1034 + (0x4000 * (phy_id & 3)));
+			unsigned long vaddr = (unsigned long)
+				pm8001_ha->io_mem[2].memvirtaddr;
+			uint32_t *qp = (uint32_t *)
+				(vaddr + 0x1034 + (0x4000 * (phy_id & 3)));
 
 			phy->invalid_dword_count = qp[0];
 			phy->running_disparity_error_count = qp[1];
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (3 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 12:04   ` John Garry
  2022-02-10 11:42 ` [PATCH 06/20] scsi: pm8001: Fix pm8001_update_flash() local variable type Damien Le Moal
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

In pm8001_pci_resume(), the use of the u32 type for the local variable
device_state causes a sparse warning:

warning: incorrect type in assignment (different base types)
    expected unsigned int [usertype] device_state
    got restricted pci_power_t [usertype] current_state

Since this variable is used only once in the function, remove it and
use pdev->current_state directly. While at it, also add a blank line
after the last local variable declaration.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_init.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index d8a2121cb8d9..4b9a26f008a9 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -1335,13 +1335,13 @@ static int __maybe_unused pm8001_pci_resume(struct device *dev)
 	struct pm8001_hba_info *pm8001_ha;
 	int rc;
 	u8 i = 0, j;
-	u32 device_state;
 	DECLARE_COMPLETION_ONSTACK(completion);
+
 	pm8001_ha = sha->lldd_ha;
-	device_state = pdev->current_state;
 
-	pm8001_info(pm8001_ha, "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",
-		      pdev, pm8001_ha->name, device_state);
+	pm8001_info(pm8001_ha,
+		    "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",
+		    pdev, pm8001_ha->name, pdev->current_state);
 
 	rc = pci_go_44(pdev);
 	if (rc)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/20] scsi: pm8001: Fix pm8001_update_flash() local variable type
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (4 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 07/20] scsi: pm8001: Fix command initialization in pm80XX_send_read_log() Damien Le Moal
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Chnage the type of partitionSizeTmp from u32 to __be32 to suppress the
sparse warning:

warning: cast to restricted __be32

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_ctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_ctl.c b/drivers/scsi/pm8001/pm8001_ctl.c
index 41a63c9b719b..03aff0ba3657 100644
--- a/drivers/scsi/pm8001/pm8001_ctl.c
+++ b/drivers/scsi/pm8001/pm8001_ctl.c
@@ -721,7 +721,8 @@ static int pm8001_update_flash(struct pm8001_hba_info *pm8001_ha)
 	DECLARE_COMPLETION_ONSTACK(completion);
 	u8		*ioctlbuffer;
 	struct fw_control_info	*fwControl;
-	u32		partitionSize, partitionSizeTmp;
+	__be32		partitionSizeTmp;
+	u32		partitionSize;
 	u32		loopNumber, loopcount;
 	struct pm8001_fw_image_header *image_hdr;
 	u32		sizeRead = 0;
@@ -740,7 +741,7 @@ static int pm8001_update_flash(struct pm8001_hba_info *pm8001_ha)
 	image_hdr = (struct pm8001_fw_image_header *)pm8001_ha->fw_image->data;
 	while (sizeRead < pm8001_ha->fw_image->size) {
 		partitionSizeTmp =
-			*(u32 *)((u8 *)&image_hdr->image_length + sizeRead);
+			*(__be32 *)((u8 *)&image_hdr->image_length + sizeRead);
 		partitionSize = be32_to_cpu(partitionSizeTmp);
 		loopcount = DIV_ROUND_UP(partitionSize + HEADER_LEN,
 					IOCTL_BUF_SIZE);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/20] scsi: pm8001: Fix command initialization in pm80XX_send_read_log()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (5 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 06/20] scsi: pm8001: Fix pm8001_update_flash() local variable type Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 14:32   ` John Garry
  2022-02-10 11:42 ` [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy() Damien Le Moal
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Since the sata_cmd struct is zeroed out before its fields are
initialized, there is no need for using "|=" to initialize the
ncqtag_atap_dir_m field. Using a standard assignment removes the sparse
warning:

warning: invalid assignment: |=

Also, since the ncqtag_atap_dir_m field has type __le32, use
cpu_to_le32() to generate the assigned value.

Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c | 2 +-
 drivers/scsi/pm8001/pm80xx_hwi.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 9ec310b795c3..d978f7226206 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -1865,7 +1865,7 @@ static void pm8001_send_read_log(struct pm8001_hba_info *pm8001_ha,
 
 	sata_cmd.tag = cpu_to_le32(ccb_tag);
 	sata_cmd.device_id = cpu_to_le32(pm8001_ha_dev->device_id);
-	sata_cmd.ncqtag_atap_dir_m |= ((0x1 << 7) | (0x5 << 9));
+	sata_cmd.ncqtag_atap_dir_m = cpu_to_le32((0x1 << 7) | (0x5 << 9));
 	memcpy(&sata_cmd.sata_fis, &fis, sizeof(struct host_to_dev_fis));
 
 	res = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &sata_cmd,
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 9d20f8009b89..ec6b970e05a1 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1882,7 +1882,7 @@ static void pm80xx_send_read_log(struct pm8001_hba_info *pm8001_ha,
 
 	sata_cmd.tag = cpu_to_le32(ccb_tag);
 	sata_cmd.device_id = cpu_to_le32(pm8001_ha_dev->device_id);
-	sata_cmd.ncqtag_atap_dir_m_dad |= ((0x1 << 7) | (0x5 << 9));
+	sata_cmd.ncqtag_atap_dir_m_dad = cpu_to_le32(((0x1 << 7) | (0x5 << 9)));
 	memcpy(&sata_cmd.sata_fis, &fis, sizeof(struct host_to_dev_fis));
 
 	res = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &sata_cmd,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (6 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 07/20] scsi: pm8001: Fix command initialization in pm80XX_send_read_log() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 14:42   ` John Garry
  2022-02-11  6:14   ` Christoph Hellwig
  2022-02-10 11:42 ` [PATCH 09/20] scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req() Damien Le Moal
                   ` (12 subsequent siblings)
  20 siblings, 2 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Declare the local variable destination1 as a pointer to a __le32 value
rather than a u32. This suppresses the sparse warning:

warning: incorrect type in assignment (different base types)
    expected unsigned int [usertype]
    got restricted __le32 [usertype]

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index ec6b970e05a1..37ede7c79e85 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -71,14 +71,13 @@ static void pm80xx_pci_mem_copy(struct pm8001_hba_info  *pm8001_ha, u32 soffset,
 				u32 dw_count, u32 bus_base_number)
 {
 	u32 index, value, offset;
-	u32 *destination1;
-	destination1 = (u32 *)destination;
+	__le32 *destination1 = (__le32 *)destination;
 
 	for (index = 0; index < dw_count; index += 4, destination1++) {
 		offset = (soffset + index);
 		if (offset < (64 * 1024)) {
 			value = pm8001_cr32(pm8001_ha, bus_base_number, offset);
-			*destination1 =  cpu_to_le32(value);
+			*destination1 = cpu_to_le32(value);
 		}
 	}
 	return;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/20] scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (7 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 10/20] scsi: pm8001: fix payload initialization in pm80xx_set_thermal_config() Damien Le Moal
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

The ds_ads_m field of struct ssp_ini_tm_start_req has the type __le32.
Assigning a value to it should thus use cpu_to_le32(). This fixes the
sparse warning:

warning: incorrect type in assignment (different base types)
   expected restricted __le32 [addressable] [assigned] [usertype] ds_ads_m
   got int

Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index d978f7226206..43c2ab90f711 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -4626,7 +4626,7 @@ int pm8001_chip_ssp_tm_req(struct pm8001_hba_info *pm8001_ha,
 	memcpy(sspTMCmd.lun, task->ssp_task.LUN, 8);
 	sspTMCmd.tag = cpu_to_le32(ccb->ccb_tag);
 	if (pm8001_ha->chip_id != chip_8001)
-		sspTMCmd.ds_ads_m = 0x08;
+		sspTMCmd.ds_ads_m = cpu_to_le32(0x08);
 	circularQ = &pm8001_ha->inbnd_q_tbl[0];
 	ret = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &sspTMCmd,
 			sizeof(sspTMCmd), 0);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/20] scsi: pm8001: fix payload initialization in pm80xx_set_thermal_config()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (8 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 09/20] scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 14:43   ` John Garry
  2022-02-10 11:42 ` [PATCH 11/20] scsi: pm8001: fix le32 values handling in pm80xx_set_sas_protocol_timer_config() Damien Le Moal
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

The fields of the set_ctrl_cfg_req structure have the __le32 type, so
use cpu_to_le32() to assign them. This removes the sparse warnings:

warning: incorrect type in assignment (different base types)
    expected restricted __le32
    got unsigned int

Fixes: 842784e0d15b ("pm80xx: Update For Thermal Page Code")
Fixes: f5860992db55 ("[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 37ede7c79e85..9c7383fb4bdc 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1202,9 +1202,11 @@ pm80xx_set_thermal_config(struct pm8001_hba_info *pm8001_ha)
 	else
 		page_code = THERMAL_PAGE_CODE_8H;
 
-	payload.cfg_pg[0] = (THERMAL_LOG_ENABLE << 9) |
-				(THERMAL_ENABLE << 8) | page_code;
-	payload.cfg_pg[1] = (LTEMPHIL << 24) | (RTEMPHIL << 8);
+	payload.cfg_pg[0] =
+		cpu_to_le32((THERMAL_LOG_ENABLE << 9) |
+			    (THERMAL_ENABLE << 8) | page_code);
+	payload.cfg_pg[1] =
+		cpu_to_le32((LTEMPHIL << 24) | (RTEMPHIL << 8));
 
 	pm8001_dbg(pm8001_ha, DEV,
 		   "Setting up thermal config. cfg_pg 0 0x%x cfg_pg 1 0x%x\n",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/20] scsi: pm8001: fix le32 values handling in pm80xx_set_sas_protocol_timer_config()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (9 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 10/20] scsi: pm8001: fix payload initialization in pm80xx_set_thermal_config() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 12/20] scsi: pm8001: fix payload initialization in pm80xx_encrypt_update() Damien Le Moal
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

All fields of the SASProtocolTimerConfig structure have the __le32 type.
As such, use cpu_to_le32() to initialize them. This change suppresses
many sparse warnings:

warning: incorrect type in assignment (different base types)
   expected restricted __le32 [addressable] [usertype] pageCode
   got int

Note that the check to limit the value of the STP_IDLE_TMO field is
removed as this field is initialized using the fixed (and small) value
defined by the STP_IDLE_TIME macro.

The pm8001_dbg() calls printing the values of the SASProtocolTimerConfig
structure fileds are changed to use le32_to_cpu() to present the values
in human readable form.

Fixes: a6cb3d012b98 ("[SCSI] pm80xx: thermal, sas controller config and error handling update")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 52 +++++++++++++++-----------------
 1 file changed, 25 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 9c7383fb4bdc..84322d694391 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1246,43 +1246,41 @@ pm80xx_set_sas_protocol_timer_config(struct pm8001_hba_info *pm8001_ha)
 	circularQ = &pm8001_ha->inbnd_q_tbl[0];
 	payload.tag = cpu_to_le32(tag);
 
-	SASConfigPage.pageCode        =  SAS_PROTOCOL_TIMER_CONFIG_PAGE;
-	SASConfigPage.MST_MSI         =  3 << 15;
-	SASConfigPage.STP_SSP_MCT_TMO =  (STP_MCT_TMO << 16) | SSP_MCT_TMO;
-	SASConfigPage.STP_FRM_TMO     = (SAS_MAX_OPEN_TIME << 24) |
-				(SMP_MAX_CONN_TIMER << 16) | STP_FRM_TIMER;
-	SASConfigPage.STP_IDLE_TMO    =  STP_IDLE_TIME;
-
-	if (SASConfigPage.STP_IDLE_TMO > 0x3FFFFFF)
-		SASConfigPage.STP_IDLE_TMO = 0x3FFFFFF;
-
-
-	SASConfigPage.OPNRJT_RTRY_INTVL =         (SAS_MFD << 16) |
-						SAS_OPNRJT_RTRY_INTVL;
-	SASConfigPage.Data_Cmd_OPNRJT_RTRY_TMO =  (SAS_DOPNRJT_RTRY_TMO << 16)
-						| SAS_COPNRJT_RTRY_TMO;
-	SASConfigPage.Data_Cmd_OPNRJT_RTRY_THR =  (SAS_DOPNRJT_RTRY_THR << 16)
-						| SAS_COPNRJT_RTRY_THR;
-	SASConfigPage.MAX_AIP =  SAS_MAX_AIP;
+	SASConfigPage.pageCode = cpu_to_le32(SAS_PROTOCOL_TIMER_CONFIG_PAGE);
+	SASConfigPage.MST_MSI = cpu_to_le32(3 << 15);
+	SASConfigPage.STP_SSP_MCT_TMO =
+		cpu_to_le32((STP_MCT_TMO << 16) | SSP_MCT_TMO);
+	SASConfigPage.STP_FRM_TMO =
+		cpu_to_le32((SAS_MAX_OPEN_TIME << 24) |
+			    (SMP_MAX_CONN_TIMER << 16) | STP_FRM_TIMER);
+	SASConfigPage.STP_IDLE_TMO = cpu_to_le32(STP_IDLE_TIME);
+
+	SASConfigPage.OPNRJT_RTRY_INTVL =
+		cpu_to_le32((SAS_MFD << 16) | SAS_OPNRJT_RTRY_INTVL);
+	SASConfigPage.Data_Cmd_OPNRJT_RTRY_TMO =
+		cpu_to_le32((SAS_DOPNRJT_RTRY_TMO << 16) | SAS_COPNRJT_RTRY_TMO);
+	SASConfigPage.Data_Cmd_OPNRJT_RTRY_THR =
+		cpu_to_le32((SAS_DOPNRJT_RTRY_THR << 16) | SAS_COPNRJT_RTRY_THR);
+	SASConfigPage.MAX_AIP = cpu_to_le32(SAS_MAX_AIP);
 
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.pageCode 0x%08x\n",
-		   SASConfigPage.pageCode);
+		   le32_to_cpu(SASConfigPage.pageCode));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.MST_MSI  0x%08x\n",
-		   SASConfigPage.MST_MSI);
+		   le32_to_cpu(SASConfigPage.MST_MSI));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.STP_SSP_MCT_TMO  0x%08x\n",
-		   SASConfigPage.STP_SSP_MCT_TMO);
+		   le32_to_cpu(SASConfigPage.STP_SSP_MCT_TMO));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.STP_FRM_TMO  0x%08x\n",
-		   SASConfigPage.STP_FRM_TMO);
+		   le32_to_cpu(SASConfigPage.STP_FRM_TMO));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.STP_IDLE_TMO  0x%08x\n",
-		   SASConfigPage.STP_IDLE_TMO);
+		   le32_to_cpu(SASConfigPage.STP_IDLE_TMO));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.OPNRJT_RTRY_INTVL  0x%08x\n",
-		   SASConfigPage.OPNRJT_RTRY_INTVL);
+		   le32_to_cpu(SASConfigPage.OPNRJT_RTRY_INTVL));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.Data_Cmd_OPNRJT_RTRY_TMO  0x%08x\n",
-		   SASConfigPage.Data_Cmd_OPNRJT_RTRY_TMO);
+		   le32_to_cpu(SASConfigPage.Data_Cmd_OPNRJT_RTRY_TMO));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.Data_Cmd_OPNRJT_RTRY_THR  0x%08x\n",
-		   SASConfigPage.Data_Cmd_OPNRJT_RTRY_THR);
+		   le32_to_cpu(SASConfigPage.Data_Cmd_OPNRJT_RTRY_THR));
 	pm8001_dbg(pm8001_ha, INIT, "SASConfigPage.MAX_AIP  0x%08x\n",
-		   SASConfigPage.MAX_AIP);
+		   le32_to_cpu(SASConfigPage.MAX_AIP));
 
 	memcpy(&payload.cfg_pg, &SASConfigPage,
 			 sizeof(SASProtocolTimerConfig_t));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/20] scsi: pm8001: fix payload initialization in pm80xx_encrypt_update()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (10 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 11/20] scsi: pm8001: fix le32 values handling in pm80xx_set_sas_protocol_timer_config() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 13/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_ssp_io_req() Damien Le Moal
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

All fields of the kek_mgmt_req structure have the type __le32. So make
sure to use cpu_to_le32() to initialize them. This suppresses the sparse
warning:

warning: incorrect type in assignment (different base types)
   expected restricted __le32 [addressable] [assigned] [usertype] new_curidx_ksop
   got int

Fixes: 5860992db55c ("[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 84322d694391..d37599a94003 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1406,12 +1406,13 @@ static int pm80xx_encrypt_update(struct pm8001_hba_info *pm8001_ha)
 	/* Currently only one key is used. New KEK index is 1.
 	 * Current KEK index is 1. Store KEK to NVRAM is 1.
 	 */
-	payload.new_curidx_ksop = ((1 << 24) | (1 << 16) | (1 << 8) |
-					KEK_MGMT_SUBOP_KEYCARDUPDATE);
+	payload.new_curidx_ksop =
+		cpu_to_le32(((1 << 24) | (1 << 16) | (1 << 8) |
+			     KEK_MGMT_SUBOP_KEYCARDUPDATE));
 
 	pm8001_dbg(pm8001_ha, DEV,
 		   "Saving Encryption info to flash. payload 0x%x\n",
-		   payload.new_curidx_ksop);
+		   le32_to_cpu(payload.new_curidx_ksop));
 
 	rc = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &payload,
 			sizeof(payload), 0);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_ssp_io_req()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (11 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 12/20] scsi: pm8001: fix payload initialization in pm80xx_encrypt_update() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 14/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_sata_req() Damien Le Moal
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Make sure that the __le32 fields of struct ssp_ini_io_start_req are
manipulated after applying the correct endian conversion. That is, use
cpu_to_le32() for assigning values and le32_to_cpu() for consulting a
field value. In particular, make sure that the calculations for the 4G
boundary check are done using CPU endianness and *not* little endian
values. With these fixes, many sparse warnings are removed.

While at it, add blank lines after variable declarations and in some
other places to make this code more readable.

Fixes: 0ecdf00ba6e5 ("[SCSI] pm80xx: 4G boundary fix.")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 41 +++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index d37599a94003..2626a5bb0e4c 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -4379,13 +4379,15 @@ static int pm80xx_chip_ssp_io_req(struct pm8001_hba_info *pm8001_ha,
 	struct ssp_ini_io_start_req ssp_cmd;
 	u32 tag = ccb->ccb_tag;
 	int ret;
-	u64 phys_addr, start_addr, end_addr;
+	u64 phys_addr, end_addr;
 	u32 end_addr_high, end_addr_low;
 	struct inbound_queue_table *circularQ;
 	u32 q_index, cpu_id;
 	u32 opc = OPC_INB_SSPINIIOSTART;
+
 	memset(&ssp_cmd, 0, sizeof(ssp_cmd));
 	memcpy(ssp_cmd.ssp_iu.lun, task->ssp_task.LUN, 8);
+
 	/* data address domain added for spcv; set to 0 by host,
 	 * used internally by controller
 	 * 0 for SAS 1.1 and SAS 2.0 compatible TLR
@@ -4396,7 +4398,7 @@ static int pm80xx_chip_ssp_io_req(struct pm8001_hba_info *pm8001_ha,
 	ssp_cmd.device_id = cpu_to_le32(pm8001_dev->device_id);
 	ssp_cmd.tag = cpu_to_le32(tag);
 	if (task->ssp_task.enable_first_burst)
-		ssp_cmd.ssp_iu.efb_prio_attr |= 0x80;
+		ssp_cmd.ssp_iu.efb_prio_attr = 0x80;
 	ssp_cmd.ssp_iu.efb_prio_attr |= (task->ssp_task.task_prio << 3);
 	ssp_cmd.ssp_iu.efb_prio_attr |= (task->ssp_task.task_attr & 7);
 	memcpy(ssp_cmd.ssp_iu.cdb, task->ssp_task.cmd->cmnd,
@@ -4428,21 +4430,24 @@ static int pm80xx_chip_ssp_io_req(struct pm8001_hba_info *pm8001_ha,
 			ssp_cmd.enc_esgl = cpu_to_le32(1<<31);
 		} else if (task->num_scatter == 1) {
 			u64 dma_addr = sg_dma_address(task->scatter);
+
 			ssp_cmd.enc_addr_low =
 				cpu_to_le32(lower_32_bits(dma_addr));
 			ssp_cmd.enc_addr_high =
 				cpu_to_le32(upper_32_bits(dma_addr));
 			ssp_cmd.enc_len = cpu_to_le32(task->total_xfer_len);
 			ssp_cmd.enc_esgl = 0;
+
 			/* Check 4G Boundary */
-			start_addr = cpu_to_le64(dma_addr);
-			end_addr = (start_addr + ssp_cmd.enc_len) - 1;
-			end_addr_low = cpu_to_le32(lower_32_bits(end_addr));
-			end_addr_high = cpu_to_le32(upper_32_bits(end_addr));
-			if (end_addr_high != ssp_cmd.enc_addr_high) {
+			end_addr = dma_addr + le32_to_cpu(ssp_cmd.enc_len) - 1;
+			end_addr_low = lower_32_bits(end_addr);
+			end_addr_high = upper_32_bits(end_addr);
+
+			if (end_addr_high != le32_to_cpu(ssp_cmd.enc_addr_high)) {
 				pm8001_dbg(pm8001_ha, FAIL,
 					   "The sg list address start_addr=0x%016llx data_len=0x%x end_addr_high=0x%08x end_addr_low=0x%08x has crossed 4G boundary\n",
-					   start_addr, ssp_cmd.enc_len,
+					   dma_addr,
+					   le32_to_cpu(ssp_cmd.enc_len),
 					   end_addr_high, end_addr_low);
 				pm8001_chip_make_sg(task->scatter, 1,
 					ccb->buf_prd);
@@ -4451,7 +4456,7 @@ static int pm80xx_chip_ssp_io_req(struct pm8001_hba_info *pm8001_ha,
 					cpu_to_le32(lower_32_bits(phys_addr));
 				ssp_cmd.enc_addr_high =
 					cpu_to_le32(upper_32_bits(phys_addr));
-				ssp_cmd.enc_esgl = cpu_to_le32(1<<31);
+				ssp_cmd.enc_esgl = cpu_to_le32(1U<<31);
 			}
 		} else if (task->num_scatter == 0) {
 			ssp_cmd.enc_addr_low = 0;
@@ -4459,8 +4464,10 @@ static int pm80xx_chip_ssp_io_req(struct pm8001_hba_info *pm8001_ha,
 			ssp_cmd.enc_len = cpu_to_le32(task->total_xfer_len);
 			ssp_cmd.enc_esgl = 0;
 		}
+
 		/* XTS mode. All other fields are 0 */
-		ssp_cmd.key_cmode = 0x6 << 4;
+		ssp_cmd.key_cmode = cpu_to_le32(0x6 << 4);
+
 		/* set tweak values. Should be the start lba */
 		ssp_cmd.twk_val0 = cpu_to_le32((task->ssp_task.cmd->cmnd[2] << 24) |
 						(task->ssp_task.cmd->cmnd[3] << 16) |
@@ -4482,20 +4489,22 @@ static int pm80xx_chip_ssp_io_req(struct pm8001_hba_info *pm8001_ha,
 			ssp_cmd.esgl = cpu_to_le32(1<<31);
 		} else if (task->num_scatter == 1) {
 			u64 dma_addr = sg_dma_address(task->scatter);
+
 			ssp_cmd.addr_low = cpu_to_le32(lower_32_bits(dma_addr));
 			ssp_cmd.addr_high =
 				cpu_to_le32(upper_32_bits(dma_addr));
 			ssp_cmd.len = cpu_to_le32(task->total_xfer_len);
 			ssp_cmd.esgl = 0;
+
 			/* Check 4G Boundary */
-			start_addr = cpu_to_le64(dma_addr);
-			end_addr = (start_addr + ssp_cmd.len) - 1;
-			end_addr_low = cpu_to_le32(lower_32_bits(end_addr));
-			end_addr_high = cpu_to_le32(upper_32_bits(end_addr));
-			if (end_addr_high != ssp_cmd.addr_high) {
+			end_addr = dma_addr + le32_to_cpu(ssp_cmd.len) - 1;
+			end_addr_low = lower_32_bits(end_addr);
+			end_addr_high = upper_32_bits(end_addr);
+			if (end_addr_high != le32_to_cpu(ssp_cmd.addr_high)) {
 				pm8001_dbg(pm8001_ha, FAIL,
 					   "The sg list address start_addr=0x%016llx data_len=0x%x end_addr_high=0x%08x end_addr_low=0x%08x has crossed 4G boundary\n",
-					   start_addr, ssp_cmd.len,
+					   dma_addr,
+					   le32_to_cpu(ssp_cmd.len),
 					   end_addr_high, end_addr_low);
 				pm8001_chip_make_sg(task->scatter, 1,
 					ccb->buf_prd);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_sata_req()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (12 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 13/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_ssp_io_req() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 15/20] scsi: pm8001: fix use of struct set_phy_profile_req fields Damien Le Moal
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Make sure that the __le32 fields of struct sata_cmd are manipulated
after applying the correct endian conversion. That is, use cpu_to_le32()
for assigning values and le32_to_cpu() for consulting a field value.
In particular, make sure that the calculations for the 4G boundary check
are done using CPU endianness and *not* little endian values. With these
fixes, many sparse warnings are removed.

While at it, fix some code identation and add blank lines after
variable declarations and in some other places to make this code more
readable.

Fixes: 0ecdf00ba6e5 ("[SCSI] pm80xx: 4G boundary fix.")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 82 ++++++++++++++++++--------------
 1 file changed, 45 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 2626a5bb0e4c..fbdc56351901 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -4539,7 +4539,7 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 	u32 q_index, cpu_id;
 	struct sata_start_req sata_cmd;
 	u32 hdr_tag, ncg_tag = 0;
-	u64 phys_addr, start_addr, end_addr;
+	u64 phys_addr, end_addr;
 	u32 end_addr_high, end_addr_low;
 	u32 ATAP = 0x0;
 	u32 dir;
@@ -4600,32 +4600,38 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 			pm8001_chip_make_sg(task->scatter,
 						ccb->n_elem, ccb->buf_prd);
 			phys_addr = ccb->ccb_dma_handle;
-			sata_cmd.enc_addr_low = lower_32_bits(phys_addr);
-			sata_cmd.enc_addr_high = upper_32_bits(phys_addr);
+			sata_cmd.enc_addr_low =
+				cpu_to_le32(lower_32_bits(phys_addr));
+			sata_cmd.enc_addr_high =
+				cpu_to_le32(upper_32_bits(phys_addr));
 			sata_cmd.enc_esgl = cpu_to_le32(1 << 31);
 		} else if (task->num_scatter == 1) {
 			u64 dma_addr = sg_dma_address(task->scatter);
-			sata_cmd.enc_addr_low = lower_32_bits(dma_addr);
-			sata_cmd.enc_addr_high = upper_32_bits(dma_addr);
+
+			sata_cmd.enc_addr_low =
+				cpu_to_le32(lower_32_bits(dma_addr));
+			sata_cmd.enc_addr_high =
+				cpu_to_le32(upper_32_bits(dma_addr));
 			sata_cmd.enc_len = cpu_to_le32(task->total_xfer_len);
 			sata_cmd.enc_esgl = 0;
+
 			/* Check 4G Boundary */
-			start_addr = cpu_to_le64(dma_addr);
-			end_addr = (start_addr + sata_cmd.enc_len) - 1;
-			end_addr_low = cpu_to_le32(lower_32_bits(end_addr));
-			end_addr_high = cpu_to_le32(upper_32_bits(end_addr));
-			if (end_addr_high != sata_cmd.enc_addr_high) {
+			end_addr = dma_addr + le32_to_cpu(sata_cmd.enc_len) - 1;
+			end_addr_low = lower_32_bits(end_addr);
+			end_addr_high = upper_32_bits(end_addr);
+			if (end_addr_high != le32_to_cpu(sata_cmd.enc_addr_high)) {
 				pm8001_dbg(pm8001_ha, FAIL,
 					   "The sg list address start_addr=0x%016llx data_len=0x%x end_addr_high=0x%08x end_addr_low=0x%08x has crossed 4G boundary\n",
-					   start_addr, sata_cmd.enc_len,
+					   dma_addr,
+					   le32_to_cpu(sata_cmd.enc_len),
 					   end_addr_high, end_addr_low);
 				pm8001_chip_make_sg(task->scatter, 1,
 					ccb->buf_prd);
 				phys_addr = ccb->ccb_dma_handle;
 				sata_cmd.enc_addr_low =
-					lower_32_bits(phys_addr);
+					cpu_to_le32(lower_32_bits(phys_addr));
 				sata_cmd.enc_addr_high =
-					upper_32_bits(phys_addr);
+					cpu_to_le32(upper_32_bits(phys_addr));
 				sata_cmd.enc_esgl =
 					cpu_to_le32(1 << 31);
 			}
@@ -4636,7 +4642,8 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 			sata_cmd.enc_esgl = 0;
 		}
 		/* XTS mode. All other fields are 0 */
-		sata_cmd.key_index_mode = 0x6 << 4;
+		sata_cmd.key_index_mode = cpu_to_le32(0x6 << 4);
+
 		/* set tweak values. Should be the start lba */
 		sata_cmd.twk_val0 =
 			cpu_to_le32((sata_cmd.sata_fis.lbal_exp << 24) |
@@ -4662,31 +4669,31 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 			phys_addr = ccb->ccb_dma_handle;
 			sata_cmd.addr_low = lower_32_bits(phys_addr);
 			sata_cmd.addr_high = upper_32_bits(phys_addr);
-			sata_cmd.esgl = cpu_to_le32(1 << 31);
+			sata_cmd.esgl = cpu_to_le32(1U << 31);
 		} else if (task->num_scatter == 1) {
 			u64 dma_addr = sg_dma_address(task->scatter);
+
 			sata_cmd.addr_low = lower_32_bits(dma_addr);
 			sata_cmd.addr_high = upper_32_bits(dma_addr);
 			sata_cmd.len = cpu_to_le32(task->total_xfer_len);
 			sata_cmd.esgl = 0;
+
 			/* Check 4G Boundary */
-			start_addr = cpu_to_le64(dma_addr);
-			end_addr = (start_addr + sata_cmd.len) - 1;
-			end_addr_low = cpu_to_le32(lower_32_bits(end_addr));
-			end_addr_high = cpu_to_le32(upper_32_bits(end_addr));
+			end_addr = dma_addr + le32_to_cpu(sata_cmd.len) - 1;
+			end_addr_low = lower_32_bits(end_addr);
+			end_addr_high = upper_32_bits(end_addr);
 			if (end_addr_high != sata_cmd.addr_high) {
 				pm8001_dbg(pm8001_ha, FAIL,
 					   "The sg list address start_addr=0x%016llx data_len=0x%xend_addr_high=0x%08x end_addr_low=0x%08x has crossed 4G boundary\n",
-					   start_addr, sata_cmd.len,
+					   dma_addr,
+					   le32_to_cpu(sata_cmd.len),
 					   end_addr_high, end_addr_low);
 				pm8001_chip_make_sg(task->scatter, 1,
 					ccb->buf_prd);
 				phys_addr = ccb->ccb_dma_handle;
-				sata_cmd.addr_low =
-					lower_32_bits(phys_addr);
-				sata_cmd.addr_high =
-					upper_32_bits(phys_addr);
-				sata_cmd.esgl = cpu_to_le32(1 << 31);
+				sata_cmd.addr_low = lower_32_bits(phys_addr);
+				sata_cmd.addr_high = upper_32_bits(phys_addr);
+				sata_cmd.esgl = cpu_to_le32(1U << 31);
 			}
 		} else if (task->num_scatter == 0) {
 			sata_cmd.addr_low = 0;
@@ -4694,27 +4701,28 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 			sata_cmd.len = cpu_to_le32(task->total_xfer_len);
 			sata_cmd.esgl = 0;
 		}
+
 		/* scsi cdb */
 		sata_cmd.atapi_scsi_cdb[0] =
 			cpu_to_le32(((task->ata_task.atapi_packet[0]) |
-			(task->ata_task.atapi_packet[1] << 8) |
-			(task->ata_task.atapi_packet[2] << 16) |
-			(task->ata_task.atapi_packet[3] << 24)));
+				     (task->ata_task.atapi_packet[1] << 8) |
+				     (task->ata_task.atapi_packet[2] << 16) |
+				     (task->ata_task.atapi_packet[3] << 24)));
 		sata_cmd.atapi_scsi_cdb[1] =
 			cpu_to_le32(((task->ata_task.atapi_packet[4]) |
-			(task->ata_task.atapi_packet[5] << 8) |
-			(task->ata_task.atapi_packet[6] << 16) |
-			(task->ata_task.atapi_packet[7] << 24)));
+				     (task->ata_task.atapi_packet[5] << 8) |
+				     (task->ata_task.atapi_packet[6] << 16) |
+				     (task->ata_task.atapi_packet[7] << 24)));
 		sata_cmd.atapi_scsi_cdb[2] =
 			cpu_to_le32(((task->ata_task.atapi_packet[8]) |
-			(task->ata_task.atapi_packet[9] << 8) |
-			(task->ata_task.atapi_packet[10] << 16) |
-			(task->ata_task.atapi_packet[11] << 24)));
+				     (task->ata_task.atapi_packet[9] << 8) |
+				     (task->ata_task.atapi_packet[10] << 16) |
+				     (task->ata_task.atapi_packet[11] << 24)));
 		sata_cmd.atapi_scsi_cdb[3] =
 			cpu_to_le32(((task->ata_task.atapi_packet[12]) |
-			(task->ata_task.atapi_packet[13] << 8) |
-			(task->ata_task.atapi_packet[14] << 16) |
-			(task->ata_task.atapi_packet[15] << 24)));
+				     (task->ata_task.atapi_packet[13] << 8) |
+				     (task->ata_task.atapi_packet[14] << 16) |
+				     (task->ata_task.atapi_packet[15] << 24)));
 	}
 
 	/* Check for read log for failed drive and return */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 15/20] scsi: pm8001: fix use of struct set_phy_profile_req fields
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (13 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 14/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_sata_req() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 16/20] scsi: pm8001: simplify pm8001_get_ncq_tag() Damien Le Moal
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

In mpi_set_phy_profile_req() and pm8001_set_phy_profile_single(), use
cpu_to_le32() to initialize the ppc_phyid field of struct
set_phy_profile_req. Furthermore, fix the definition of the reserved
field of this structure to be an array of __le32, to match the use of
cpu_to_le32() when assigning values. These changes remove several sparse
type warnings.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 12 +++++++-----
 drivers/scsi/pm8001/pm80xx_hwi.h |  2 +-
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index fbdc56351901..dc0a84c81189 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -4976,12 +4976,13 @@ static void mpi_set_phy_profile_req(struct pm8001_hba_info *pm8001_ha,
 		pm8001_dbg(pm8001_ha, FAIL, "Invalid tag\n");
 	circularQ = &pm8001_ha->inbnd_q_tbl[0];
 	payload.tag = cpu_to_le32(tag);
-	payload.ppc_phyid = (((operation & 0xF) << 8) | (phyid  & 0xFF));
+	payload.ppc_phyid =
+		cpu_to_le32(((operation & 0xF) << 8) | (phyid  & 0xFF));
 	pm8001_dbg(pm8001_ha, INIT,
 		   " phy profile command for phy %x ,length is %d\n",
-		   payload.ppc_phyid, length);
+		   le32_to_cpu(payload.ppc_phyid), length);
 	for (i = length; i < (length + PHY_DWORD_LENGTH - 1); i++) {
-		payload.reserved[j] =  cpu_to_le32(*((u32 *)buf + i));
+		payload.reserved[j] = cpu_to_le32(*((u32 *)buf + i));
 		j++;
 	}
 	rc = pm8001_mpi_build_cmd(pm8001_ha, circularQ, opc, &payload,
@@ -5021,8 +5022,9 @@ void pm8001_set_phy_profile_single(struct pm8001_hba_info *pm8001_ha,
 	opc = OPC_INB_SET_PHY_PROFILE;
 
 	payload.tag = cpu_to_le32(tag);
-	payload.ppc_phyid = (((SAS_PHY_ANALOG_SETTINGS_PAGE & 0xF) << 8)
-				| (phy & 0xFF));
+	payload.ppc_phyid =
+		cpu_to_le32(((SAS_PHY_ANALOG_SETTINGS_PAGE & 0xF) << 8)
+			    | (phy & 0xFF));
 
 	for (i = 0; i < length; i++)
 		payload.reserved[i] = cpu_to_le32(*(buf + i));
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h b/drivers/scsi/pm8001/pm80xx_hwi.h
index c41ed039c92a..d66b49323d49 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.h
+++ b/drivers/scsi/pm8001/pm80xx_hwi.h
@@ -972,7 +972,7 @@ struct dek_mgmt_req {
 struct set_phy_profile_req {
 	__le32	tag;
 	__le32	ppc_phyid;
-	u32	reserved[29];
+	__le32	reserved[29];
 } __attribute__((packed, aligned(4)));
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 16/20] scsi: pm8001: simplify pm8001_get_ncq_tag()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (14 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 15/20] scsi: pm8001: fix use of struct set_phy_profile_req fields Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 14:50   ` John Garry
  2022-02-10 11:42 ` [PATCH 17/20] scsi: pm8001: fix NCQ NON DATA command task initialization Damien Le Moal
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

To detect if a command is NCQ, there is no need to test all possible NCQ
command codes. Instead, use ata_is_ncq() to test the command protocol.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_sas.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index 6805c7f43e41..711eaf81f546 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -306,16 +306,12 @@ static int pm8001_task_prep_smp(struct pm8001_hba_info *pm8001_ha,
 u32 pm8001_get_ncq_tag(struct sas_task *task, u32 *tag)
 {
 	struct ata_queued_cmd *qc = task->uldd_task;
-	if (qc) {
-		if (qc->tf.command == ATA_CMD_FPDMA_WRITE ||
-		    qc->tf.command == ATA_CMD_FPDMA_READ ||
-		    qc->tf.command == ATA_CMD_FPDMA_RECV ||
-		    qc->tf.command == ATA_CMD_FPDMA_SEND ||
-		    qc->tf.command == ATA_CMD_NCQ_NON_DATA) {
-			*tag = qc->tag;
-			return 1;
-		}
+
+	if (qc && ata_is_ncq(qc->tf.protocol)) {
+		*tag = qc->tag;
+		return 1;
 	}
+
 	return 0;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 17/20] scsi: pm8001: fix NCQ NON DATA command task initialization
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (15 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 16/20] scsi: pm8001: simplify pm8001_get_ncq_tag() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 18/20] scsi: pm8001: fix NCQ NON DATA command completion handling Damien Le Moal
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

In the pm8001_chip_sata_req() and pm80xx_chip_sata_req() functions, all
tasks with a DMA direction of DMA_NONE (no data transfer) are
initialized using the ATAP value 0x04. However, NCQ NON DATA commands,
while being DMA_NONE commands are NCQ commands and need to be
initialized using the value 0x07 for ATAP, similarly to other NCQ
commands.

Make sure that NCQ NON DATA command tasks are initialized similarly to
other NCQ commands by also testing the task "use_ncq" field in addition
to the DMA direction. While at it, reorganize the code into a chain of
if - else if - else to avoid useless affectations and debug messages.

Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c | 14 +++++++-------
 drivers/scsi/pm8001/pm80xx_hwi.c | 13 ++++++-------
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 43c2ab90f711..9d982eb970fe 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -4271,22 +4271,22 @@ static int pm8001_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 	u32  opc = OPC_INB_SATA_HOST_OPSTART;
 	memset(&sata_cmd, 0, sizeof(sata_cmd));
 	circularQ = &pm8001_ha->inbnd_q_tbl[0];
-	if (task->data_dir == DMA_NONE) {
+
+	if (task->data_dir == DMA_NONE && !task->ata_task.use_ncq) {
 		ATAP = 0x04;  /* no data*/
 		pm8001_dbg(pm8001_ha, IO, "no data\n");
 	} else if (likely(!task->ata_task.device_control_reg_update)) {
-		if (task->ata_task.dma_xfer) {
+		if (task->ata_task.use_ncq &&
+		    dev->sata_dev.class != ATA_DEV_ATAPI) {
+			ATAP = 0x07; /* FPDMA */
+			pm8001_dbg(pm8001_ha, IO, "FPDMA\n");
+		} else if (task->ata_task.dma_xfer) {
 			ATAP = 0x06; /* DMA */
 			pm8001_dbg(pm8001_ha, IO, "DMA\n");
 		} else {
 			ATAP = 0x05; /* PIO*/
 			pm8001_dbg(pm8001_ha, IO, "PIO\n");
 		}
-		if (task->ata_task.use_ncq &&
-			dev->sata_dev.class != ATA_DEV_ATAPI) {
-			ATAP = 0x07; /* FPDMA */
-			pm8001_dbg(pm8001_ha, IO, "FPDMA\n");
-		}
 	}
 	if (task->ata_task.use_ncq && pm8001_get_ncq_tag(task, &hdr_tag)) {
 		task->ata_task.fis.sector_count |= (u8) (hdr_tag << 3);
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index dc0a84c81189..44071a97d23b 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -4551,22 +4551,21 @@ static int pm80xx_chip_sata_req(struct pm8001_hba_info *pm8001_ha,
 	q_index = (u32) (cpu_id) % (pm8001_ha->max_q_num);
 	circularQ = &pm8001_ha->inbnd_q_tbl[q_index];
 
-	if (task->data_dir == DMA_NONE) {
+	if (task->data_dir == DMA_NONE && !task->ata_task.use_ncq) {
 		ATAP = 0x04; /* no data*/
 		pm8001_dbg(pm8001_ha, IO, "no data\n");
 	} else if (likely(!task->ata_task.device_control_reg_update)) {
-		if (task->ata_task.dma_xfer) {
+		if (task->ata_task.use_ncq &&
+		    dev->sata_dev.class != ATA_DEV_ATAPI) {
+			ATAP = 0x07; /* FPDMA */
+			pm8001_dbg(pm8001_ha, IO, "FPDMA\n");
+		} else if (task->ata_task.dma_xfer) {
 			ATAP = 0x06; /* DMA */
 			pm8001_dbg(pm8001_ha, IO, "DMA\n");
 		} else {
 			ATAP = 0x05; /* PIO*/
 			pm8001_dbg(pm8001_ha, IO, "PIO\n");
 		}
-		if (task->ata_task.use_ncq &&
-		    dev->sata_dev.class != ATA_DEV_ATAPI) {
-			ATAP = 0x07; /* FPDMA */
-			pm8001_dbg(pm8001_ha, IO, "FPDMA\n");
-		}
 	}
 	if (task->ata_task.use_ncq && pm8001_get_ncq_tag(task, &hdr_tag)) {
 		task->ata_task.fis.sector_count |= (u8) (hdr_tag << 3);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 18/20] scsi: pm8001: fix NCQ NON DATA command completion handling
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (16 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 17/20] scsi: pm8001: fix NCQ NON DATA command task initialization Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 11:42 ` [PATCH 19/20] scsi: pm8001: cleanup pm8001_queue_command() Damien Le Moal
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

NCQ NON DATA is an NCQ command with the DMA_NONE DMA direction and so a
register-device-to-host-FIS response is expected for it.

However, for an IO_SUCCESS case, mpi_sata_completion() expects a
set-device-bits-FIS for any ata task with an use_ncq field true, which
includes NCQ NON DATA commands.

Fix this to correctly treat NCQ NON DATA commands as non-data by also
testing for the DMA_NONE DMA direction.

Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c | 3 ++-
 drivers/scsi/pm8001/pm80xx_hwi.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 9d982eb970fe..8095eb0b04f7 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -2418,7 +2418,8 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
 				len = sizeof(struct pio_setup_fis);
 				pm8001_dbg(pm8001_ha, IO,
 					   "PIO read len = %d\n", len);
-			} else if (t->ata_task.use_ncq) {
+			} else if (t->ata_task.use_ncq &&
+				   t->data_dir != DMA_NONE) {
 				len = sizeof(struct set_dev_bits_fis);
 				pm8001_dbg(pm8001_ha, IO, "FPDMA len = %d\n",
 					   len);
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 44071a97d23b..4d88c0dbcefc 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -2510,7 +2510,8 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha,
 				len = sizeof(struct pio_setup_fis);
 				pm8001_dbg(pm8001_ha, IO,
 					   "PIO read len = %d\n", len);
-			} else if (t->ata_task.use_ncq) {
+			} else if (t->ata_task.use_ncq &&
+				   t->data_dir != DMA_NONE) {
 				len = sizeof(struct set_dev_bits_fis);
 				pm8001_dbg(pm8001_ha, IO, "FPDMA len = %d\n",
 					   len);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 19/20] scsi: pm8001: cleanup pm8001_queue_command()
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (17 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 18/20] scsi: pm8001: fix NCQ NON DATA command completion handling Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 14:53   ` John Garry
  2022-02-10 11:42 ` [PATCH 20/20] scsi: pm8001: fix abort all task initialization Damien Le Moal
  2022-02-10 15:35 ` [PATCH 00/20] libsas and pm8001 fixes John Garry
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

Avoid repeatedly declaring "struct task_status_struct *ts" to handle
error cases by declaring this variable for the entire function scope.
This allows simplifying the error cases, and together with the addition
of blank lines make the code more readable.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_sas.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index 711eaf81f546..929d3d3c9d3b 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -383,54 +383,53 @@ static int pm8001_task_exec(struct sas_task *task,
 	struct pm8001_device *pm8001_dev;
 	struct pm8001_port *port = NULL;
 	struct sas_task *t = task;
+	struct task_status_struct *ts = &t->task_status;
 	struct pm8001_ccb_info *ccb;
 	u32 tag = 0xdeadbeef, rc = 0, n_elem = 0;
 	unsigned long flags = 0;
 	enum sas_protocol task_proto = t->task_proto;
 
 	if (!dev->port) {
-		struct task_status_struct *tsm = &t->task_status;
-		tsm->resp = SAS_TASK_UNDELIVERED;
-		tsm->stat = SAS_PHY_DOWN;
+		ts->resp = SAS_TASK_UNDELIVERED;
+		ts->stat = SAS_PHY_DOWN;
 		if (dev->dev_type != SAS_SATA_DEV)
 			t->task_done(t);
 		return 0;
 	}
+
 	pm8001_ha = pm8001_find_ha_by_dev(task->dev);
 	if (pm8001_ha->controller_fatal_error) {
-		struct task_status_struct *ts = &t->task_status;
-
 		ts->resp = SAS_TASK_UNDELIVERED;
 		t->task_done(t);
 		return 0;
 	}
+
 	pm8001_dbg(pm8001_ha, IO, "pm8001_task_exec device\n");
+
 	spin_lock_irqsave(&pm8001_ha->lock, flags);
+
 	do {
 		dev = t->dev;
 		pm8001_dev = dev->lldd_dev;
 		port = &pm8001_ha->port[sas_find_local_port_id(dev)];
+
 		if (DEV_IS_GONE(pm8001_dev) || !port->port_attached) {
+			ts->resp = SAS_TASK_UNDELIVERED;
+			ts->stat = SAS_PHY_DOWN;
 			if (sas_protocol_ata(task_proto)) {
-				struct task_status_struct *ts = &t->task_status;
-				ts->resp = SAS_TASK_UNDELIVERED;
-				ts->stat = SAS_PHY_DOWN;
-
 				spin_unlock_irqrestore(&pm8001_ha->lock, flags);
 				t->task_done(t);
 				spin_lock_irqsave(&pm8001_ha->lock, flags);
-				continue;
 			} else {
-				struct task_status_struct *ts = &t->task_status;
-				ts->resp = SAS_TASK_UNDELIVERED;
-				ts->stat = SAS_PHY_DOWN;
 				t->task_done(t);
-				continue;
 			}
+			continue;
 		}
+
 		rc = pm8001_tag_alloc(pm8001_ha, &tag);
 		if (rc)
 			goto err_out;
+
 		ccb = &pm8001_ha->ccb_info[tag];
 
 		if (!sas_protocol_ata(task_proto)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 20/20] scsi: pm8001: fix abort all task initialization
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (18 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 19/20] scsi: pm8001: cleanup pm8001_queue_command() Damien Le Moal
@ 2022-02-10 11:42 ` Damien Le Moal
  2022-02-10 14:28   ` John Garry
  2022-02-10 15:35 ` [PATCH 00/20] libsas and pm8001 fixes John Garry
  20 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 11:42 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

In pm80xx_send_abort_all(), the n_elem field of the ccb used is not
initialized to 0. This missing initialization sometimes lead to the
task completion path seeing the ccb with a non-zero n_elem resulting in
the execution of invalid dma_unmap_sg() calls in pm8001_ccb_task_free(),
causing a crash such as:

[  197.676341] RIP: 0010:iommu_dma_unmap_sg+0x6d/0x280
[  197.700204] RSP: 0018:ffff889bbcf89c88 EFLAGS: 00010012
[  197.705485] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff83d0bda0
[  197.712687] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88810dffc0d0
[  197.719887] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8881c790098b
[  197.727089] R10: ffffed1038f20131 R11: 0000000000000001 R12: 0000000000000000
[  197.734296] R13: ffff88810dffc0d0 R14: 0000000000000010 R15: 0000000000000000
[  197.741493] FS:  0000000000000000(0000) GS:ffff889bbcf80000(0000) knlGS:0000000000000000
[  197.749659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.755459] CR2: 00007f16c1b42734 CR3: 0000000004814000 CR4: 0000000000350ee0
[  197.762656] Call Trace:
[  197.765127]  <IRQ>
[  197.767162]  pm8001_ccb_task_free+0x5f1/0x820 [pm80xx]
[  197.772364]  ? do_raw_spin_unlock+0x54/0x220
[  197.776680]  pm8001_mpi_task_abort_resp+0x2ce/0x4f0 [pm80xx]
[  197.782406]  process_oq+0xe85/0x7890 [pm80xx]
[  197.786817]  ? lock_acquire+0x194/0x490
[  197.790697]  ? handle_irq_event+0x10e/0x1b0
[  197.794920]  ? mpi_sata_completion+0x2d70/0x2d70 [pm80xx]
[  197.800378]  ? __wake_up_bit+0x100/0x100
[  197.804340]  ? lock_is_held_type+0x98/0x110
[  197.808565]  pm80xx_chip_isr+0x94/0x130 [pm80xx]
[  197.813243]  tasklet_action_common.constprop.0+0x24b/0x2f0
[  197.818785]  __do_softirq+0x1b5/0x82d
[  197.822485]  ? do_raw_spin_unlock+0x54/0x220
[  197.826799]  __irq_exit_rcu+0x17e/0x1e0
[  197.830678]  irq_exit_rcu+0xa/0x20
[  197.834114]  common_interrupt+0x78/0x90
[  197.840051]  </IRQ>
[  197.844236]  <TASK>
[  197.848397]  asm_common_interrupt+0x1e/0x40

Avoid this issue by always initializing the ccb n_elem field to 0 in
pm8001_send_abort_all(), pm8001_send_read_log() and
pm80xx_send_abort_all().

Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c | 2 ++
 drivers/scsi/pm8001/pm80xx_hwi.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index 8095eb0b04f7..d853e8d0195a 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -1788,6 +1788,7 @@ static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
 	ccb->device = pm8001_ha_dev;
 	ccb->ccb_tag = ccb_tag;
 	ccb->task = task;
+	ccb->n_elem = 0;
 
 	circularQ = &pm8001_ha->inbnd_q_tbl[0];
 
@@ -1849,6 +1850,7 @@ static void pm8001_send_read_log(struct pm8001_hba_info *pm8001_ha,
 	ccb->device = pm8001_ha_dev;
 	ccb->ccb_tag = ccb_tag;
 	ccb->task = task;
+	ccb->n_elem = 0;
 	pm8001_ha_dev->id |= NCQ_READ_LOG_FLAG;
 	pm8001_ha_dev->id |= NCQ_2ND_RLE_FLAG;
 
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 4d88c0dbcefc..902af4eefa26 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1801,6 +1801,7 @@ static void pm80xx_send_abort_all(struct pm8001_hba_info *pm8001_ha,
 	ccb->device = pm8001_ha_dev;
 	ccb->ccb_tag = ccb_tag;
 	ccb->task = task;
+	ccb->n_elem = 0;
 
 	circularQ = &pm8001_ha->inbnd_q_tbl[0];
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/20] scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue()
  2022-02-10 11:42 ` [PATCH 03/20] scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue() Damien Le Moal
@ 2022-02-10 11:48   ` John Garry
  0 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2022-02-10 11:48 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 10/02/2022 11:42, Damien Le Moal wrote:
> sas_task_alloc() sets the state flag SAS_TASK_STATE_PENDING for any
> new task allocated. So there is no need to set this flag again in
> sas_ata_qc_issue() after the task allocation.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

I think that my colleague Xiang Chen had the same patch queued locally, 
but yours got to the list first..

Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>   drivers/scsi/libsas/sas_ata.c | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
> index 99549862c9c7..a03a841921db 100644
> --- a/drivers/scsi/libsas/sas_ata.c
> +++ b/drivers/scsi/libsas/sas_ata.c
> @@ -204,7 +204,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>   	}
>   	task->scatter = qc->sg;
>   	task->ata_task.retry_count = 1;
> -	task->task_state_flags = SAS_TASK_STATE_PENDING;
>   	qc->lldd_task = task;
>   
>   	task->ata_task.use_ncq = ata_is_ncq(qc->tf.protocol);


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume()
  2022-02-10 11:42 ` [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume() Damien Le Moal
@ 2022-02-10 12:04   ` John Garry
  2022-02-10 12:13     ` Damien Le Moal
  0 siblings, 1 reply; 43+ messages in thread
From: John Garry @ 2022-02-10 12:04 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 10/02/2022 11:42, Damien Le Moal wrote:
> In pm8001_pci_resume(), the use of the u32 type for the local variable
> device_state causes a sparse warning:
> 
> warning: incorrect type in assignment (different base types)
>      expected unsigned int [usertype] device_state
>      got restricted pci_power_t [usertype] current_state
> 
> Since this variable is used only once in the function, remove it and
> use pdev->current_state directly. While at it, also add a blank line
> after the last local variable declaration.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

Regardless of a couple of comments:
Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>   drivers/scsi/pm8001/pm8001_init.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
> index d8a2121cb8d9..4b9a26f008a9 100644
> --- a/drivers/scsi/pm8001/pm8001_init.c
> +++ b/drivers/scsi/pm8001/pm8001_init.c
> @@ -1335,13 +1335,13 @@ static int __maybe_unused pm8001_pci_resume(struct device *dev)
>   	struct pm8001_hba_info *pm8001_ha;
>   	int rc;
>   	u8 i = 0, j;
> -	u32 device_state;
>   	DECLARE_COMPLETION_ONSTACK(completion);
> +
>   	pm8001_ha = sha->lldd_ha;
> -	device_state = pdev->current_state;
>   
> -	pm8001_info(pm8001_ha, "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",
> -		      pdev, pm8001_ha->name, device_state);
> +	pm8001_info(pm8001_ha,
> +		    "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",

I think that we may put this on the same line as pm8001_info

Feel free to ignore this: if we're ok with changing logs, I am not sure 
on the "slot" value - it is already printed with pm8001_info. And 
printing pdev is suspect, since we should really be using dev_info or 
pci_info() and friends - but that is a bigger job.

Thanks,
John

> +		    pdev, pm8001_ha->name, pdev->current_state);
>   
>   	rc = pci_go_44(pdev);
>   	if (rc)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume()
  2022-02-10 12:04   ` John Garry
@ 2022-02-10 12:13     ` Damien Le Moal
  0 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 12:13 UTC (permalink / raw)
  To: John Garry, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 2/10/22 21:04, John Garry wrote:
> On 10/02/2022 11:42, Damien Le Moal wrote:
>> In pm8001_pci_resume(), the use of the u32 type for the local variable
>> device_state causes a sparse warning:
>>
>> warning: incorrect type in assignment (different base types)
>>      expected unsigned int [usertype] device_state
>>      got restricted pci_power_t [usertype] current_state
>>
>> Since this variable is used only once in the function, remove it and
>> use pdev->current_state directly. While at it, also add a blank line
>> after the last local variable declaration.
>>
>> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> 
> Regardless of a couple of comments:
> Reviewed-by: John Garry <john.garry@huawei.com>
> 
>> ---
>>   drivers/scsi/pm8001/pm8001_init.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
>> index d8a2121cb8d9..4b9a26f008a9 100644
>> --- a/drivers/scsi/pm8001/pm8001_init.c
>> +++ b/drivers/scsi/pm8001/pm8001_init.c
>> @@ -1335,13 +1335,13 @@ static int __maybe_unused pm8001_pci_resume(struct device *dev)
>>   	struct pm8001_hba_info *pm8001_ha;
>>   	int rc;
>>   	u8 i = 0, j;
>> -	u32 device_state;
>>   	DECLARE_COMPLETION_ONSTACK(completion);
>> +
>>   	pm8001_ha = sha->lldd_ha;
>> -	device_state = pdev->current_state;
>>   
>> -	pm8001_info(pm8001_ha, "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",
>> -		      pdev, pm8001_ha->name, device_state);
>> +	pm8001_info(pm8001_ha,
>> +		    "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",
> 
> I think that we may put this on the same line as pm8001_info
> 
> Feel free to ignore this: if we're ok with changing logs, I am not sure 
> on the "slot" value - it is already printed with pm8001_info. And 
> printing pdev is suspect, since we should really be using dev_info or 
> pci_info() and friends - but that is a bigger job.

Yeah... This driver debug messages are a nightmare: hard to read format
and not-so-useful information. Not to mention that the default log level
is way too verbose... We should revisit the messages once we have
flushed out all know bugs :)

Thanks for the review.


> 
> Thanks,
> John
> 
>> +		    pdev, pm8001_ha->name, pdev->current_state);
>>   
>>   	rc = pci_go_44(pdev);
>>   	if (rc)
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 20/20] scsi: pm8001: fix abort all task initialization
  2022-02-10 11:42 ` [PATCH 20/20] scsi: pm8001: fix abort all task initialization Damien Le Moal
@ 2022-02-10 14:28   ` John Garry
  2022-02-10 22:43     ` Damien Le Moal
  0 siblings, 1 reply; 43+ messages in thread
From: John Garry @ 2022-02-10 14:28 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 10/02/2022 11:42, Damien Le Moal wrote:
> In pm80xx_send_abort_all(), the n_elem field of the ccb used is not
> initialized to 0. This missing initialization sometimes lead to the
> task completion path seeing the ccb with a non-zero n_elem resulting in
> the execution of invalid dma_unmap_sg() calls in pm8001_ccb_task_free(),
> causing a crash such as:
> 
> [  197.676341] RIP: 0010:iommu_dma_unmap_sg+0x6d/0x280
> [  197.700204] RSP: 0018:ffff889bbcf89c88 EFLAGS: 00010012
> [  197.705485] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff83d0bda0
> [  197.712687] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88810dffc0d0
> [  197.719887] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8881c790098b
> [  197.727089] R10: ffffed1038f20131 R11: 0000000000000001 R12: 0000000000000000
> [  197.734296] R13: ffff88810dffc0d0 R14: 0000000000000010 R15: 0000000000000000
> [  197.741493] FS:  0000000000000000(0000) GS:ffff889bbcf80000(0000) knlGS:0000000000000000
> [  197.749659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  197.755459] CR2: 00007f16c1b42734 CR3: 0000000004814000 CR4: 0000000000350ee0
> [  197.762656] Call Trace:
> [  197.765127]  <IRQ>
> [  197.767162]  pm8001_ccb_task_free+0x5f1/0x820 [pm80xx]
> [  197.772364]  ? do_raw_spin_unlock+0x54/0x220
> [  197.776680]  pm8001_mpi_task_abort_resp+0x2ce/0x4f0 [pm80xx]
> [  197.782406]  process_oq+0xe85/0x7890 [pm80xx]
> [  197.786817]  ? lock_acquire+0x194/0x490
> [  197.790697]  ? handle_irq_event+0x10e/0x1b0
> [  197.794920]  ? mpi_sata_completion+0x2d70/0x2d70 [pm80xx]
> [  197.800378]  ? __wake_up_bit+0x100/0x100
> [  197.804340]  ? lock_is_held_type+0x98/0x110
> [  197.808565]  pm80xx_chip_isr+0x94/0x130 [pm80xx]
> [  197.813243]  tasklet_action_common.constprop.0+0x24b/0x2f0
> [  197.818785]  __do_softirq+0x1b5/0x82d
> [  197.822485]  ? do_raw_spin_unlock+0x54/0x220
> [  197.826799]  __irq_exit_rcu+0x17e/0x1e0
> [  197.830678]  irq_exit_rcu+0xa/0x20
> [  197.834114]  common_interrupt+0x78/0x90
> [  197.840051]  </IRQ>
> [  197.844236]  <TASK>
> [  197.848397]  asm_common_interrupt+0x1e/0x40
> 

That's nasty.

> Avoid this issue by always initializing the ccb n_elem field to 0 in
> pm8001_send_abort_all(), pm8001_send_read_log() and
> pm80xx_send_abort_all().
> 
> Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> ---
>   drivers/scsi/pm8001/pm8001_hwi.c | 2 ++
>   drivers/scsi/pm8001/pm80xx_hwi.c | 1 +
>   2 files changed, 3 insertions(+)
> 
> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
> index 8095eb0b04f7..d853e8d0195a 100644
> --- a/drivers/scsi/pm8001/pm8001_hwi.c
> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
> @@ -1788,6 +1788,7 @@ static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
>   	ccb->device = pm8001_ha_dev;
>   	ccb->ccb_tag = ccb_tag;
>   	ccb->task = task;
> +	ccb->n_elem = 0;

Do you think that it would be better to clear this field when we free 
the tag/ccb in pm8001_ccb_task_free()? I will note that there are error 
paths whch only free the tag (not ccb), so need to be careful there.

BTW, I see that this never landed:
https://lore.kernel.org/lkml/20211214090337.29156-1-niejianglei2021@163.com/

Though alloc'ing a domain_device in pm8001_send_read_log() is questionable.

Thanks,
John

>   
>   	circularQ = &pm8001_ha->inbnd_q_tbl[0];
>   
> @@ -1849,6 +1850,7 @@ static void pm8001_send_read_log(struct pm8001_hba_info *pm8001_ha,
>   	ccb->device = pm8001_ha_dev;
>   	ccb->ccb_tag = ccb_tag;
>   	ccb->task = task;
> +	ccb->n_elem = 0;
>   	pm8001_ha_dev->id |= NCQ_READ_LOG_FLAG;
>   	pm8001_ha_dev->id |= NCQ_2ND_RLE_FLAG;
>   
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 4d88c0dbcefc..902af4eefa26 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -1801,6 +1801,7 @@ static void pm80xx_send_abort_all(struct pm8001_hba_info *pm8001_ha,
>   	ccb->device = pm8001_ha_dev;
>   	ccb->ccb_tag = ccb_tag;
>   	ccb->task = task;
> +	ccb->n_elem = 0;
>   
>   	circularQ = &pm8001_ha->inbnd_q_tbl[0];
>   


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/20] scsi: pm8001: Fix command initialization in pm80XX_send_read_log()
  2022-02-10 11:42 ` [PATCH 07/20] scsi: pm8001: Fix command initialization in pm80XX_send_read_log() Damien Le Moal
@ 2022-02-10 14:32   ` John Garry
  0 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2022-02-10 14:32 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 10/02/2022 11:42, Damien Le Moal wrote:
> Since the sata_cmd struct is zeroed out before its fields are
> initialized, there is no need for using "|=" to initialize the
> ncqtag_atap_dir_m field. Using a standard assignment removes the sparse
> warning:
> 
> warning: invalid assignment: |=
> 
> Also, since the ncqtag_atap_dir_m field has type __le32, use
> cpu_to_le32() to generate the assigned value.
> 
> Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
> Signed-off-by: Damien Le Moal<damien.lemoal@opensource.wdc.com>

Reviewed-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy()
  2022-02-10 11:42 ` [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy() Damien Le Moal
@ 2022-02-10 14:42   ` John Garry
  2022-02-11  6:14   ` Christoph Hellwig
  1 sibling, 0 replies; 43+ messages in thread
From: John Garry @ 2022-02-10 14:42 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 10/02/2022 11:42, Damien Le Moal wrote:
> Declare the local variable destination1 as a pointer to a __le32 value
> rather than a u32. This suppresses the sparse warning:
> 
> warning: incorrect type in assignment (different base types)
>      expected unsigned int [usertype]
>      got restricted __le32 [usertype]
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>   drivers/scsi/pm8001/pm80xx_hwi.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index ec6b970e05a1..37ede7c79e85 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -71,14 +71,13 @@ static void pm80xx_pci_mem_copy(struct pm8001_hba_info  *pm8001_ha, u32 soffset,
>   				u32 dw_count, u32 bus_base_number)
>   {
>   	u32 index, value, offset;
> -	u32 *destination1;
> -	destination1 = (u32 *)destination;
> +	__le32 *destination1 = (__le32 *)destination;
>   
>   	for (index = 0; index < dw_count; index += 4, destination1++) {
>   		offset = (soffset + index);
>   		if (offset < (64 * 1024)) {
>   			value = pm8001_cr32(pm8001_ha, bus_base_number, offset);
> -			*destination1 =  cpu_to_le32(value);
> +			*destination1 = cpu_to_le32(value);

I can't help but wonder if there is already something to do this, like 
memcpy_fromio()

>   		}
>   	}
>   	return;


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/20] scsi: pm8001: fix payload initialization in pm80xx_set_thermal_config()
  2022-02-10 11:42 ` [PATCH 10/20] scsi: pm8001: fix payload initialization in pm80xx_set_thermal_config() Damien Le Moal
@ 2022-02-10 14:43   ` John Garry
  0 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2022-02-10 14:43 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 10/02/2022 11:42, Damien Le Moal wrote:
> The fields of the set_ctrl_cfg_req structure have the __le32 type, so
> use cpu_to_le32() to assign them. This removes the sparse warnings:
> 
> warning: incorrect type in assignment (different base types)
>      expected restricted __le32
>      got unsigned int
> 
> Fixes: 842784e0d15b ("pm80xx: Update For Thermal Page Code")
> Fixes: f5860992db55 ("[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files")
> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>


Reviewed-by: John Garry <john.garry@huawei.com>


> ---
>   drivers/scsi/pm8001/pm80xx_hwi.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 37ede7c79e85..9c7383fb4bdc 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -1202,9 +1202,11 @@ pm80xx_set_thermal_config(struct pm8001_hba_info *pm8001_ha)
>   	else
>   		page_code = THERMAL_PAGE_CODE_8H;
>   
> -	payload.cfg_pg[0] = (THERMAL_LOG_ENABLE << 9) |
> -				(THERMAL_ENABLE << 8) | page_code;
> -	payload.cfg_pg[1] = (LTEMPHIL << 24) | (RTEMPHIL << 8);
> +	payload.cfg_pg[0] =
> +		cpu_to_le32((THERMAL_LOG_ENABLE << 9) |
> +			    (THERMAL_ENABLE << 8) | page_code);
> +	payload.cfg_pg[1] =
> +		cpu_to_le32((LTEMPHIL << 24) | (RTEMPHIL << 8));
>   
>   	pm8001_dbg(pm8001_ha, DEV,
>   		   "Setting up thermal config. cfg_pg 0 0x%x cfg_pg 1 0x%x\n",


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 16/20] scsi: pm8001: simplify pm8001_get_ncq_tag()
  2022-02-10 11:42 ` [PATCH 16/20] scsi: pm8001: simplify pm8001_get_ncq_tag() Damien Le Moal
@ 2022-02-10 14:50   ` John Garry
  0 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2022-02-10 14:50 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 10/02/2022 11:42, Damien Le Moal wrote:
> To detect if a command is NCQ, there is no need to test all possible NCQ
> command codes. Instead, use ata_is_ncq() to test the command protocol.
> 

task->ata_task.use_ncq holds ata_is_ncq() and that is checked before 
calling pm8001_get_ncq_tag() in both callsites, so I doubt the point in 
the caller pre-check, so maybe that can be removed

> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>   drivers/scsi/pm8001/pm8001_sas.c | 14 +++++---------
>   1 file changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
> index 6805c7f43e41..711eaf81f546 100644
> --- a/drivers/scsi/pm8001/pm8001_sas.c
> +++ b/drivers/scsi/pm8001/pm8001_sas.c
> @@ -306,16 +306,12 @@ static int pm8001_task_prep_smp(struct pm8001_hba_info *pm8001_ha,
>   u32 pm8001_get_ncq_tag(struct sas_task *task, u32 *tag)
>   {
>   	struct ata_queued_cmd *qc = task->uldd_task;
> -	if (qc) {
> -		if (qc->tf.command == ATA_CMD_FPDMA_WRITE ||
> -		    qc->tf.command == ATA_CMD_FPDMA_READ ||
> -		    qc->tf.command == ATA_CMD_FPDMA_RECV ||
> -		    qc->tf.command == ATA_CMD_FPDMA_SEND ||
> -		    qc->tf.command == ATA_CMD_NCQ_NON_DATA) {
> -			*tag = qc->tag;
> -			return 1;
> -		}
> +
> +	if (qc && ata_is_ncq(qc->tf.protocol)) {
> +		*tag = qc->tag;
> +		return 1;
>   	}
> +
>   	return 0;
>   }
>   


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 19/20] scsi: pm8001: cleanup pm8001_queue_command()
  2022-02-10 11:42 ` [PATCH 19/20] scsi: pm8001: cleanup pm8001_queue_command() Damien Le Moal
@ 2022-02-10 14:53   ` John Garry
  0 siblings, 0 replies; 43+ messages in thread
From: John Garry @ 2022-02-10 14:53 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 10/02/2022 11:42, Damien Le Moal wrote:
> Avoid repeatedly declaring "struct task_status_struct *ts" to handle
> error cases by declaring this variable for the entire function scope.
> This allows simplifying the error cases, and together with the addition
> of blank lines make the code more readable.
> 
> Signed-off-by: Damien Le Moal<damien.lemoal@opensource.wdc.com>
> ---

Reviewed-by: John Garry <john.garry@huawei.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
                   ` (19 preceding siblings ...)
  2022-02-10 11:42 ` [PATCH 20/20] scsi: pm8001: fix abort all task initialization Damien Le Moal
@ 2022-02-10 15:35 ` John Garry
  2022-02-10 22:44   ` Damien Le Moal
  20 siblings, 1 reply; 43+ messages in thread
From: John Garry @ 2022-02-10 15:35 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 10/02/2022 11:41, Damien Le Moal wrote:
> The first 3 patches fix a problem with libsas handling of NCQ NON DATA
> commands and simplify libsas code in a couple of places.
> The remaining patches are fixes for the pm8001 driver:
> * All sparse warnings are addressed, fixing along the way many le32
>    handling bugs for big-endian architectures
> * Fix handling of NCQ NON DATA commands
> * Fix NCQ error recovery (abort all task execution) that was causing a
>    crash
> * Simplify the code in many places
> 
> With these fixes, libzbc test suite passes all test case. This test
> suite was used with an SMR drive for testing because it generates many
> NCQ NON DATA commands (for zone management commands) and also generates
> many NCQ command errors to check ASC/ASCQ returned by the device. With
> the test suite, the error recovery path was extensively exercised.
> 
> Note that without these patches, libzbc test suite result in the
> controller hanging, or in kernel crashes.

Unfortunately I still see the hang on my arm64 system with this series :(

Thanks,
John

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 20/20] scsi: pm8001: fix abort all task initialization
  2022-02-10 14:28   ` John Garry
@ 2022-02-10 22:43     ` Damien Le Moal
  0 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 22:43 UTC (permalink / raw)
  To: John Garry, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 2/10/22 23:28, John Garry wrote:
> On 10/02/2022 11:42, Damien Le Moal wrote:
>> In pm80xx_send_abort_all(), the n_elem field of the ccb used is not
>> initialized to 0. This missing initialization sometimes lead to the
>> task completion path seeing the ccb with a non-zero n_elem resulting in
>> the execution of invalid dma_unmap_sg() calls in pm8001_ccb_task_free(),
>> causing a crash such as:
>>
>> [  197.676341] RIP: 0010:iommu_dma_unmap_sg+0x6d/0x280
>> [  197.700204] RSP: 0018:ffff889bbcf89c88 EFLAGS: 00010012
>> [  197.705485] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff83d0bda0
>> [  197.712687] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88810dffc0d0
>> [  197.719887] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8881c790098b
>> [  197.727089] R10: ffffed1038f20131 R11: 0000000000000001 R12: 0000000000000000
>> [  197.734296] R13: ffff88810dffc0d0 R14: 0000000000000010 R15: 0000000000000000
>> [  197.741493] FS:  0000000000000000(0000) GS:ffff889bbcf80000(0000) knlGS:0000000000000000
>> [  197.749659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  197.755459] CR2: 00007f16c1b42734 CR3: 0000000004814000 CR4: 0000000000350ee0
>> [  197.762656] Call Trace:
>> [  197.765127]  <IRQ>
>> [  197.767162]  pm8001_ccb_task_free+0x5f1/0x820 [pm80xx]
>> [  197.772364]  ? do_raw_spin_unlock+0x54/0x220
>> [  197.776680]  pm8001_mpi_task_abort_resp+0x2ce/0x4f0 [pm80xx]
>> [  197.782406]  process_oq+0xe85/0x7890 [pm80xx]
>> [  197.786817]  ? lock_acquire+0x194/0x490
>> [  197.790697]  ? handle_irq_event+0x10e/0x1b0
>> [  197.794920]  ? mpi_sata_completion+0x2d70/0x2d70 [pm80xx]
>> [  197.800378]  ? __wake_up_bit+0x100/0x100
>> [  197.804340]  ? lock_is_held_type+0x98/0x110
>> [  197.808565]  pm80xx_chip_isr+0x94/0x130 [pm80xx]
>> [  197.813243]  tasklet_action_common.constprop.0+0x24b/0x2f0
>> [  197.818785]  __do_softirq+0x1b5/0x82d
>> [  197.822485]  ? do_raw_spin_unlock+0x54/0x220
>> [  197.826799]  __irq_exit_rcu+0x17e/0x1e0
>> [  197.830678]  irq_exit_rcu+0xa/0x20
>> [  197.834114]  common_interrupt+0x78/0x90
>> [  197.840051]  </IRQ>
>> [  197.844236]  <TASK>
>> [  197.848397]  asm_common_interrupt+0x1e/0x40
>>
> 
> That's nasty.
> 
>> Avoid this issue by always initializing the ccb n_elem field to 0 in
>> pm8001_send_abort_all(), pm8001_send_read_log() and
>> pm80xx_send_abort_all().
>>
>> Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
>> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
>> ---
>>   drivers/scsi/pm8001/pm8001_hwi.c | 2 ++
>>   drivers/scsi/pm8001/pm80xx_hwi.c | 1 +
>>   2 files changed, 3 insertions(+)
>>
>> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
>> index 8095eb0b04f7..d853e8d0195a 100644
>> --- a/drivers/scsi/pm8001/pm8001_hwi.c
>> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
>> @@ -1788,6 +1788,7 @@ static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
>>   	ccb->device = pm8001_ha_dev;
>>   	ccb->ccb_tag = ccb_tag;
>>   	ccb->task = task;
>> +	ccb->n_elem = 0;
> 
> Do you think that it would be better to clear this field when we free 
> the tag/ccb in pm8001_ccb_task_free()? I will note that there are error 
> paths whch only free the tag (not ccb), so need to be careful there.

I am thinking that hunk like this one:

        ccb = &pm8001_ha->ccb_info[ccb_tag];

        ccb->device = pm8001_ha_dev;

        ccb->ccb_tag = ccb_tag;

        ccb->task = task;

To initialize a new ccb should be wrapped into a helper function, which
would also add initialization for the other ccb fields. There are many
places that have code like this, so that will also nicely cleanup the
code. Then the free path can be left alone. Hmm ?

> 
> BTW, I see that this never landed:
> https://lore.kernel.org/lkml/20211214090337.29156-1-niejianglei2021@163.com/

Repost !

> 
> Though alloc'ing a domain_device in pm8001_send_read_log() is questionable.

Yes, and the messages say "device not found" when the task completes so
I think it is 100% useless. But I did not touch that since it did not
seem to cause troubles.

> 
> Thanks,
> John
> 
>>   
>>   	circularQ = &pm8001_ha->inbnd_q_tbl[0];
>>   
>> @@ -1849,6 +1850,7 @@ static void pm8001_send_read_log(struct pm8001_hba_info *pm8001_ha,
>>   	ccb->device = pm8001_ha_dev;
>>   	ccb->ccb_tag = ccb_tag;
>>   	ccb->task = task;
>> +	ccb->n_elem = 0;
>>   	pm8001_ha_dev->id |= NCQ_READ_LOG_FLAG;
>>   	pm8001_ha_dev->id |= NCQ_2ND_RLE_FLAG;
>>   
>> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
>> index 4d88c0dbcefc..902af4eefa26 100644
>> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
>> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
>> @@ -1801,6 +1801,7 @@ static void pm80xx_send_abort_all(struct pm8001_hba_info *pm8001_ha,
>>   	ccb->device = pm8001_ha_dev;
>>   	ccb->ccb_tag = ccb_tag;
>>   	ccb->task = task;
>> +	ccb->n_elem = 0;
>>   
>>   	circularQ = &pm8001_ha->inbnd_q_tbl[0];
>>   
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-10 15:35 ` [PATCH 00/20] libsas and pm8001 fixes John Garry
@ 2022-02-10 22:44   ` Damien Le Moal
  2022-02-11  9:24     ` John Garry
  0 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-10 22:44 UTC (permalink / raw)
  To: John Garry, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 2/11/22 00:35, John Garry wrote:
> On 10/02/2022 11:41, Damien Le Moal wrote:
>> The first 3 patches fix a problem with libsas handling of NCQ NON DATA
>> commands and simplify libsas code in a couple of places.
>> The remaining patches are fixes for the pm8001 driver:
>> * All sparse warnings are addressed, fixing along the way many le32
>>    handling bugs for big-endian architectures
>> * Fix handling of NCQ NON DATA commands
>> * Fix NCQ error recovery (abort all task execution) that was causing a
>>    crash
>> * Simplify the code in many places
>>
>> With these fixes, libzbc test suite passes all test case. This test
>> suite was used with an SMR drive for testing because it generates many
>> NCQ NON DATA commands (for zone management commands) and also generates
>> many NCQ command errors to check ASC/ASCQ returned by the device. With
>> the test suite, the error recovery path was extensively exercised.
>>
>> Note that without these patches, libzbc test suite result in the
>> controller hanging, or in kernel crashes.
> 
> Unfortunately I still see the hang on my arm64 system with this series :(

That is unfortunate. Any particular command sequence triggering the hang
? Or is it random ? What workload are you running ?

> 
> Thanks,
> John


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control()
  2022-02-10 11:42 ` [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control() Damien Le Moal
@ 2022-02-11  6:11   ` Christoph Hellwig
  2022-02-11  7:03     ` Damien Le Moal
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2022-02-11  6:11 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

On Thu, Feb 10, 2022 at 08:42:02PM +0900, Damien Le Moal wrote:
>  			struct sas_phy *phy = sas_phy->phy;
> -			uint32_t *qp = (uint32_t *)(((char *)
> -				pm8001_ha->io_mem[2].memvirtaddr)
> -				+ 0x1034 + (0x4000 * (phy_id & 3)));
> +			unsigned long vaddr = (unsigned long)
> +				pm8001_ha->io_mem[2].memvirtaddr;
> +			uint32_t *qp = (uint32_t *)
> +				(vaddr + 0x1034 + (0x4000 * (phy_id & 3)));

This just removes the warning, but does not actually fix the issue.
Both long_vaddr and qp need to be __iomem pointers...

>  
>  			phy->invalid_dword_count = qp[0];
>  			phy->running_disparity_error_count = qp[1];

... and reads from qp need to use readl.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy()
  2022-02-10 11:42 ` [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy() Damien Le Moal
  2022-02-10 14:42   ` John Garry
@ 2022-02-11  6:14   ` Christoph Hellwig
  2022-02-11  7:18     ` Damien Le Moal
  1 sibling, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2022-02-11  6:14 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

On Thu, Feb 10, 2022 at 08:42:06PM +0900, Damien Le Moal wrote:
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -71,14 +71,13 @@ static void pm80xx_pci_mem_copy(struct pm8001_hba_info  *pm8001_ha, u32 soffset,
>  				u32 dw_count, u32 bus_base_number)
>  {
>  	u32 index, value, offset;
> -	u32 *destination1;
> -	destination1 = (u32 *)destination;
> +	__le32 *destination1 = (__le32 *)destination;

I think the right fix here is to declare the destination argument as a
le32 pointer without the incorrect const attribute.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control()
  2022-02-11  6:11   ` Christoph Hellwig
@ 2022-02-11  7:03     ` Damien Le Moal
  0 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-11  7:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 2/11/22 15:11, Christoph Hellwig wrote:
> On Thu, Feb 10, 2022 at 08:42:02PM +0900, Damien Le Moal wrote:
>>  			struct sas_phy *phy = sas_phy->phy;
>> -			uint32_t *qp = (uint32_t *)(((char *)
>> -				pm8001_ha->io_mem[2].memvirtaddr)
>> -				+ 0x1034 + (0x4000 * (phy_id & 3)));
>> +			unsigned long vaddr = (unsigned long)
>> +				pm8001_ha->io_mem[2].memvirtaddr;
>> +			uint32_t *qp = (uint32_t *)
>> +				(vaddr + 0x1034 + (0x4000 * (phy_id & 3)));
> 
> This just removes the warning, but does not actually fix the issue.
> Both long_vaddr and qp need to be __iomem pointers...
> 
>>  
>>  			phy->invalid_dword_count = qp[0];
>>  			phy->running_disparity_error_count = qp[1];
> 
> ... and reads from qp need to use readl.

OK. About to send a v2 with more fixes. Will rework this one and patch 8
too.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy()
  2022-02-11  6:14   ` Christoph Hellwig
@ 2022-02-11  7:18     ` Damien Le Moal
  0 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-11  7:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-scsi, Martin K . Petersen, John Garry, Xiang Chen,
	Jason Yan, Luo Jiaxing

On 2/11/22 15:14, Christoph Hellwig wrote:
> On Thu, Feb 10, 2022 at 08:42:06PM +0900, Damien Le Moal wrote:
>> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
>> @@ -71,14 +71,13 @@ static void pm80xx_pci_mem_copy(struct pm8001_hba_info  *pm8001_ha, u32 soffset,
>>  				u32 dw_count, u32 bus_base_number)
>>  {
>>  	u32 index, value, offset;
>> -	u32 *destination1;
>> -	destination1 = (u32 *)destination;
>> +	__le32 *destination1 = (__le32 *)destination;
> 
> I think the right fix here is to declare the destination argument as a
> le32 pointer without the incorrect const attribute.

Yes. Much cleaner :)


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-10 22:44   ` Damien Le Moal
@ 2022-02-11  9:24     ` John Garry
  2022-02-11 12:37       ` Damien Le Moal
  0 siblings, 1 reply; 43+ messages in thread
From: John Garry @ 2022-02-11  9:24 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 10/02/2022 22:44, Damien Le Moal wrote:

Hi Damien,

>>> Note that without these patches, libzbc test suite result in the
>>> controller hanging, or in kernel crashes.
>> Unfortunately I still see the hang on my arm64 system with this series:(
> That is unfortunate. Any particular command sequence triggering the hang
> ? Or is it random ? What workload are you running ?
> 

mount/unmount fails mostly even after as few as one attempt, but then 
even fdisk -l fails sometimes:

root@(none)$ fdisk -l
[   97.924789] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[   97.930652] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   97.937149] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   97.943232] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   97.952020] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   97.958099] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   97.966881] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   97.972961] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   97.981737] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   97.991241] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.000752] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.010223] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.015180] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.021645] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.027725] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.036495] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.042574] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.051344] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.057423] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.066154] pm80xx: rc= -5
[   98.068854] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.075331] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.081411] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.090192] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.096271] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.105053] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.111132] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.119909] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.129414] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.138919] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.148388] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.153345] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.159812] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.165892] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.174661] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.180741] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.189511] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.195590] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.204320] pm80xx: rc= -5
[   98.207019] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.213497] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.219577] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.228359] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.234438] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.243220] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.249299] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.258075] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.267580] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.277085] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.286554] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.291510] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.297978] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.304059] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.312827] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.318906] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.327677] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.333756] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.342487] pm80xx: rc= -5
[   98.345186] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.351664] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.357743] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.366525] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.372604] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.381386] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.387465] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.396230] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.405734] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.415239] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.424709] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.429665] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.436133] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.442213] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.450982] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.457061] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.465825] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.471904] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.480629] pm80xx: rc= -5
[   98.483327] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.489800] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.495880] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.504661] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.510740] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.519523] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.525602] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.534372] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.543877] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.553382] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.562851] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.567807] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.574275] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.580355] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.589124] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.595203] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.603968] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.610047] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.618778] pm80xx: rc= -5
[   98.621505] sas: --- Exit sas_scsi_recover_host: busy: 1 failed: 1 
tries: 1

Sometimes I get TMF timeouts, which is a bad situation. I guess it's a 
subtle driver bug, but where ....?

BTW, this following log needs removal/fixing at some stage by someone:

[   98.480629] pm80xx: rc= -5

It's from pm8001_query_task().

Thanks,
John

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-11  9:24     ` John Garry
@ 2022-02-11 12:37       ` Damien Le Moal
  2022-02-11 13:08         ` John Garry
  0 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-11 12:37 UTC (permalink / raw)
  To: John Garry, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 2/11/22 18:24, John Garry wrote:
> On 10/02/2022 22:44, Damien Le Moal wrote:
> 
> Hi Damien,
> 
>>>> Note that without these patches, libzbc test suite result in the
>>>> controller hanging, or in kernel crashes.
>>> Unfortunately I still see the hang on my arm64 system with this series:(
>> That is unfortunate. Any particular command sequence triggering the hang
>> ? Or is it random ? What workload are you running ?
>>
> 
> mount/unmount fails mostly even after as few as one attempt, but then 
> even fdisk -l fails sometimes:

Try with patch 21 of my v2. It does fix a bug for scsi/sas case. That
problem would likely lead to a crash though, but never know...

> root@(none)$ fdisk -l
> [   97.924789] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> [   97.930652] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
> [   97.937149] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
> [   97.943232] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
> Failure Drive:5000c500a7babc61
[...]
> 
> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a 
> subtle driver bug, but where ....?

What is the command failing ? Always the same ? Can you try adding scsi
trace to see the commands ?

If you are "lucky", it is always the same type of command like for the
NCQ NON DATA in my case. Though on mount, I would only expect a lot of
read commands and not much else. There may be some writes and a flush
too, so there will be "data" commands and "non data" commands. It may be
an issue with non-data commands too ?

> BTW, this following log needs removal/fixing at some stage by someone:
> 
> [   98.480629] pm80xx: rc= -5
> 
> It's from pm8001_query_task().
> 
> Thanks,
> John


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-11 12:37       ` Damien Le Moal
@ 2022-02-11 13:08         ` John Garry
  2022-02-11 13:14           ` Damien Le Moal
  0 siblings, 1 reply; 43+ messages in thread
From: John Garry @ 2022-02-11 13:08 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 11/02/2022 12:37, Damien Le Moal wrote:

Hi Damien,

>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>> subtle driver bug, but where ....?
> What is the command failing ? Always the same ? Can you try adding scsi
> trace to see the commands ?

This is the same issue I have had since day #1.

Generally mount/unmount or even fdisk -l fails after booting into 
miniramfs. I wouldn't ever try to boot a distro.

> 
> If you are "lucky", it is always the same type of command like for the
> NCQ NON DATA in my case.

I'm just trying SAS disks to start with - so it's an SCSI READ command. 
SATA/STP support is generally never as robust for SAS HBAs (HW and LLD 
bugs are common - as this series is evidence) so I start on something 
more basic - however SATA/STP also has this issue.

The command is sent successfully but just never completes. Then 
sometimes the TMFs for error handling timeout and sometimes succeed. I 
don't have much to do on....

> Though on mount, I would only expect a lot of
> read commands and not much else.

Yes, and it is commonly the first SCSI read command which times out. It 
reliably breaks quite early. So I can think we can rule out issues like 
memory barriers/timing.

  There may be some writes and a flush
> too, so there will be "data" commands and "non data" commands. It may be
> an issue with non-data commands too ?
> 

Not sure on that. I guess it isn't.

Thanks,
John

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-11 13:08         ` John Garry
@ 2022-02-11 13:14           ` Damien Le Moal
  2022-02-11 13:54             ` John Garry
  0 siblings, 1 reply; 43+ messages in thread
From: Damien Le Moal @ 2022-02-11 13:14 UTC (permalink / raw)
  To: John Garry, linux-scsi, Martin K . Petersen, Xiang Chen, Jason Yan

On 2/11/22 22:08, John Garry wrote:
> On 11/02/2022 12:37, Damien Le Moal wrote:
> 
> Hi Damien,
> 
>>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>>> subtle driver bug, but where ....?
>> What is the command failing ? Always the same ? Can you try adding scsi
>> trace to see the commands ?
> 
> This is the same issue I have had since day #1.
> 
> Generally mount/unmount or even fdisk -l fails after booting into 
> miniramfs. I wouldn't ever try to boot a distro.

busybox ?

> 
>>
>> If you are "lucky", it is always the same type of command like for the
>> NCQ NON DATA in my case.
> 
> I'm just trying SAS disks to start with - so it's an SCSI READ command. 
> SATA/STP support is generally never as robust for SAS HBAs (HW and LLD 
> bugs are common - as this series is evidence) so I start on something 
> more basic - however SATA/STP also has this issue.
> 
> The command is sent successfully but just never completes. Then 
> sometimes the TMFs for error handling timeout and sometimes succeed. I 
> don't have much to do on....

No SAS bus analyzer lying in a corner of the office ? :)
That could help...

I will go to the office Monday. So I will get a chance to add SAS drives
to my setup to see what I get. I have only tested with SATA until now.
My controller is not the same chip as yours though.

> 
>> Though on mount, I would only expect a lot of
>> read commands and not much else.
> 
> Yes, and it is commonly the first SCSI read command which times out. It 
> reliably breaks quite early. So I can think we can rule out issues like 
> memory barriers/timing.
> 
>   There may be some writes and a flush
>> too, so there will be "data" commands and "non data" commands. It may be
>> an issue with non-data commands too ?
>>
> 
> Not sure on that. I guess it isn't.

Anything special with the drives you are using ? Have you tried other
drives to see if you get lucky ?

> 
> Thanks,
> John


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-11 13:14           ` Damien Le Moal
@ 2022-02-11 13:54             ` John Garry
  2022-02-12  6:19               ` Damien Le Moal
  0 siblings, 1 reply; 43+ messages in thread
From: John Garry @ 2022-02-11 13:54 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Ajish.Koshy

Hi Damien,

>>
>>>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>>>> subtle driver bug, but where ....?
>>> What is the command failing ? Always the same ? Can you try adding scsi
>>> trace to see the commands ?
>>
>> This is the same issue I have had since day #1.
>>
>> Generally mount/unmount or even fdisk -l fails after booting into
>> miniramfs. I wouldn't ever try to boot a distro.
> 
> busybox ?
> 

Yes

>>
>>>
>>> If you are "lucky", it is always the same type of command like for the
>>> NCQ NON DATA in my case.
>>
>> I'm just trying SAS disks to start with - so it's an SCSI READ command.
>> SATA/STP support is generally never as robust for SAS HBAs (HW and LLD
>> bugs are common - as this series is evidence) so I start on something
>> more basic - however SATA/STP also has this issue.
>>
>> The command is sent successfully but just never completes. Then
>> sometimes the TMFs for error handling timeout and sometimes succeed. I
>> don't have much to do on....
> 
> No SAS bus analyzer lying in a corner of the office ? :)
> That could help...

None unfortunately

> 
> I will go to the office Monday. So I will get a chance to add SAS drives
> to my setup to see what I get. I have only tested with SATA until now.
> My controller is not the same chip as yours though.

jfyi, Ajish, cc'ed, from microchip says that they see the same issue on 
their arm64 system. Unfortunately fixing it is not a priority for them. 
So it is something which arm64 is exposing.

And I tried an old kernel - like 4.10 - on the same board and the pm8001 
driver was working somewhat reliably (no hangs). It just looks like a 
driver issue.

I'll have a look at the driver code again if I get a chance. It might be 
a DMA issue.

> 
>>
>>> Though on mount, I would only expect a lot of
>>> read commands and not much else.
>>
>> Yes, and it is commonly the first SCSI read command which times out. It
>> reliably breaks quite early. So I can think we can rule out issues like
>> memory barriers/timing.
>>
>>    There may be some writes and a flush
>>> too, so there will be "data" commands and "non data" commands. It may be
>>> an issue with non-data commands too ?
>>>
>>
>> Not sure on that. I guess it isn't.
> 
> Anything special with the drives you are using ? Have you tried other
> drives to see if you get lucky ?
> 

I think that the drives are ok. On the same board I originally had them 
plugged in the hisi_sas HBA and no issues.

thanks,
john

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/20] libsas and pm8001 fixes
  2022-02-11 13:54             ` John Garry
@ 2022-02-12  6:19               ` Damien Le Moal
  0 siblings, 0 replies; 43+ messages in thread
From: Damien Le Moal @ 2022-02-12  6:19 UTC (permalink / raw)
  To: John Garry, linux-scsi, Martin K . Petersen, Xiang Chen,
	Jason Yan, Ajish.Koshy

On 2/11/22 22:54, John Garry wrote:
> Hi Damien,
> 
>>>
>>>>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>>>>> subtle driver bug, but where ....?
>>>> What is the command failing ? Always the same ? Can you try adding scsi
>>>> trace to see the commands ?
>>>
>>> This is the same issue I have had since day #1.
>>>
>>> Generally mount/unmount or even fdisk -l fails after booting into
>>> miniramfs. I wouldn't ever try to boot a distro.
>>
>> busybox ?
>>
> 
> Yes
> 
>>>
>>>>
>>>> If you are "lucky", it is always the same type of command like for the
>>>> NCQ NON DATA in my case.
>>>
>>> I'm just trying SAS disks to start with - so it's an SCSI READ command.
>>> SATA/STP support is generally never as robust for SAS HBAs (HW and LLD
>>> bugs are common - as this series is evidence) so I start on something
>>> more basic - however SATA/STP also has this issue.
>>>
>>> The command is sent successfully but just never completes. Then
>>> sometimes the TMFs for error handling timeout and sometimes succeed. I
>>> don't have much to do on....
>>
>> No SAS bus analyzer lying in a corner of the office ? :)
>> That could help...
> 
> None unfortunately
> 
>>
>> I will go to the office Monday. So I will get a chance to add SAS drives
>> to my setup to see what I get. I have only tested with SATA until now.
>> My controller is not the same chip as yours though.
> 
> jfyi, Ajish, cc'ed, from microchip says that they see the same issue on 
> their arm64 system. Unfortunately fixing it is not a priority for them. 
> So it is something which arm64 is exposing.
> 
> And I tried an old kernel - like 4.10 - on the same board and the pm8001 
> driver was working somewhat reliably (no hangs). It just looks like a 
> driver issue.
> 
> I'll have a look at the driver code again if I get a chance. It might be 
> a DMA issue.

There is one more thing that I find strange in the driver and that may
cause problems: tag 0 is a perfectly valid tag value that can be
returned by pm8001_tag_alloc() since find_first_zero_bit() will return 0
if the first bit is 0. And yet, there are many places in the driver that
treat !tag as an error. Extremely weird, if not outright broken...

I patched that and tested and everything seems OK... Could it be that
you are not seeing some completions because of that ?

I added the patch to my v3. Will send Monday.



-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2022-02-12  6:19 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-10 11:41 [PATCH 00/20] libsas and pm8001 fixes Damien Le Moal
2022-02-10 11:41 ` [PATCH 01/20] scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands Damien Le Moal
2022-02-10 11:42 ` [PATCH 02/20] scsi: libsas: simplify sas_ata_qc_issue() detection of NCQ commands Damien Le Moal
2022-02-10 11:42 ` [PATCH 03/20] scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue() Damien Le Moal
2022-02-10 11:48   ` John Garry
2022-02-10 11:42 ` [PATCH 04/20] scsi: pm8001: fix __iomem pointer use in pm8001_phy_control() Damien Le Moal
2022-02-11  6:11   ` Christoph Hellwig
2022-02-11  7:03     ` Damien Le Moal
2022-02-10 11:42 ` [PATCH 05/20] scsi: pm8001: Remove local variable in pm8001_pci_resume() Damien Le Moal
2022-02-10 12:04   ` John Garry
2022-02-10 12:13     ` Damien Le Moal
2022-02-10 11:42 ` [PATCH 06/20] scsi: pm8001: Fix pm8001_update_flash() local variable type Damien Le Moal
2022-02-10 11:42 ` [PATCH 07/20] scsi: pm8001: Fix command initialization in pm80XX_send_read_log() Damien Le Moal
2022-02-10 14:32   ` John Garry
2022-02-10 11:42 ` [PATCH 08/20] scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy() Damien Le Moal
2022-02-10 14:42   ` John Garry
2022-02-11  6:14   ` Christoph Hellwig
2022-02-11  7:18     ` Damien Le Moal
2022-02-10 11:42 ` [PATCH 09/20] scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req() Damien Le Moal
2022-02-10 11:42 ` [PATCH 10/20] scsi: pm8001: fix payload initialization in pm80xx_set_thermal_config() Damien Le Moal
2022-02-10 14:43   ` John Garry
2022-02-10 11:42 ` [PATCH 11/20] scsi: pm8001: fix le32 values handling in pm80xx_set_sas_protocol_timer_config() Damien Le Moal
2022-02-10 11:42 ` [PATCH 12/20] scsi: pm8001: fix payload initialization in pm80xx_encrypt_update() Damien Le Moal
2022-02-10 11:42 ` [PATCH 13/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_ssp_io_req() Damien Le Moal
2022-02-10 11:42 ` [PATCH 14/20] scsi: pm8001: fix le32 values handling in pm80xx_chip_sata_req() Damien Le Moal
2022-02-10 11:42 ` [PATCH 15/20] scsi: pm8001: fix use of struct set_phy_profile_req fields Damien Le Moal
2022-02-10 11:42 ` [PATCH 16/20] scsi: pm8001: simplify pm8001_get_ncq_tag() Damien Le Moal
2022-02-10 14:50   ` John Garry
2022-02-10 11:42 ` [PATCH 17/20] scsi: pm8001: fix NCQ NON DATA command task initialization Damien Le Moal
2022-02-10 11:42 ` [PATCH 18/20] scsi: pm8001: fix NCQ NON DATA command completion handling Damien Le Moal
2022-02-10 11:42 ` [PATCH 19/20] scsi: pm8001: cleanup pm8001_queue_command() Damien Le Moal
2022-02-10 14:53   ` John Garry
2022-02-10 11:42 ` [PATCH 20/20] scsi: pm8001: fix abort all task initialization Damien Le Moal
2022-02-10 14:28   ` John Garry
2022-02-10 22:43     ` Damien Le Moal
2022-02-10 15:35 ` [PATCH 00/20] libsas and pm8001 fixes John Garry
2022-02-10 22:44   ` Damien Le Moal
2022-02-11  9:24     ` John Garry
2022-02-11 12:37       ` Damien Le Moal
2022-02-11 13:08         ` John Garry
2022-02-11 13:14           ` Damien Le Moal
2022-02-11 13:54             ` John Garry
2022-02-12  6:19               ` Damien Le Moal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.