linux-kernel.vger.kernel.org archive mirror
From: John Garry <john.garry@huawei.com>
To: <jejb@linux.ibm.com>, <martin.petersen@oracle.com>,
	<jinpu.wang@cloud.ionos.com>, <damien.lemoal@opensource.wdc.com>
Cc: <linux-scsi@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linuxarm@huawei.com>, <yangxingui@huawei.com>,
	<niklas.cassel@wdc.com>, "John Garry" <john.garry@huawei.com>
Subject: [PATCH v6 4/8] scsi: hisi_sas: Modify v3 HW SATA disk error state completion processing
Date: Mon, 17 Oct 2022 17:20:31 +0800
Message-ID: <1665998435-199946-5-git-send-email-john.garry@huawei.com>
In-Reply-To: <1665998435-199946-1-git-send-email-john.garry@huawei.com>

From: Xingui Yang <yangxingui@huawei.com>

When an NCQ error occurs, the controller will abnormally complete the I/Os
that are newly delivered to the disk, and bit8 in CQ dw3 will be set, which
indicates that the SATA disk is in error state. The current processing flow
is to set ts->stat to SAS_OPEN_REJECT, and then sas_ata_task_done() will
set FIS stat to ATA_ERR. After ata_eh_analyze_tf() analyzes the I/O,
err_mask will be set to AC_ERR_HSM. If a media error occurs four times
within 10 minutes and the chip rejects new I/Os four times, NCQ will
be disabled due to excessive errors, which is undesirable.

Therefore, use sas_task_abort() to handle abnormally completed I/Os when
the SATA disk is in error state, as these abnormally completed I/Os are
already processed by sas_ata_device_link_abort() and qc->flags is set to
ATA_QCFLAG_FAILED. If sas_task_abort() is used, qc->err_mask will not be
modified in EH. Unlike the current process flow, this will not increase
the count of ECAT_TOUT_HSM and will not turn off NCQ. Like other I/Os on
the disk that do not have an error but do not return after the NCQ error,
they are retried after the EH.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 0ae8a60aaf93..0c3fcb807806 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -428,6 +428,8 @@
 #define CMPLT_HDR_DEV_ID_OFF		16
 #define CMPLT_HDR_DEV_ID_MSK		(0xffff << CMPLT_HDR_DEV_ID_OFF)
 /* dw3 */
+#define SATA_DISK_IN_ERROR_STATUS_OFF	8
+#define SATA_DISK_IN_ERROR_STATUS_MSK	(0x1 << SATA_DISK_IN_ERROR_STATUS_OFF)
 #define CMPLT_HDR_SATA_DISK_ERR_OFF	16
 #define CMPLT_HDR_SATA_DISK_ERR_MSK	(0x1 << CMPLT_HDR_SATA_DISK_ERR_OFF)
 #define CMPLT_HDR_IO_IN_TARGET_OFF	17
@@ -2219,7 +2221,8 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
 		} else if (dma_rx_err_type & RX_DATA_LEN_UNDERFLOW_MSK) {
 			ts->residual = trans_tx_fail_type;
 			ts->stat = SAS_DATA_UNDERRUN;
-		} else if (dw3 & CMPLT_HDR_IO_IN_TARGET_MSK) {
+		} else if ((dw3 & CMPLT_HDR_IO_IN_TARGET_MSK) ||
+			   (dw3 & SATA_DISK_IN_ERROR_STATUS_MSK)) {
 			ts->stat = SAS_PHY_DOWN;
 			slot->abort = 1;
 		} else {
-- 
2.35.3

