From: John Garry <john.garry@huawei.com>
To: <jejb@linux.vnet.ibm.com>, <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>, <linuxarm@huawei.com>,
	<linux-kernel@vger.kernel.org>,
	Xiang Chen <chenxiang66@hisilicon.com>,
	"John Garry" <john.garry@huawei.com>
Subject: [PATCH 13/13] scsi: hisi_sas: Fix the conflict between device gone and host reset
Date: Fri, 6 Sep 2019 20:55:37 +0800
Message-ID: <1567774537-20003-14-git-send-email-john.garry@huawei.com>
In-Reply-To: <1567774537-20003-1-git-send-email-john.garry@huawei.com>

From: Xiang Chen <chenxiang66@hisilicon.com>

When a device is gone, hisi_sas_dev_gone() checks whether a host reset
is in progress and, if not, sends an internal task abort. If a reset
begins before the internal task abort returns, the reset path checks
whether dev_type is SAS_PHY_UNUSED and, if not, calls
hisi_sas_init_device(). By that time the domain_device may already be
fully or partly freed, so hisi_sas_init_device() may dereference a NULL
pointer. The race may occur as follows:
    thread0				thread1
hisi_sas_dev_gone()
    check whether in RESET(no)
    internal task abort
				    reset prep
				    soft_reset
				    ... (part of reset_done)
    internal task abort failed
    release resource anyway
    clear_itct
    device->lldd_dev=NULL
				    hisi_sas_reset_init_all_device
					check sas_dev->dev_type is SAS_PHY_UNUSED and
					!device
    set dev_type SAS_PHY_UNUSED
    sas_free_device
					hisi_sas_init_device
					...

Semaphore hisi_hba->sem is used to synchronise device gone and host
reset.

To solve the issue, expand the scope that the semaphore protects so that
the two never run concurrently.

Some places also check whether domain_device is NULL to judge whether
the device is gone, so clear sas_dev->sas_device when the device is
gone.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
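Note for readers (not part of the commit): the locking change described
above can be sketched in userspace roughly as follows. This is only an
illustration under stated assumptions, not the driver code. A POSIX
unnamed semaphore (Linux) stands in for hisi_hba->sem, and struct hba,
struct sas_dev, struct domain_dev, hba_down(), hba_up(), dev_gone() and
reset_init_all_devices() are hypothetical stand-ins invented for this
sketch. The point it shows is that the whole teardown and the whole
re-init walk each run under the same semaphore, and that the teardown
clears the back-pointer that the walk tests.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Hypothetical stand-ins for the driver structures, not the real types. */
enum dev_type { DEV_UNUSED = 0, DEV_END_DEVICE };

struct domain_dev { int id; };

struct sas_dev {
	enum dev_type type;
	struct domain_dev *sas_device;	/* NULL once the device is gone */
};

struct hba {
	sem_t sem;			/* plays the role of hisi_hba->sem */
	struct sas_dev devs[MAX_DEVS];
};

/* Rough equivalent of down(): block until the semaphore is taken. */
void hba_down(struct hba *h)
{
	while (sem_wait(&h->sem) == -1 && errno == EINTR)
		;
}

/* Rough equivalent of up(). */
void hba_up(struct hba *h)
{
	sem_post(&h->sem);
}

/* Mark every slot unused and create a binary semaphore (count 1). */
int hba_init(struct hba *h)
{
	for (int i = 0; i < MAX_DEVS; i++) {
		h->devs[i].type = DEV_UNUSED;
		h->devs[i].sas_device = NULL;
	}
	return sem_init(&h->sem, 0, 1);
}

/* Teardown path: every state change happens while the semaphore is held. */
void dev_gone(struct hba *h, struct sas_dev *d)
{
	assert(h && d);
	hba_down(h);
	d->type = DEV_UNUSED;
	d->sas_device = NULL;
	hba_up(h);
}

/*
 * Reset path: walk all slots under the same semaphore, so a concurrent
 * dev_gone() runs entirely before or entirely after the walk.
 * Returns how many devices would be re-initialised.
 */
int reset_init_all_devices(struct hba *h)
{
	int n = 0;

	hba_down(h);
	for (int i = 0; i < MAX_DEVS; i++) {
		struct sas_dev *d = &h->devs[i];

		if (d->type == DEV_UNUSED || !d->sas_device)
			continue;
		n++;	/* the real driver would init the device here */
	}
	hba_up(h);
	return n;
}
```

With two populated slots, calling dev_gone() on one of them leaves a
single device for reset_init_all_devices() to visit, and the freed
slot's sas_device pointer reads as NULL afterwards.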
 drivers/scsi/hisi_sas/hisi_sas_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 04cbc54be387..a7b3d9d38fdc 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1049,21 +1049,22 @@ static void hisi_sas_dev_gone(struct domain_device *device)
 	dev_info(dev, "dev[%d:%x] is gone\n",
 		 sas_dev->device_id, sas_dev->dev_type);
 
+	down(&hisi_hba->sem);
 	if (!test_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags)) {
 		hisi_sas_internal_task_abort(hisi_hba, device,
 					     HISI_SAS_INT_ABT_DEV, 0);
 
 		hisi_sas_dereg_device(hisi_hba, device);
 
-		down(&hisi_hba->sem);
 		hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
-		up(&hisi_hba->sem);
 		device->lldd_dev = NULL;
 	}
 
 	if (hisi_hba->hw->free_device)
 		hisi_hba->hw->free_device(sas_dev);
 	sas_dev->dev_type = SAS_PHY_UNUSED;
+	sas_dev->sas_device = NULL;
+	up(&hisi_hba->sem);
 }
 
 static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
@@ -1543,11 +1544,11 @@ void hisi_sas_controller_reset_done(struct hisi_hba *hisi_hba)
 	msleep(1000);
 	hisi_sas_refresh_port_id(hisi_hba);
 	clear_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags);
-	up(&hisi_hba->sem);
 
 	if (hisi_hba->reject_stp_links_msk)
 		hisi_sas_terminate_stp_reject(hisi_hba);
 	hisi_sas_reset_init_all_devices(hisi_hba);
+	up(&hisi_hba->sem);
 	scsi_unblock_requests(shost);
 	clear_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags);
 
-- 
2.17.1

