linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw
@ 2020-10-02 14:30 John Garry
  2020-10-02 14:30 ` [PATCH 1/7] scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq() John Garry
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen; +Cc: linux-scsi, linux-kernel, linuxarm, John Garry

This series adds runtime PM support for v3 hw. Consists of:
- Switch to new PM suspend and resume framework
- Add links to devices to ensure host cannot be suspended while devices
  are not
- Filter out phy events during suspend to avoid deadlock
- Add controller RPM support
- And some more minor misc related changes

I also included a small unrelated fix, only visible when #CPUs < #queues
and the MSI affinity module param is set.

Note that this series does not conflict with patch "scsi: hisi_sas:
Switch v3 hw to MQ", which is supposed to go through the block tree:

https://lore.kernel.org/linux-scsi/32574da3d8de863ff38347ef6ead9b35@mail.gmail.com/T/#m39c82fc8a3e3a6b20247d0bd0122d2916e620a28

Luo Jiaxing (1):
  scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling
    synchronize_irq()

Xiang Chen (6):
  scsi: hisi_sas: Switch to new framework to support suspend and resume
  scsi: hisi_sas: Add controller runtime PM support for v3 hw
  scsi: hisi_sas: Add the check of the definition of method _PS0 and
    _PR0
  scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
  scsi: hisi_sas: Filter out new PHYs up events during suspended
  scsi: hisi_sas: Recover phys state according to the status before
    reset

 drivers/scsi/hisi_sas/Kconfig          |   1 +
 drivers/scsi/hisi_sas/hisi_sas.h       |   2 +
 drivers/scsi/hisi_sas/hisi_sas_main.c  |  10 ++-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 105 +++++++++++++++++++++++--
 4 files changed, 107 insertions(+), 11 deletions(-)

-- 
2.26.2


* [PATCH 1/7] scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-02 14:30 ` [PATCH 2/7] scsi: hisi_sas: Switch to new framework to support suspend and resume John Garry
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Luo Jiaxing, John Garry

From: Luo Jiaxing <luojiaxing@huawei.com>

We got a call trace when running a function-level reset with fewer than
16 online CPUs and MSI auto-affinity enabled.

[16538.348038] Call trace:
[16538.348422]  pci_irq_vector+0x98/0xc0
[16538.348947]  disable_host_v3_hw+0x8c/0x288 [hisi_sas_v3_hw]
[16538.349706]  hisi_sas_reset_prepare_v3_hw+0x60/0x88 [hisi_sas_v3_hw]
[16538.350631]  pci_dev_save_and_disable+0x38/0x68
[16538.351290]  pci_reset_function+0x44/0x88
[16538.351846]  reset_store+0x6c/0xb8
[16538.352429]  dev_attr_store+0x44/0x60
[16538.353035]  sysfs_kf_write+0x58/0x80
[16538.353558]  kernfs_fop_write+0x140/0x230
[16538.354175]  __vfs_write+0x48/0x80
[16538.354675]  vfs_write+0xb8/0x1d8
[16538.355145]  ksys_write+0x74/0xf8
[16538.355615]  __arm64_sys_write+0x24/0x30
[16538.356240]  el0_svc_common.constprop.4+0x80/0x1f0
[16538.356905]  do_el0_svc+0x2c/0x38
[16538.357408]  el0_svc+0x14/0x40
[16538.357848]  el0_sync_handler+0xbc/0x2ec
[16538.358388]  el0_sync+0x140/0x180

The reason is that when pci_alloc_irq_vectors_affinity() is used to
allocate IRQs, the number of CQ IRQs is at most the number of online
CPUs, but interrupt_disable_v3_hw() loops over hisi_hba->queue_count
(always 16). So pci_irq_vector() warns with this call trace.

So use hisi_hba->cq_nvecs instead of hisi_hba->queue_count to avoid
synchronizing an IRQ which does not exist.
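
The bound mismatch described above can be sketched in userspace C. This is
only a model, not driver code: struct hba_model, CQ_VEC_BASE and
cq_vectors_to_sync() are hypothetical names; the base index 16 and the
queue count of 16 are taken from the patch and the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters discussed in this patch. */
struct hba_model {
	int queue_count;	/* completion queues, always 16 per the commit message */
	int cq_nvecs;		/* CQ vectors actually allocated */
};

#define CQ_VEC_BASE 16	/* first CQ vector index, as in "i + 16" in the diff */

/*
 * Fill vecs[] with the CQ vector indices that exist and return how many
 * were written. The loop is bounded by cq_nvecs, not queue_count.
 */
int cq_vectors_to_sync(const struct hba_model *hba, int *vecs, size_t len)
{
	int i;

	assert(hba->cq_nvecs <= hba->queue_count);
	for (i = 0; i < hba->cq_nvecs && (size_t)i < len; i++)
		vecs[i] = CQ_VEC_BASE + i;
	return i;
}
```

With 4 vectors allocated, only indices 16..19 are visited, while the queue
mask registers for all 16 queues can still be written in a separate loop.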

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 87bda037303f..0cc186fcbca8 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2525,10 +2525,11 @@ static void interrupt_disable_v3_hw(struct hisi_hba *hisi_hba)
 	synchronize_irq(pci_irq_vector(pdev, 1));
 	synchronize_irq(pci_irq_vector(pdev, 2));
 	synchronize_irq(pci_irq_vector(pdev, 11));
-	for (i = 0; i < hisi_hba->queue_count; i++) {
+	for (i = 0; i < hisi_hba->queue_count; i++)
 		hisi_sas_write32(hisi_hba, OQ0_INT_SRC_MSK + 0x4 * i, 0x1);
+
+	for (i = 0; i < hisi_hba->cq_nvecs; i++)
 		synchronize_irq(pci_irq_vector(pdev, i + 16));
-	}
 
 	hisi_sas_write32(hisi_hba, ENT_INT_SRC_MSK1, 0xffffffff);
 	hisi_sas_write32(hisi_hba, ENT_INT_SRC_MSK2, 0xffffffff);
-- 
2.26.2


* [PATCH 2/7] scsi: hisi_sas: Switch to new framework to support suspend and resume
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
  2020-10-02 14:30 ` [PATCH 1/7] scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq() John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-02 14:30 ` [PATCH 3/7] scsi: hisi_sas: Add controller runtime PM support for v3 hw John Garry
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Xiang Chen, John Garry

From: Xiang Chen <chenxiang66@hisilicon.com>

For v3 hw, we will add support for runtime PM, which is only supported
in the new framework (struct dev_pm_ops). Legacy PM support and the new
framework are not allowed to be used together. So switch to the new
framework to support suspend and resume.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 0cc186fcbca8..e73c124355e5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -3407,8 +3407,9 @@ enum {
 	hip08,
 };
 
-static int hisi_sas_v3_suspend(struct pci_dev *pdev, pm_message_t state)
+static int suspend_v3_hw(struct device *device)
 {
+	struct pci_dev *pdev = to_pci_dev(device);
 	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
 	struct hisi_hba *hisi_hba = sha->lldd_ha;
 	struct device *dev = hisi_hba->dev;
@@ -3439,7 +3440,7 @@ static int hisi_sas_v3_suspend(struct pci_dev *pdev, pm_message_t state)
 
 	hisi_sas_init_mem(hisi_hba);
 
-	device_state = pci_choose_state(pdev, state);
+	device_state = pci_choose_state(pdev, PMSG_SUSPEND);
 	dev_warn(dev, "entering operating state [D%d]\n",
 			device_state);
 	pci_save_state(pdev);
@@ -3452,8 +3453,9 @@ static int hisi_sas_v3_suspend(struct pci_dev *pdev, pm_message_t state)
 	return 0;
 }
 
-static int hisi_sas_v3_resume(struct pci_dev *pdev)
+static int resume_v3_hw(struct device *device)
 {
+	struct pci_dev *pdev = to_pci_dev(device);
 	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
 	struct hisi_hba *hisi_hba = sha->lldd_ha;
 	struct Scsi_Host *shost = hisi_hba->shost;
@@ -3501,14 +3503,17 @@ static const struct pci_error_handlers hisi_sas_err_handler = {
 	.reset_done	= hisi_sas_reset_done_v3_hw,
 };
 
+static const struct dev_pm_ops hisi_sas_v3_pm_ops = {
+	SET_SYSTEM_SLEEP_PM_OPS(suspend_v3_hw, resume_v3_hw)
+};
+
 static struct pci_driver sas_v3_pci_driver = {
 	.name		= DRV_NAME,
 	.id_table	= sas_v3_pci_table,
 	.probe		= hisi_sas_v3_probe,
 	.remove		= hisi_sas_v3_remove,
-	.suspend	= hisi_sas_v3_suspend,
-	.resume		= hisi_sas_v3_resume,
 	.err_handler	= &hisi_sas_err_handler,
+	.driver.pm	= &hisi_sas_v3_pm_ops,
 };
 
 module_pci_driver(sas_v3_pci_driver);
-- 
2.26.2


* [PATCH 3/7] scsi: hisi_sas: Add controller runtime PM support for v3 hw
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
  2020-10-02 14:30 ` [PATCH 1/7] scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq() John Garry
  2020-10-02 14:30 ` [PATCH 2/7] scsi: hisi_sas: Switch to new framework to support suspend and resume John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-02 14:30 ` [PATCH 4/7] scsi: hisi_sas: Add the check of the definition of method _PS0 and _PR0 John Garry
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Xiang Chen, John Garry

From: Xiang Chen <chenxiang66@hisilicon.com>

Add controller runtime PM support for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas.h       |  2 +
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 56 +++++++++++++++++++++++++-
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index c617ac8d8315..961842ee8906 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -19,6 +19,7 @@
 #include <linux/of_address.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/property.h>
 #include <linux/regmap.h>
 #include <linux/timer.h>
@@ -32,6 +33,7 @@
 #define HISI_SAS_MAX_DEVICES HISI_SAS_MAX_ITCT_ENTRIES
 #define HISI_SAS_RESET_BIT	0
 #define HISI_SAS_REJECT_CMD_BIT	1
+#define HISI_SAS_PM_BIT		2
 #define HISI_SAS_MAX_COMMANDS (HISI_SAS_QUEUE_SLOTS)
 #define HISI_SAS_RESERVED_IPTT  96
 #define HISI_SAS_UNRESERVED_IPTT \
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index e73c124355e5..fa9db57cc3fc 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -3314,6 +3314,17 @@ hisi_sas_v3_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	scsi_scan_host(shost);
 
+	/*
+	 * For the situation that there are ATA disks connected with SAS
+	 * controller, it additionally creates ata_port which will affect the
+	 * child_count of hisi_hba->dev. Even if all the disks are suspended,
+	 * ata_port still exists and the child_count of hisi_hba->dev is not 0.
+	 * So use pm_suspend_ignore_children() to ignore the effect on
+	 * hisi_hba->dev.
+	 */
+	pm_suspend_ignore_children(dev, true);
+	pm_runtime_put_noidle(&pdev->dev);
+
 	return 0;
 
 err_out_register_ha:
@@ -3353,6 +3364,7 @@ static void hisi_sas_v3_remove(struct pci_dev *pdev)
 	struct hisi_hba *hisi_hba = sha->lldd_ha;
 	struct Scsi_Host *shost = sha->core.shost;
 
+	pm_runtime_get_noresume(dev);
 	if (timer_pending(&hisi_hba->timer))
 		del_timer(&hisi_hba->timer);
 
@@ -3407,7 +3419,7 @@ enum {
 	hip08,
 };
 
-static int suspend_v3_hw(struct device *device)
+static int _suspend_v3_hw(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
@@ -3453,7 +3465,7 @@ static int suspend_v3_hw(struct device *device)
 	return 0;
 }
 
-static int resume_v3_hw(struct device *device)
+static int _resume_v3_hw(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
@@ -3492,6 +3504,34 @@ static int resume_v3_hw(struct device *device)
 	return 0;
 }
 
+static int suspend_v3_hw(struct device *device)
+{
+	struct pci_dev *pdev = to_pci_dev(device);
+	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
+	struct hisi_hba *hisi_hba = sha->lldd_ha;
+	int rc;
+
+	set_bit(HISI_SAS_PM_BIT, &hisi_hba->flags);
+
+	rc = _suspend_v3_hw(device);
+	if (rc)
+		clear_bit(HISI_SAS_PM_BIT, &hisi_hba->flags);
+
+	return rc;
+}
+
+static int resume_v3_hw(struct device *device)
+{
+	struct pci_dev *pdev = to_pci_dev(device);
+	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
+	struct hisi_hba *hisi_hba = sha->lldd_ha;
+	int rc = _resume_v3_hw(device);
+
+	clear_bit(HISI_SAS_PM_BIT, &hisi_hba->flags);
+
+	return rc;
+}
+
 static const struct pci_device_id sas_v3_pci_table[] = {
 	{ PCI_VDEVICE(HUAWEI, 0xa230), hip08 },
 	{}
@@ -3503,8 +3543,20 @@ static const struct pci_error_handlers hisi_sas_err_handler = {
 	.reset_done	= hisi_sas_reset_done_v3_hw,
 };
 
+static int runtime_suspend_v3_hw(struct device *dev)
+{
+	return suspend_v3_hw(dev);
+}
+
+static int runtime_resume_v3_hw(struct device *dev)
+{
+	return resume_v3_hw(dev);
+}
+
 static const struct dev_pm_ops hisi_sas_v3_pm_ops = {
 	SET_SYSTEM_SLEEP_PM_OPS(suspend_v3_hw, resume_v3_hw)
+	SET_RUNTIME_PM_OPS(runtime_suspend_v3_hw,
+			   runtime_resume_v3_hw, NULL)
 };
 
 static struct pci_driver sas_v3_pci_driver = {
-- 
2.26.2


* [PATCH 4/7] scsi: hisi_sas: Add the check of the definition of method _PS0 and _PR0
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
                   ` (2 preceding siblings ...)
  2020-10-02 14:30 ` [PATCH 3/7] scsi: hisi_sas: Add controller runtime PM support for v3 hw John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-02 14:30 ` [PATCH 5/7] scsi: hisi_sas: Add device link between SCSI devices and hisi_hba John Garry
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Xiang Chen, John Garry

From: Xiang Chen <chenxiang66@hisilicon.com>

To support system suspend/resume or runtime suspend/resume, the function
pci_set_power_state() is needed to change the power state, which for v3
hw requires at least method _PS0 or _PR0 to be provided by the platform.
So check whether either method is defined and, if not, print a notice.

A Kconfig dependency is added as there is no stub for
acpi_device_power_manageable().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/Kconfig          | 1 +
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/drivers/scsi/hisi_sas/Kconfig b/drivers/scsi/hisi_sas/Kconfig
index 13ed9073fc72..b8148b1733f8 100644
--- a/drivers/scsi/hisi_sas/Kconfig
+++ b/drivers/scsi/hisi_sas/Kconfig
@@ -15,5 +15,6 @@ config SCSI_HISI_SAS_PCI
 	tristate "HiSilicon SAS on PCI bus"
 	depends on SCSI_HISI_SAS
 	depends on PCI
+	depends on ACPI
 	help
 		This driver supports HiSilicon's SAS HBA based on PCI device
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index fa9db57cc3fc..708b5661b127 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -903,6 +903,7 @@ static int reset_hw_v3_hw(struct hisi_hba *hisi_hba)
 static int hw_init_v3_hw(struct hisi_hba *hisi_hba)
 {
 	struct device *dev = hisi_hba->dev;
+	struct acpi_device *acpi_dev;
 	union acpi_object *obj;
 	guid_t guid;
 	int rc;
@@ -933,6 +934,9 @@ static int hw_init_v3_hw(struct hisi_hba *hisi_hba)
 	else
 		ACPI_FREE(obj);
 
+	acpi_dev = ACPI_COMPANION(dev);
+	if (!acpi_device_power_manageable(acpi_dev))
+		dev_notice(dev, "neither _PS0 nor _PR0 is defined\n");
 	return 0;
 }
 
-- 
2.26.2


* [PATCH 5/7] scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
                   ` (3 preceding siblings ...)
  2020-10-02 14:30 ` [PATCH 4/7] scsi: hisi_sas: Add the check of the definition of method _PS0 and _PR0 John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-02 14:30 ` [PATCH 6/7] scsi: hisi_sas: Filter out new PHYs up events during suspended John Garry
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Xiang Chen, John Garry

From: Xiang Chen <chenxiang66@hisilicon.com>

Runtime PM of SCSI devices is already supported in the SCSI layer, so we
can suspend/resume every SCSI device separately. But if there is no link
between hisi_hba and the SCSI devices or SCSI targets, issues arise if
the controller is suspended while SCSI devices are still resuming. The
controller may be suspended only when all the SCSI devices under it are
suspended. So add a device link between the SCSI devices and the
controller.
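
The ordering rule above (supplier suspends only after every consumer) can
be modelled in userspace C. This is a sketch, not the device-link
implementation: struct host_model and its helpers are hypothetical, and
treating a newly linked device as active is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SDEVS 8

/* Hypothetical model: one supplier (controller) and its linked consumers. */
struct host_model {
	bool sdev_active[MAX_SDEVS];
	int nr_linked;
};

/* Link a SCSI device to the host; returns its index, or -1 if full. */
int host_link_sdev(struct host_model *h)
{
	if (h->nr_linked >= MAX_SDEVS)
		return -1;
	h->sdev_active[h->nr_linked] = true;	/* assumed active at link time */
	return h->nr_linked++;
}

/* Record a runtime suspend (false) or resume (true) of one device. */
void sdev_set_active(struct host_model *h, int idx, bool active)
{
	assert(idx >= 0 && idx < h->nr_linked);
	h->sdev_active[idx] = active;
}

/* The controller may suspend only when no linked device is active. */
bool host_may_suspend(const struct host_model *h)
{
	for (int i = 0; i < h->nr_linked; i++)
		if (h->sdev_active[i])
			return false;
	return true;
}
```

In the patch itself this bookkeeping is delegated to the driver core via
device_link_add() with DL_FLAG_PM_RUNTIME.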

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 29 +++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 708b5661b127..c9353e02fdd5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2758,6 +2758,33 @@ static ssize_t intr_coal_count_v3_hw_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(intr_coal_count_v3_hw);
 
+static int slave_configure_v3_hw(struct scsi_device *sdev)
+{
+	struct Scsi_Host *shost = dev_to_shost(&sdev->sdev_gendev);
+	struct domain_device *ddev = sdev_to_domain_dev(sdev);
+	struct hisi_hba *hisi_hba = shost_priv(shost);
+	struct device *dev = hisi_hba->dev;
+	int ret = sas_slave_configure(sdev);
+
+	if (ret)
+		return ret;
+	if (!dev_is_sata(ddev))
+		sas_change_queue_depth(sdev, 64);
+
+	if (sdev->type == TYPE_ENCLOSURE)
+		return 0;
+
+	if (!device_link_add(&sdev->sdev_gendev, dev,
+			     DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE)) {
+		if (pm_runtime_enabled(dev)) {
+			dev_info(dev, "add device link failed, disable runtime PM for the host\n");
+			pm_runtime_disable(dev);
+		}
+	}
+
+	return 0;
+}
+
 static struct device_attribute *host_attrs_v3_hw[] = {
 	&dev_attr_phy_event_threshold,
 	&dev_attr_intr_conv_v3_hw,
@@ -3114,7 +3141,7 @@ static struct scsi_host_template sht_v3_hw = {
 	.queuecommand		= sas_queuecommand,
 	.dma_need_drain		= ata_scsi_dma_need_drain,
 	.target_alloc		= sas_target_alloc,
-	.slave_configure	= hisi_sas_slave_configure,
+	.slave_configure	= slave_configure_v3_hw,
 	.scan_finished		= hisi_sas_scan_finished,
 	.scan_start		= hisi_sas_scan_start,
 	.change_queue_depth	= sas_change_queue_depth,
-- 
2.26.2


* [PATCH 6/7] scsi: hisi_sas: Filter out new PHYs up events during suspended
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
                   ` (4 preceding siblings ...)
  2020-10-02 14:30 ` [PATCH 5/7] scsi: hisi_sas: Add device link between SCSI devices and hisi_hba John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-02 14:30 ` [PATCH 7/7] scsi: hisi_sas: Recover phys state according to the status before reset John Garry
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Xiang Chen, John Garry

From: Xiang Chen <chenxiang66@hisilicon.com>

Currently sas_resume_ha() is called in the last step of resuming the
controller, and it waits for all suspended PHYs to come up and for all
libsas events to complete. But there is a situation which causes a hung
task: in a direct-attached setup with two disks connected to two PHYs,
disable phy0 before suspending the disk on phy1 and the controller, then
enable phy0 and resume the controller. The hung task occurs as follows:

[  591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0]
[  593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined
[  593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume
[  593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[  593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata)
[  593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200)
[  593.148350] sas: DOING DISCOVERY on port 0, pid:948
[  593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found
[  593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
[  593.165663] sas: ata7: end_device-2:0: dev error handler
[  593.165730] sas: ata2: end_device-2:1: dev error handler
[  593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2
[  593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down
[  593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[  593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133
[  593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32)
[  593.514295] ata7.00: configured for UDMA/133
[  593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[  593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005 serial:S45NNA0M712225
[  593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0
[  593.543674] device_link_add 324
[  593.546801] device_link_add 352
[  593.549930] device_link_add 406
[  593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0
[  593.559208] device_link_add 444
[  593.562335] device_link_add 455
[  593.565517] scsi 2:0:2:0: Direct-Access     ATA      SAMSUNG MZ7LH960 404Q PQ: 0 ANSI: 5
[  620.057464]  phy-2:1: resume timeout
[  738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds.
[  738.848295]       Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[  738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  738.862155] kworker/u256:0  D    0     8      2 0x00000028
[  738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker
[  738.873693] Call trace:
[  738.876133]  __switch_to+0xf4/0x148
[  738.879613]  __schedule+0x270/0x5d8
[  738.883091]  schedule+0x78/0x110
[  738.886307]  schedule_timeout+0x1ac/0x280
[  738.890299]  wait_for_completion+0x94/0x138
[  738.894472]  flush_workqueue+0x114/0x438
[  738.898377]  sas_porte_bytes_dmaed+0x400/0x500
[  738.902801]  sas_port_event_worker+0x28/0x40
[  738.907053]  process_one_work+0x1e8/0x360
[  738.911046]  worker_thread+0x44/0x478
[  738.914698]  kthread+0x150/0x158
[  738.917915]  ret_from_fork+0x10/0x1c
[  738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds.
[  738.928550]       Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[  738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  738.942408] kworker/u256:1  D    0   948      2 0x00000028
[  738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain
[  738.953766] Call trace:
[  738.956203]  __switch_to+0xf4/0x148
[  738.959678]  __schedule+0x270/0x5d8
[  738.963152]  schedule+0x78/0x110
[  738.966368]  rpm_resume+0xcc/0x550
[  738.969757]  __pm_runtime_resume+0x3c/0x88
[  738.973836]  rpm_get_suppliers+0x50/0x148
[  738.977829]  __pm_runtime_set_status+0x124/0x2f0
[  738.982427]  scsi_sysfs_add_sdev+0x1a0/0x2a8
[  738.986679]  scsi_probe_and_add_lun+0x888/0xab0
[  738.991190]  __scsi_scan_target+0xec/0x520
[  738.995268]  scsi_scan_target+0x11c/0x128
[  738.999261]  sas_rphy_add+0x15c/0x1e8
[  739.002907]  sas_probe_devices+0xe4/0x150
[  739.006899]  sas_discover_domain+0x33c/0x588
[  739.011150]  process_one_work+0x1e8/0x360
[  739.015143]  worker_thread+0x44/0x478
[  739.018789]  kthread+0x150/0x158
[  739.022003]  ret_from_fork+0x10/0x1c
...

We find that if phy0 additionally comes up while the SAS controller is
resuming, it generates new libsas events for phy0 (PORTE_BYTES_DMAED and
DISCE_DISCOVER_DOMAIN). Handling DISCE_DISCOVER_DOMAIN calls
scsi_sysfs_add_sdev(), which calls __pm_runtime_set_status() to resume
the supplier (the host controller). In the runtime PM core, if a device
is in the resuming state, a later synchronous resume request for that
device waits for the previous resume request to complete. So at that
point the controller is still resuming because it waits for all libsas
events to complete, while the libsas event DISCE_DISCOVER_DOMAIN is
blocked because the controller is resuming. This is a deadlock.

To avoid the issue, filter out new PHY up events from the time the
controller is suspended until its resume completes.
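
The filter condition can be sketched as a userspace predicate. This is a
model only: struct phy_model and phy_up_event_allowed() are hypothetical;
the bit number matches HISI_SAS_PM_BIT from patch 3, and the suspended
field stands in for sas_phy->suspended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PM_BIT 2	/* same bit number as HISI_SAS_PM_BIT in patch 3 */

/* Hypothetical per-phy state; only the fields the filter looks at. */
struct phy_model {
	bool attached;
	bool suspended;	/* stands in for sas_phy->suspended */
};

/* Return true if a phy-up event should be forwarded to libsas. */
bool phy_up_event_allowed(unsigned long flags, const struct phy_model *phy)
{
	assert(phy != NULL);
	if (!phy->attached)
		return false;
	/* PM transition in progress and this phy is not one being resumed. */
	if ((flags & (1UL << PM_BIT)) && !phy->suspended)
		return false;
	return true;
}
```

A phy that was marked suspended still gets its event during resume, so
sas_resume_ha() can complete; only phys that come up unexpectedly are
dropped.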

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index f18452942508..ef3922ad70c0 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -619,6 +619,12 @@ static void hisi_sas_bytes_dmaed(struct hisi_hba *hisi_hba, int phy_no)
 	if (!phy->phy_attached)
 		return;
 
+	if (test_bit(HISI_SAS_PM_BIT, &hisi_hba->flags) &&
+	    !sas_phy->suspended) {
+		dev_warn(hisi_hba->dev, "phy%d during suspend filtered out\n", phy_no);
+		return;
+	}
+
 	sas_ha = &hisi_hba->sha;
 	sas_ha->notify_phy_event(sas_phy, PHYE_OOB_DONE);
 
-- 
2.26.2


* [PATCH 7/7] scsi: hisi_sas: Recover phys state according to the status before reset
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
                   ` (5 preceding siblings ...)
  2020-10-02 14:30 ` [PATCH 6/7] scsi: hisi_sas: Filter out new PHYs up events during suspended John Garry
@ 2020-10-02 14:30 ` John Garry
  2020-10-03  3:11 ` [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw Martin K. Petersen
  2020-10-07  3:47 ` Martin K. Petersen
  8 siblings, 0 replies; 10+ messages in thread
From: John Garry @ 2020-10-02 14:30 UTC (permalink / raw)
  To: jejb, martin.petersen
  Cc: linux-scsi, linux-kernel, linuxarm, Xiang Chen, John Garry

From: Xiang Chen <chenxiang66@hisilicon.com>

Currently the PHY state is recovered according to the status of the PHYs
after reset, which is invalid because the PHYs have already been
re-initialized by then. The PHY state needs to be recovered according to
the state of the PHYs before reset, so fix it.
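
Why the reference bitmap matters can be shown with a small userspace
sketch. It assumes, for illustration only, that the rescan compares a
reference bitmap (one bit per PHY that was up) against the live one;
phys_lost() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the PHYs that were up in the reference bitmap but are not up
 * in the live bitmap, limited to the first n_phy bits.
 */
uint32_t phys_lost(uint32_t reference, uint32_t live, int n_phy)
{
	uint32_t mask;

	assert(n_phy > 0 && n_phy <= 32);
	mask = (n_phy == 32) ? UINT32_MAX : ((UINT32_C(1) << n_phy) - 1);
	return reference & ~live & mask;
}
```

If the reference is sampled after reset it equals the live bitmap, so the
result is always 0 and a PHY that went away during reset goes unnoticed.
Using the bitmap saved before reset makes the difference visible.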

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index ef3922ad70c0..5b7357a5620d 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1551,7 +1551,6 @@ EXPORT_SYMBOL_GPL(hisi_sas_controller_reset_prepare);
 void hisi_sas_controller_reset_done(struct hisi_hba *hisi_hba)
 {
 	struct Scsi_Host *shost = hisi_hba->shost;
-	u32 state;
 
 	/* Init and wait for PHYs to come up and all libsas event finished. */
 	hisi_hba->hw->phys_init(hisi_hba);
@@ -1566,8 +1565,7 @@ void hisi_sas_controller_reset_done(struct hisi_hba *hisi_hba)
 	scsi_unblock_requests(shost);
 	clear_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags);
 
-	state = hisi_hba->hw->get_phys_state(hisi_hba);
-	hisi_sas_rescan_topology(hisi_hba, state);
+	hisi_sas_rescan_topology(hisi_hba, hisi_hba->phy_state);
 }
 EXPORT_SYMBOL_GPL(hisi_sas_controller_reset_done);
 
-- 
2.26.2


* Re: [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
                   ` (6 preceding siblings ...)
  2020-10-02 14:30 ` [PATCH 7/7] scsi: hisi_sas: Recover phys state according to the status before reset John Garry
@ 2020-10-03  3:11 ` Martin K. Petersen
  2020-10-07  3:47 ` Martin K. Petersen
  8 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2020-10-03  3:11 UTC (permalink / raw)
  To: John Garry; +Cc: jejb, martin.petersen, linux-scsi, linux-kernel, linuxarm


John,

> This series adds runtime PM support for v3 hw. Consists of:
> - Switch to new PM suspend and resume framework
> - Add links to devices to ensure host cannot be suspended while devices
>   are not
> - Filter out phy events during suspend to avoid deadlock
> - Add controller RPM support
> - And some more minor misc related changes

Applied to 5.10/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

* Re: [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw
  2020-10-02 14:30 [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw John Garry
                   ` (7 preceding siblings ...)
  2020-10-03  3:11 ` [PATCH 0/7] hisi_sas: Add runtime PM support for v3 hw Martin K. Petersen
@ 2020-10-07  3:47 ` Martin K. Petersen
  8 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2020-10-07  3:47 UTC (permalink / raw)
  To: jejb, John Garry; +Cc: Martin K . Petersen, linux-kernel, linux-scsi, linuxarm

On Fri, 2 Oct 2020 22:30:31 +0800, John Garry wrote:

> This series adds runtime PM support for v3 hw. Consists of:
> - Switch to new PM suspend and resume framework
> - Add links to devices to ensure host cannot be suspended while devices
>   are not
> - Filter out phy events during suspend to avoid deadlock
> - Add controller RPM support
> - And some more minor misc related changes
> 
> [...]

Applied to 5.10/scsi-queue, thanks!

[1/7] scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
      https://git.kernel.org/mkp/scsi/c/7f054da7738a
[2/7] scsi: hisi_sas: Switch to new framework to support suspend and resume
      https://git.kernel.org/mkp/scsi/c/6c459ea1542b
[3/7] scsi: hisi_sas: Add controller runtime PM support for v3 hw
      https://git.kernel.org/mkp/scsi/c/65ff4aef7e9b
[4/7] scsi: hisi_sas: Add check for methods _PS0 and _PR0
      https://git.kernel.org/mkp/scsi/c/e06596d5000c
[5/7] scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
      https://git.kernel.org/mkp/scsi/c/16fd4a7c5917
[6/7] scsi: hisi_sas: Filter out new PHY up events during suspend
      https://git.kernel.org/mkp/scsi/c/b14a37e011d8
[7/7] scsi: hisi_sas: Recover PHY state according to the status before reset
      https://git.kernel.org/mkp/scsi/c/69f4ec1edb13

-- 
Martin K. Petersen	Oracle Linux Engineering
