All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix sysfs recursive removal splats in isci
@ 2017-03-29  9:41 Johannes Thumshirn
  2017-03-29  9:41 ` [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister Johannes Thumshirn
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

This series fixes a sysfs warning caused by isci not being able to cope with
recursive sysfs path removals which are in place since commit bcdde7e
("sysfs: make __sysfs_remove_dir() recursive").

The mvsas, aic94xx and pm8001 and hisi_sas patches have been compile tested
only hence they have no callstack of the affected path in their changelogs.

I'm not sure whether to mark this patches as stable or not. I tend to say no
here, although we've seen complaints/bug reports on lkml and the scsi list.

Johannes Thumshirn (6):
      scsi: sas: flush destruct workqueue on device unregister
      scsi: isci: remove the SAS host after the SCSI host
      aic94xx: remove the SAS host after the SCSI host
      scsi: hisi_sas: remove the SAS host after the SCSI host
      mvsas: remove the SAS host after the SCSI host
      scsi: pm8001: remove the SAS host after the SCSI host

 drivers/scsi/aic94xx/aic94xx_init.c   | 13 ++++++++++---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 10 ++++++++--
 drivers/scsi/isci/init.c              |  9 ++++++++-
 drivers/scsi/libsas/sas_discover.c    |  4 ++++
 drivers/scsi/mvsas/mv_init.c          | 13 +++++++++++--
 drivers/scsi/pm8001/pm8001_init.c     | 14 ++++++++++++--
 6 files changed, 53 insertions(+), 10 deletions(-)
-- 
1.8.5.6

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
@ 2017-03-29  9:41 ` Johannes Thumshirn
  2017-03-29 11:15   ` John Garry
  2017-03-29  9:41 ` [PATCH 2/2] scsi: isci: remove the SAS host after the SCSI host Johannes Thumshirn
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

In the advent of an SAS device unregister we have to wait for all destruct
works to be done to not accidently delay deletion of a SAS rphy or it's
children to the point when we're removing the SCSI or SAS hosts.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/libsas/sas_discover.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index 60de662..75b18f1 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
 	}
 
 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
+		struct sas_discovery *disc = &dev->port->disc;
+		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
+
 		sas_rphy_unlink(dev->rphy);
 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
 		sas_discover_event(dev->port, DISCE_DESTRUCT);
+		flush_work(&sw->work);
 	}
 }
 
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] scsi: isci: remove the SAS host after the SCSI host
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
  2017-03-29  9:41 ` [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister Johannes Thumshirn
@ 2017-03-29  9:41 ` Johannes Thumshirn
  2017-03-29 10:17   ` Hannes Reinecke
  2017-03-29  9:41 ` [PATCH 3/6] aic94xx: " Johannes Thumshirn
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
removal path of kernfs to make it recursive we have to remove the SAS host
before the SCSI host or we will see sysfs warnings like the below when
triggering the the removal of the SAS HBA PCI device like with writing to the
sysfs pci remove file:

echo 1 > /sys/module/isci/drivers/pci:isci/<device>/remove

WARNING: CPU: 2 PID: 5 at fs/sysfs/group.c:241 sysfs_remove_group+0xc3/0xd0
sysfs group 'power' not found for kobject 'end_device-6:0'
CPU: 16 PID: 5884 Comm: echo Not tainted 4.11.0-rc3-libsas+ #504
Call Trace:
 dump_stack+0x85/0xc2
 __warn+0xc6/0xe0
 warn_slowpath_fmt+0x4a/0x50
 sysfs_remove_group+0xc3/0xd0
 dpm_sysfs_remove+0x52/0x60
 device_del+0x13c/0x360
 ? device_remove_file+0x14/0x20
 attribute_container_class_device_del+0x15/0x20
 transport_remove_classdev+0x4c/0x60
 ? transport_add_class_device+0x40/0x40
 attribute_container_device_trigger+0xb3/0xc0
 transport_remove_device+0x10/0x20
 sas_port_delete+0x12d/0x160 [scsi_transport_sas]
 sas_deform_port+0x1bf/0x1d0 [libsas]
 sas_unregister_ports+0x36/0x50 [libsas]
 sas_unregister_ha+0x1b/0x40 [libsas]
 isci_unregister+0x2a/0x40 [isci]
 isci_pci_remove+0x52/0xb0 [isci]
 ? __pm_runtime_resume+0x56/0x80
 pci_device_remove+0x34/0xb0
 device_release_driver_internal+0x158/0x210
 device_release_driver+0xd/0x10
 pci_stop_bus_device+0x85/0x90
 pci_stop_and_remove_bus_device_locked+0x15/0x30
 remove_store+0x59/0x70
 dev_attr_store+0x13/0x20
 sysfs_kf_write+0x40/0x50
 kernfs_fop_write+0x130/0x1b0
 __vfs_write+0x23/0x130
 ? rcu_read_lock_sched_held+0x6d/0x80
 ? rcu_sync_lockdep_assert+0x2a/0x50
 ? __sb_start_write+0xd7/0x1e0
 ? vfs_write+0x1a4/0x1f0
 vfs_write+0xc6/0x1f0
 SyS_write+0x44/0xa0
 entry_SYSCALL_64_fastpath+0x23/0xc6

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/isci/init.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index 0b5b5db..afa6b25 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -267,15 +267,22 @@ static int isci_register_sas_ha(struct isci_host *isci_host)
 static void isci_unregister(struct isci_host *isci_host)
 {
 	struct Scsi_Host *shost;
+	unsigned long flags;
 
 	if (!isci_host)
 		return;
 
 	shost = to_shost(isci_host);
-	scsi_remove_host(shost);
+
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_CANCEL))
+		WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	sas_unregister_ha(&isci_host->sas_ha);
 
 	sas_remove_host(shost);
+	scsi_remove_host(shost);
 	scsi_host_put(shost);
 }
 
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/6] aic94xx: remove the SAS host after the SCSI host
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
  2017-03-29  9:41 ` [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister Johannes Thumshirn
  2017-03-29  9:41 ` [PATCH 2/2] scsi: isci: remove the SAS host after the SCSI host Johannes Thumshirn
@ 2017-03-29  9:41 ` Johannes Thumshirn
  2017-03-29 10:17   ` Hannes Reinecke
  2017-03-29  9:41 ` [PATCH 4/6] scsi: hisi_sas: " Johannes Thumshirn
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
removal path of kernfs to make it recursive we have to remove the SAS host
before the SCSI host or we will see sysfs warnings on not found sysfs groups for
kobjects.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/aic94xx/aic94xx_init.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index 662b232..362d65a 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -701,13 +701,20 @@ static int asd_register_sas_ha(struct asd_ha_struct *asd_ha)
 
 static int asd_unregister_sas_ha(struct asd_ha_struct *asd_ha)
 {
+	struct Scsi_Host *shost = asd_ha->sas_ha.core.shost;
+	unsigned long flags;
 	int err;
 
-	scsi_remove_host(asd_ha->sas_ha.core.shost);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_CANCEL))
+		WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	err = sas_unregister_ha(&asd_ha->sas_ha);
 
-	sas_remove_host(asd_ha->sas_ha.core.shost);
-	scsi_host_put(asd_ha->sas_ha.core.shost);
+	sas_remove_host(shost);
+	scsi_remove_host(shost);
+	scsi_host_put(shost);
 
 	kfree(asd_ha->sas_ha.sas_phy);
 	kfree(asd_ha->sas_ha.sas_port);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/6] scsi: hisi_sas: remove the SAS host after the SCSI host
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
                   ` (2 preceding siblings ...)
  2017-03-29  9:41 ` [PATCH 3/6] aic94xx: " Johannes Thumshirn
@ 2017-03-29  9:41 ` Johannes Thumshirn
  2017-03-29 10:17   ` Hannes Reinecke
  2017-03-29  9:41 ` [PATCH 5/6] mvsas: " Johannes Thumshirn
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
removal path of kernfs to make it recursive we have to remove the SAS host
before the SCSI host or we will see sysfs warnings on not found sysfs groups for
kobjects.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 53637a9..34476c2 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1582,10 +1582,16 @@ int hisi_sas_remove(struct platform_device *pdev)
 	struct sas_ha_struct *sha = platform_get_drvdata(pdev);
 	struct hisi_hba *hisi_hba = sha->lldd_ha;
 	struct Scsi_Host *shost = sha->core.shost;
+	unsigned long flags;
+
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_CANCEL))
+		WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
 
-	scsi_remove_host(sha->core.shost);
 	sas_unregister_ha(sha);
-	sas_remove_host(sha->core.shost);
+	sas_remove_host(shost);
+	scsi_remove_host(shost);
 
 	hisi_sas_free(hisi_hba);
 	kfree(shost);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/6] mvsas: remove the SAS host after the SCSI host
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
                   ` (3 preceding siblings ...)
  2017-03-29  9:41 ` [PATCH 4/6] scsi: hisi_sas: " Johannes Thumshirn
@ 2017-03-29  9:41 ` Johannes Thumshirn
  2017-03-29 10:18   ` Hannes Reinecke
  2017-03-29  9:41 ` [PATCH 6/6] scsi: pm8001: " Johannes Thumshirn
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
removal path of kernfs to make it recursive we have to remove the SAS host
before the SCSI host or we will see sysfs warnings on not found sysfs groups for
kobjects.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/mvsas/mv_init.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 8280046..32b86c8 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -634,17 +634,26 @@ static void mvs_pci_remove(struct pci_dev *pdev)
 	unsigned short core_nr, i = 0;
 	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
 	struct mvs_info *mvi = NULL;
+	struct Scsi_Host *shost;
+	unsigned long flags;
 
 	core_nr = ((struct mvs_prv_info *)sha->lldd_ha)->n_host;
 	mvi = ((struct mvs_prv_info *)sha->lldd_ha)->mvi[0];
+	shost = mvi->shost;
 
 #ifdef CONFIG_SCSI_MVSAS_TASKLET
 	tasklet_kill(&((struct mvs_prv_info *)sha->lldd_ha)->mv_tasklet);
 #endif
 
-	scsi_remove_host(mvi->shost);
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_CANCEL))
+		WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
+
 	sas_unregister_ha(sha);
-	sas_remove_host(mvi->shost);
+	sas_remove_host(shost);
+	scsi_remove_host(shost);
 
 	MVS_CHIP_DISP->interrupt_disable(mvi);
 	free_irq(mvi->pdev->irq, sha);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/6] scsi: pm8001: remove the SAS host after the SCSI host
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
                   ` (4 preceding siblings ...)
  2017-03-29  9:41 ` [PATCH 5/6] mvsas: " Johannes Thumshirn
@ 2017-03-29  9:41 ` Johannes Thumshirn
  2017-03-29 10:18   ` Hannes Reinecke
  2017-03-29 10:27   ` Jinpu Wang
  2017-03-29 11:37 ` [PATCH 0/2] Fix sysfs recursive removal splats in isci James Bottomley
  2017-03-29 15:23 ` Tejun Heo
  7 siblings, 2 replies; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29  9:41 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist, Johannes Thumshirn

After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
removal path of kernfs to make it recursive we have to remove the SAS host
before the SCSI host or we will see sysfs warnings on not found sysfs groups for
kobjects.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/pm8001/pm8001_init.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index 417368c..9116e9c 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -1086,11 +1086,21 @@ static void pm8001_pci_remove(struct pci_dev *pdev)
 {
 	struct sas_ha_struct *sha = pci_get_drvdata(pdev);
 	struct pm8001_hba_info *pm8001_ha;
+	unsigned long flags;
+	struct Scsi_Host *shost;
 	int i, j;
 	pm8001_ha = sha->lldd_ha;
-	scsi_remove_host(pm8001_ha->shost);
+	shost = pm8001_ha->shost;
+
+	spin_lock_irqsave(shost->host_lock, flags);
+	if (scsi_host_set_state(shost, SHOST_CANCEL))
+		WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	sas_unregister_ha(sha);
-	sas_remove_host(pm8001_ha->shost);
+	sas_remove_host(shost);
+	scsi_remove_host(shost);
+
 	list_del(&pm8001_ha->list);
 	PM8001_CHIP_DISP->interrupt_disable(pm8001_ha, 0xFF);
 	PM8001_CHIP_DISP->chip_soft_rst(pm8001_ha);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] scsi: isci: remove the SAS host after the SCSI host
  2017-03-29  9:41 ` [PATCH 2/2] scsi: isci: remove the SAS host after the SCSI host Johannes Thumshirn
@ 2017-03-29 10:17   ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2017-03-29 10:17 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Linux SCSI Mailinglist, Linux Kernel Mailinglist

On 03/29/2017 11:41 AM, Johannes Thumshirn wrote:
> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
> removal path of kernfs to make it recursive we have to remove the SAS host
> before the SCSI host or we will see sysfs warnings like the below when
> triggering the the removal of the SAS HBA PCI device like with writing to the
> sysfs pci remove file:
> 
> echo 1 > /sys/module/isci/drivers/pci:isci/<device>/remove
> 
> WARNING: CPU: 2 PID: 5 at fs/sysfs/group.c:241 sysfs_remove_group+0xc3/0xd0
> sysfs group 'power' not found for kobject 'end_device-6:0'
> CPU: 16 PID: 5884 Comm: echo Not tainted 4.11.0-rc3-libsas+ #504
> Call Trace:
>  dump_stack+0x85/0xc2
>  __warn+0xc6/0xe0
>  warn_slowpath_fmt+0x4a/0x50
>  sysfs_remove_group+0xc3/0xd0
>  dpm_sysfs_remove+0x52/0x60
>  device_del+0x13c/0x360
>  ? device_remove_file+0x14/0x20
>  attribute_container_class_device_del+0x15/0x20
>  transport_remove_classdev+0x4c/0x60
>  ? transport_add_class_device+0x40/0x40
>  attribute_container_device_trigger+0xb3/0xc0
>  transport_remove_device+0x10/0x20
>  sas_port_delete+0x12d/0x160 [scsi_transport_sas]
>  sas_deform_port+0x1bf/0x1d0 [libsas]
>  sas_unregister_ports+0x36/0x50 [libsas]
>  sas_unregister_ha+0x1b/0x40 [libsas]
>  isci_unregister+0x2a/0x40 [isci]
>  isci_pci_remove+0x52/0xb0 [isci]
>  ? __pm_runtime_resume+0x56/0x80
>  pci_device_remove+0x34/0xb0
>  device_release_driver_internal+0x158/0x210
>  device_release_driver+0xd/0x10
>  pci_stop_bus_device+0x85/0x90
>  pci_stop_and_remove_bus_device_locked+0x15/0x30
>  remove_store+0x59/0x70
>  dev_attr_store+0x13/0x20
>  sysfs_kf_write+0x40/0x50
>  kernfs_fop_write+0x130/0x1b0
>  __vfs_write+0x23/0x130
>  ? rcu_read_lock_sched_held+0x6d/0x80
>  ? rcu_sync_lockdep_assert+0x2a/0x50
>  ? __sb_start_write+0xd7/0x1e0
>  ? vfs_write+0x1a4/0x1f0
>  vfs_write+0xc6/0x1f0
>  SyS_write+0x44/0xa0
>  entry_SYSCALL_64_fastpath+0x23/0xc6
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/isci/init.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] aic94xx: remove the SAS host after the SCSI host
  2017-03-29  9:41 ` [PATCH 3/6] aic94xx: " Johannes Thumshirn
@ 2017-03-29 10:17   ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2017-03-29 10:17 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Linux SCSI Mailinglist, Linux Kernel Mailinglist

On 03/29/2017 11:41 AM, Johannes Thumshirn wrote:
> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
> removal path of kernfs to make it recursive we have to remove the SAS host
> before the SCSI host or we will see sysfs warnings on not found sysfs groups for
> kobjects.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/aic94xx/aic94xx_init.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] scsi: hisi_sas: remove the SAS host after the SCSI host
  2017-03-29  9:41 ` [PATCH 4/6] scsi: hisi_sas: " Johannes Thumshirn
@ 2017-03-29 10:17   ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2017-03-29 10:17 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Linux SCSI Mailinglist, Linux Kernel Mailinglist

On 03/29/2017 11:41 AM, Johannes Thumshirn wrote:
> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
> removal path of kernfs to make it recursive we have to remove the SAS host
> before the SCSI host or we will see sysfs warnings on not found sysfs groups for
> kobjects.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/hisi_sas/hisi_sas_main.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] mvsas: remove the SAS host after the SCSI host
  2017-03-29  9:41 ` [PATCH 5/6] mvsas: " Johannes Thumshirn
@ 2017-03-29 10:18   ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2017-03-29 10:18 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Linux SCSI Mailinglist, Linux Kernel Mailinglist

On 03/29/2017 11:41 AM, Johannes Thumshirn wrote:
> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
> removal path of kernfs to make it recursive we have to remove the SAS host
> before the SCSI host or we will see sysfs warnings on not found sysfs groups for
> kobjects.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/mvsas/mv_init.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/6] scsi: pm8001: remove the SAS host after the SCSI host
  2017-03-29  9:41 ` [PATCH 6/6] scsi: pm8001: " Johannes Thumshirn
@ 2017-03-29 10:18   ` Hannes Reinecke
  2017-03-29 10:27   ` Jinpu Wang
  1 sibling, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2017-03-29 10:18 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, John Garry, Jack Wang,
	Linux SCSI Mailinglist, Linux Kernel Mailinglist

On 03/29/2017 11:41 AM, Johannes Thumshirn wrote:
> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
> removal path of kernfs to make it recursive we have to remove the SAS host
> before the SCSI host or we will see sysfs warnings on not found sysfs groups for
> kobjects.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/pm8001/pm8001_init.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/6] scsi: pm8001: remove the SAS host after the SCSI host
  2017-03-29  9:41 ` [PATCH 6/6] scsi: pm8001: " Johannes Thumshirn
  2017-03-29 10:18   ` Hannes Reinecke
@ 2017-03-29 10:27   ` Jinpu Wang
  2017-03-29 10:30     ` Hannes Reinecke
  1 sibling, 1 reply; 23+ messages in thread
From: Jinpu Wang @ 2017-03-29 10:27 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Martin K . Petersen, Tejun Heo, James Bottomley, Dan Williams,
	John Garry, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On Wed, Mar 29, 2017 at 11:41 AM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
> removal path of kernfs to make it recursive we have to remove the SAS host
> before the SCSI host or we will see sysfs warnings on not found sysfs groups for
> kobjects.
>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/pm8001/pm8001_init.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
> index 417368c..9116e9c 100644
> --- a/drivers/scsi/pm8001/pm8001_init.c
> +++ b/drivers/scsi/pm8001/pm8001_init.c
> @@ -1086,11 +1086,21 @@ static void pm8001_pci_remove(struct pci_dev *pdev)
>  {
>         struct sas_ha_struct *sha = pci_get_drvdata(pdev);
>         struct pm8001_hba_info *pm8001_ha;
> +       unsigned long flags;
> +       struct Scsi_Host *shost;
>         int i, j;
>         pm8001_ha = sha->lldd_ha;
> -       scsi_remove_host(pm8001_ha->shost);
> +       shost = pm8001_ha->shost;
> +
> +       spin_lock_irqsave(shost->host_lock, flags);
> +       if (scsi_host_set_state(shost, SHOST_CANCEL))
> +               WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
> +       spin_unlock_irqrestore(shost->host_lock, flags);
> +
>         sas_unregister_ha(sha);
> -       sas_remove_host(pm8001_ha->shost);
> +       sas_remove_host(shost);
> +       scsi_remove_host(shost);
> +
>         list_del(&pm8001_ha->list);
>         PM8001_CHIP_DISP->interrupt_disable(pm8001_ha, 0xFF);
>         PM8001_CHIP_DISP->chip_soft_rst(pm8001_ha);
> --
> 1.8.5.6
>
Thanks Johannes for taking care of this. Looks good to me,

I have a question regarding the scsi_host_set_state change, why do we need that?
Can't we simply change the order of sas_remove_host and scsi_remove_host?

Cheers?
-- 
Jack Wang

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/6] scsi: pm8001: remove the SAS host after the SCSI host
  2017-03-29 10:27   ` Jinpu Wang
@ 2017-03-29 10:30     ` Hannes Reinecke
  0 siblings, 0 replies; 23+ messages in thread
From: Hannes Reinecke @ 2017-03-29 10:30 UTC (permalink / raw)
  To: Jinpu Wang, Johannes Thumshirn
  Cc: Martin K . Petersen, Tejun Heo, James Bottomley, Dan Williams,
	John Garry, Linux SCSI Mailinglist, Linux Kernel Mailinglist

On 03/29/2017 12:27 PM, Jinpu Wang wrote:
> On Wed, Mar 29, 2017 at 11:41 AM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
>> After commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive") changed the
>> removal path of kernfs to make it recursive we have to remove the SAS host
>> before the SCSI host or we will see sysfs warnings on not found sysfs groups for
>> kobjects.
>>
>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>> ---
>>  drivers/scsi/pm8001/pm8001_init.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
>> index 417368c..9116e9c 100644
>> --- a/drivers/scsi/pm8001/pm8001_init.c
>> +++ b/drivers/scsi/pm8001/pm8001_init.c
>> @@ -1086,11 +1086,21 @@ static void pm8001_pci_remove(struct pci_dev *pdev)
>>  {
>>         struct sas_ha_struct *sha = pci_get_drvdata(pdev);
>>         struct pm8001_hba_info *pm8001_ha;
>> +       unsigned long flags;
>> +       struct Scsi_Host *shost;
>>         int i, j;
>>         pm8001_ha = sha->lldd_ha;
>> -       scsi_remove_host(pm8001_ha->shost);
>> +       shost = pm8001_ha->shost;
>> +
>> +       spin_lock_irqsave(shost->host_lock, flags);
>> +       if (scsi_host_set_state(shost, SHOST_CANCEL))
>> +               WARN_ON(scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY));
>> +       spin_unlock_irqrestore(shost->host_lock, flags);
>> +
>>         sas_unregister_ha(sha);
>> -       sas_remove_host(pm8001_ha->shost);
>> +       sas_remove_host(shost);
>> +       scsi_remove_host(shost);
>> +
>>         list_del(&pm8001_ha->list);
>>         PM8001_CHIP_DISP->interrupt_disable(pm8001_ha, 0xFF);
>>         PM8001_CHIP_DISP->chip_soft_rst(pm8001_ha);
>> --
>> 1.8.5.6
>>
> Thanks Johannes for taking care of this. Looks good to me,
> 
> I have a question regarding the scsi_host_set_state change, why do we need that?
> Can't we simply change the order of sas_remove_host and scsi_remove_host?
> 
If we don't do that I/O might still be coming in for the attached ports
while we're trying to remove the same. Which might cause all sorts of
race conditions.
So it's better to set the host state to CANCEL to stop I/O before
removing the ports.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29  9:41 ` [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister Johannes Thumshirn
@ 2017-03-29 11:15   ` John Garry
  2017-03-29 11:29     ` Johannes Thumshirn
  0 siblings, 1 reply; 23+ messages in thread
From: John Garry @ 2017-03-29 11:15 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, James Bottomley, Dan Williams, Jack Wang,
	Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On 29/03/2017 10:41, Johannes Thumshirn wrote:
> In the advent of an SAS device unregister we have to wait for all destruct
> works to be done to not accidently delay deletion of a SAS rphy or it's
> children to the point when we're removing the SCSI or SAS hosts.
>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/scsi/libsas/sas_discover.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> index 60de662..75b18f1 100644
> --- a/drivers/scsi/libsas/sas_discover.c
> +++ b/drivers/scsi/libsas/sas_discover.c
> @@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
>  	}
>
>  	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> +		struct sas_discovery *disc = &dev->port->disc;
> +		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> +
>  		sas_rphy_unlink(dev->rphy);
>  		list_move_tail(&dev->disco_list_node, &port->destroy_list);
>  		sas_discover_event(dev->port, DISCE_DESTRUCT);
> +		flush_work(&sw->work);

I quickly tested plugging out the expander and we never get past this 
call to flush - a hang results:

root@(none)$ [  243.357088] INFO: task kworker/u32:1:106 blocked for 
more than 120 seconds.
[  243.364030]       Not tainted 4.11.0-rc1-13687-g2562e6a-dirty #1388
[  243.370282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  243.378086] kworker/u32:1   D    0   106      2 0x00000000
[  243.383566] Workqueue: scsi_wq_0 sas_phye_loss_of_signal
[  243.388863] Call trace:
[  243.391314] [<ffff000008085d70>] __switch_to+0xa4/0xb0
[  243.396442] [<ffff0000088f1134>] __schedule+0x1b4/0x5d0
[  243.401654] [<ffff0000088f1588>] schedule+0x38/0x9c
[  243.406520] [<ffff0000088f4540>] schedule_timeout+0x194/0x294
[  243.412249] [<ffff0000088f202c>] wait_for_common+0xb0/0x144
[  243.417805] [<ffff0000088f20d4>] wait_for_completion+0x14/0x1c
[  243.423623] [<ffff0000080d5bd4>] flush_work+0xe0/0x1a8
[  243.428747] [<ffff000008598158>] sas_unregister_dev+0xf8/0x110
[  243.434563] [<ffff000008598304>] sas_unregister_domain_devices+0x4c/0xc8
[  243.441242] [<ffff000008596884>] sas_deform_port+0x14c/0x15c
[  243.446886] [<ffff000008596508>] sas_phye_loss_of_signal+0x48/0x54
[  243.453048] [<ffff0000080d6164>] process_one_work+0x138/0x2d8
[  243.458776] [<ffff0000080d635c>] worker_thread+0x58/0x424
[  243.464161] [<ffff0000080dc16c>] kthread+0xf4/0x120
[  243.469024] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
[  364.189094] INFO: task kworker/u32:1:106 blocked for more than 120 
seconds.
[  364.196035]       Not tainted 4.11.0-rc1-13687-g2562e6a-dirty #1388
[  364.202281] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  364.210085] kworker/u32:1   D    0   106      2 0x00000000
[  364.215558] Workqueue: scsi_wq_0 sas_phye_loss_of_signal
[  364.220855] Call trace:
[  364.223303] [<ffff000008085d70>] __switch_to+0xa4/0xb0
[  364.228428] [<ffff0000088f1134>] __schedule+0x1b4/0x5d0
[  364.233640] [<ffff0000088f1588>] schedule+0x38/0x9c
[  364.238506] [<ffff0000088f4540>] schedule_timeout+0x194/0x294
[  364.244237] [<ffff0000088f202c>] wait_for_common+0xb0/0x144
[  364.249793] [<ffff0000088f20d4>] wait_for_completion+0x14/0x1c
[  364.255610] [<ffff0000080d5bd4>] flush_work+0xe0/0x1a8
[  364.260736] [<ffff000008598158>] sas_unregister_dev+0xf8/0x110
[  364.266551] [<ffff000008598304>] sas_unregister_domain_devices+0x4c/0xc8
[  364.273230] [<ffff000008596884>] sas_deform_port+0x14c/0x15c
[  364.278872] [<ffff000008596508>] sas_phye_loss_of_signal+0x48/0x54
[  364.285034] [<ffff0000080d6164>] process_one_work+0x138/0x2d8
[  364.290763] [<ffff0000080d635c>] worker_thread+0x58/0x424
[  364.296147] [<ffff0000080dc16c>] kthread+0xf4/0x120
[  364.301013] [<ffff0000080836c0>] ret_from_fork+0x10/0x50

Is the issue that we are trying to flush the queue when we are working 
in the same queue context?

Thanks,
John

>  	}
>  }
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29 11:15   ` John Garry
@ 2017-03-29 11:29     ` Johannes Thumshirn
  2017-03-29 11:53       ` John Garry
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29 11:29 UTC (permalink / raw)
  To: John Garry
  Cc: Martin K . Petersen, Tejun Heo, James Bottomley, Dan Williams,
	Jack Wang, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
> On 29/03/2017 10:41, Johannes Thumshirn wrote:
> >In the advent of an SAS device unregister we have to wait for all destruct
> >works to be done to not accidently delay deletion of a SAS rphy or it's
> >children to the point when we're removing the SCSI or SAS hosts.
> >
> >Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> >---
> > drivers/scsi/libsas/sas_discover.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> >diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> >index 60de662..75b18f1 100644
> >--- a/drivers/scsi/libsas/sas_discover.c
> >+++ b/drivers/scsi/libsas/sas_discover.c
> >@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
> > 	}
> >
> > 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> >+		struct sas_discovery *disc = &dev->port->disc;
> >+		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> >+
> > 		sas_rphy_unlink(dev->rphy);
> > 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
> > 		sas_discover_event(dev->port, DISCE_DESTRUCT);
> >+		flush_work(&sw->work);
> 
> I quickly tested plugging out the expander and we never get past this call
> to flush - a hang results:

Can you activat lockdep so we can see which lock it is that we're blocking on?

It's most likely in sas_unregister_common_dev() but this function takes two spin
locks, port->dev_list_lock and ha->lock.

Thanks a lot,
       Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] Fix sysfs recursive removal splats in isci
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
                   ` (5 preceding siblings ...)
  2017-03-29  9:41 ` [PATCH 6/6] scsi: pm8001: " Johannes Thumshirn
@ 2017-03-29 11:37 ` James Bottomley
  2017-03-29 15:23 ` Tejun Heo
  7 siblings, 0 replies; 23+ messages in thread
From: James Bottomley @ 2017-03-29 11:37 UTC (permalink / raw)
  To: Johannes Thumshirn, Martin K . Petersen
  Cc: Tejun Heo, Dan Williams, John Garry, Jack Wang, Hannes Reinecke,
	Linux SCSI Mailinglist, Linux Kernel Mailinglist

On Wed, 2017-03-29 at 11:41 +0200, Johannes Thumshirn wrote:
> This series fixes a sysfs warning caused by isci not being able to 
> cope with recursive sysfs path removals which are in place since 
> commit bcdde7e ("sysfs: make __sysfs_remove_dir() recursive").
> 
> The mvsas, aic94xx and pm8001 and hisi_sas patches have been compile 
> tested only hence they have no callstack of the affected path in 
> their changelogs.
> 
> I'm not sure whether to mark this patches as stable or not. I tend to 
> say no here, although we've seen complaints/bug reports on lkml and 
> the scsi list.

What happens to the SYNC CACHE for devices with write back caches?  It
looks like you've already torn down most of the sas objects by the time
they're sent, so do they actually reach the device (or worse, do they
hang the system by not making progress)?

Assuming the above is OK, what about putting the state change inside
sas_remove_ha()?  It's better than making every driver do it.

James

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29 11:29     ` Johannes Thumshirn
@ 2017-03-29 11:53       ` John Garry
  2017-03-29 12:26         ` Johannes Thumshirn
  0 siblings, 1 reply; 23+ messages in thread
From: John Garry @ 2017-03-29 11:53 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Martin K . Petersen, Tejun Heo, James Bottomley, Dan Williams,
	Jack Wang, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On 29/03/2017 12:29, Johannes Thumshirn wrote:
> On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
>> On 29/03/2017 10:41, Johannes Thumshirn wrote:
>>> In the advent of an SAS device unregister we have to wait for all destruct
>>> works to be done to not accidently delay deletion of a SAS rphy or it's
>>> children to the point when we're removing the SCSI or SAS hosts.
>>>
>>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>>> ---
>>> drivers/scsi/libsas/sas_discover.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
>>> index 60de662..75b18f1 100644
>>> --- a/drivers/scsi/libsas/sas_discover.c
>>> +++ b/drivers/scsi/libsas/sas_discover.c
>>> @@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
>>> 	}
>>>
>>> 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
>>> +		struct sas_discovery *disc = &dev->port->disc;
>>> +		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
>>> +
>>> 		sas_rphy_unlink(dev->rphy);
>>> 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
>>> 		sas_discover_event(dev->port, DISCE_DESTRUCT);
>>> +		flush_work(&sw->work);
>>
>> I quickly tested plugging out the expander and we never get past this call
>> to flush - a hang results:
>
> Can you activat lockdep so we can see which lock it is that we're blocking on?
>

I have it on:
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y

> It's most likely in sas_unregister_common_dev() but this function takes two spin
> locks, port->dev_list_lock and ha->lock.
>

We can see from the callstack I provided that we're working in workqueue 
scsi_wq_0 and trying to flush that same queue.

Much appreciated,
John

> Thanks a lot,
>        Johannes
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29 11:53       ` John Garry
@ 2017-03-29 12:26         ` Johannes Thumshirn
  2017-03-29 12:36           ` Jinpu Wang
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29 12:26 UTC (permalink / raw)
  To: John Garry
  Cc: Martin K . Petersen, Tejun Heo, James Bottomley, Dan Williams,
	Jack Wang, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On Wed, Mar 29, 2017 at 12:53:28PM +0100, John Garry wrote:
> On 29/03/2017 12:29, Johannes Thumshirn wrote:
> >On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
> >>On 29/03/2017 10:41, Johannes Thumshirn wrote:
> >>>In the advent of an SAS device unregister we have to wait for all destruct
> >>>works to be done to not accidently delay deletion of a SAS rphy or it's
> >>>children to the point when we're removing the SCSI or SAS hosts.
> >>>
> >>>Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> >>>---
> >>>drivers/scsi/libsas/sas_discover.c | 4 ++++
> >>>1 file changed, 4 insertions(+)
> >>>
> >>>diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> >>>index 60de662..75b18f1 100644
> >>>--- a/drivers/scsi/libsas/sas_discover.c
> >>>+++ b/drivers/scsi/libsas/sas_discover.c
> >>>@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
> >>>	}
> >>>
> >>>	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> >>>+		struct sas_discovery *disc = &dev->port->disc;
> >>>+		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> >>>+
> >>>		sas_rphy_unlink(dev->rphy);
> >>>		list_move_tail(&dev->disco_list_node, &port->destroy_list);
> >>>		sas_discover_event(dev->port, DISCE_DESTRUCT);
> >>>+		flush_work(&sw->work);
> >>
> >>I quickly tested plugging out the expander and we never get past this call
> >>to flush - a hang results:
> >
> >Can you activat lockdep so we can see which lock it is that we're blocking on?
> >
> 
> I have it on:
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_LOCKD=y
> CONFIG_LOCKD_V4=y
> 
> >It's most likely in sas_unregister_common_dev() but this function takes two spin
> >locks, port->dev_list_lock and ha->lock.
> >
> 
> We can see from the callstack I provided that we're working in workqueue
> scsi_wq_0 and trying to flush that same queue.

Aaahh, now I get what's happening (with some kicks^Whelp from Hannes I admit).

The sas_unregister_dev() comes from the work queued by notify_phy_event(). So this patch must be
replaced by (untested):

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index cdbb293..e1e6492 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -375,6 +375,7 @@ void sas_remove_children(struct device *dev)
  */
 void sas_remove_host(struct Scsi_Host *shost)
 {
+       scsi_flush_work(shost);
        sas_remove_children(&shost->shost_gendev);
 }
 EXPORT_SYMBOL(sas_remove_host);

John, mind giving that one a shot in your test setup as well?

Thanks,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29 12:26         ` Johannes Thumshirn
@ 2017-03-29 12:36           ` Jinpu Wang
  2017-03-29 12:47             ` Johannes Thumshirn
  0 siblings, 1 reply; 23+ messages in thread
From: Jinpu Wang @ 2017-03-29 12:36 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: John Garry, Martin K . Petersen, Tejun Heo, James Bottomley,
	Dan Williams, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On Wed, Mar 29, 2017 at 2:26 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> On Wed, Mar 29, 2017 at 12:53:28PM +0100, John Garry wrote:
>> On 29/03/2017 12:29, Johannes Thumshirn wrote:
>> >On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
>> >>On 29/03/2017 10:41, Johannes Thumshirn wrote:
>> >>>In the advent of an SAS device unregister we have to wait for all destruct
>> >>>works to be done to not accidently delay deletion of a SAS rphy or it's
>> >>>children to the point when we're removing the SCSI or SAS hosts.
>> >>>
>> >>>Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>> >>>---
>> >>>drivers/scsi/libsas/sas_discover.c | 4 ++++
>> >>>1 file changed, 4 insertions(+)
>> >>>
>> >>>diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
>> >>>index 60de662..75b18f1 100644
>> >>>--- a/drivers/scsi/libsas/sas_discover.c
>> >>>+++ b/drivers/scsi/libsas/sas_discover.c
>> >>>@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
>> >>>   }
>> >>>
>> >>>   if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
>> >>>+          struct sas_discovery *disc = &dev->port->disc;
>> >>>+          struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
>> >>>+
>> >>>           sas_rphy_unlink(dev->rphy);
>> >>>           list_move_tail(&dev->disco_list_node, &port->destroy_list);
>> >>>           sas_discover_event(dev->port, DISCE_DESTRUCT);
>> >>>+          flush_work(&sw->work);
>> >>
>> >>I quickly tested plugging out the expander and we never get past this call
>> >>to flush - a hang results:
>> >
>> >Can you activat lockdep so we can see which lock it is that we're blocking on?
>> >
>>
>> I have it on:
>> CONFIG_LOCKDEP_SUPPORT=y
>> CONFIG_LOCKD=y
>> CONFIG_LOCKD_V4=y
>>
>> >It's most likely in sas_unregister_common_dev() but this function takes two spin
>> >locks, port->dev_list_lock and ha->lock.
>> >
>>
>> We can see from the callstack I provided that we're working in workqueue
>> scsi_wq_0 and trying to flush that same queue.
>
> Aaahh, now I get what's happening (with some kicks^Whelp from Hannes I admit).
>
> The sas_unregister_dev() comes from the work queued by notify_phy_event(). So this patch must be
> replaced by (untested):
>
> diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
> index cdbb293..e1e6492 100644
> --- a/drivers/scsi/scsi_transport_sas.c
> +++ b/drivers/scsi/scsi_transport_sas.c
> @@ -375,6 +375,7 @@ void sas_remove_children(struct device *dev)
>   */
>  void sas_remove_host(struct Scsi_Host *shost)
>  {
> +       scsi_flush_work(shost);
>         sas_remove_children(&shost->shost_gendev);
>  }
>  EXPORT_SYMBOL(sas_remove_host);
>
> John, mind giving that one a shot in your test setup as well?
>
> Thanks,
>         Johannes
>
> --
> Johannes Thumshirn                                          Storage
> jthumshirn@suse.de                                +49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Haha, I have same idea :)
Have no test env, so if John could test it, it will be great.
-- 
Jack Wang
Linux Kernel Developer

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin

Tel:       +49 30 577 008  042
Fax:      +49 30 577 008 299
Email:    jinpu.wang@profitbricks.com
URL:      https://www.profitbricks.de

Sitz der Gesellschaft: Berlin
Registergericht: Amtsgericht Charlottenburg, HRB 125506 B
Geschäftsführer: Achim Weiss

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29 12:36           ` Jinpu Wang
@ 2017-03-29 12:47             ` Johannes Thumshirn
  2017-03-29 12:51               ` Johannes Thumshirn
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29 12:47 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: John Garry, Martin K . Petersen, Tejun Heo, James Bottomley,
	Dan Williams, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On Wed, Mar 29, 2017 at 02:36:11PM +0200, Jinpu Wang wrote:
> On Wed, Mar 29, 2017 at 2:26 PM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> > On Wed, Mar 29, 2017 at 12:53:28PM +0100, John Garry wrote:
> >> On 29/03/2017 12:29, Johannes Thumshirn wrote:
> >> >On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
> >> >>On 29/03/2017 10:41, Johannes Thumshirn wrote:
> >> >>>In the advent of an SAS device unregister we have to wait for all destruct
> >> >>>works to be done to not accidently delay deletion of a SAS rphy or it's
> >> >>>children to the point when we're removing the SCSI or SAS hosts.
> >> >>>
> >> >>>Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> >> >>>---
> >> >>>drivers/scsi/libsas/sas_discover.c | 4 ++++
> >> >>>1 file changed, 4 insertions(+)
> >> >>>
> >> >>>diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> >> >>>index 60de662..75b18f1 100644
> >> >>>--- a/drivers/scsi/libsas/sas_discover.c
> >> >>>+++ b/drivers/scsi/libsas/sas_discover.c
> >> >>>@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
> >> >>>   }
> >> >>>
> >> >>>   if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> >> >>>+          struct sas_discovery *disc = &dev->port->disc;
> >> >>>+          struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> >> >>>+
> >> >>>           sas_rphy_unlink(dev->rphy);
> >> >>>           list_move_tail(&dev->disco_list_node, &port->destroy_list);
> >> >>>           sas_discover_event(dev->port, DISCE_DESTRUCT);
> >> >>>+          flush_work(&sw->work);
> >> >>
> >> >>I quickly tested plugging out the expander and we never get past this call
> >> >>to flush - a hang results:
> >> >
> >> >Can you activat lockdep so we can see which lock it is that we're blocking on?
> >> >
> >>
> >> I have it on:
> >> CONFIG_LOCKDEP_SUPPORT=y
> >> CONFIG_LOCKD=y
> >> CONFIG_LOCKD_V4=y
> >>
> >> >It's most likely in sas_unregister_common_dev() but this function takes two spin
> >> >locks, port->dev_list_lock and ha->lock.
> >> >
> >>
> >> We can see from the callstack I provided that we're working in workqueue
> >> scsi_wq_0 and trying to flush that same queue.
> >
> > Aaahh, now I get what's happening (with some kicks^Whelp from Hannes I admit).
> >
> > The sas_unregister_dev() comes from the work queued by notify_phy_event(). So this patch must be
> > replaced by (untested):
> >
> > diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
> > index cdbb293..e1e6492 100644
> > --- a/drivers/scsi/scsi_transport_sas.c
> > +++ b/drivers/scsi/scsi_transport_sas.c
> > @@ -375,6 +375,7 @@ void sas_remove_children(struct device *dev)
> >   */
> >  void sas_remove_host(struct Scsi_Host *shost)
> >  {
> > +       scsi_flush_work(shost);
> >         sas_remove_children(&shost->shost_gendev);
> >  }
> >  EXPORT_SYMBOL(sas_remove_host);
> >
> > John, mind giving that one a shot in your test setup as well?

Well, don't mind. It doesn't work in my test setup.

I'm back to the drawing board...

Anyways thanks,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister
  2017-03-29 12:47             ` Johannes Thumshirn
@ 2017-03-29 12:51               ` Johannes Thumshirn
  0 siblings, 0 replies; 23+ messages in thread
From: Johannes Thumshirn @ 2017-03-29 12:51 UTC (permalink / raw)
  To: Jinpu Wang
  Cc: John Garry, Martin K . Petersen, Tejun Heo, James Bottomley,
	Dan Williams, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

On Wed, Mar 29, 2017 at 02:47:54PM +0200, Johannes Thumshirn wrote:
> > > John, mind giving that one a shot in your test setup as well?
> 
> Well, don't mind. It doesn't work in my test setup.
> 
> I'm back to the drawing board...

Please ignore my last mail, apparently it's a wise idea to do a git stash pop
after a git stash m)

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/2] Fix sysfs recursive removal splats in isci
  2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
                   ` (6 preceding siblings ...)
  2017-03-29 11:37 ` [PATCH 0/2] Fix sysfs recursive removal splats in isci James Bottomley
@ 2017-03-29 15:23 ` Tejun Heo
  7 siblings, 0 replies; 23+ messages in thread
From: Tejun Heo @ 2017-03-29 15:23 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Martin K . Petersen, James Bottomley, Dan Williams, John Garry,
	Jack Wang, Hannes Reinecke, Linux SCSI Mailinglist,
	Linux Kernel Mailinglist

Hello,

On Wed, Mar 29, 2017 at 11:41:07AM +0200, Johannes Thumshirn wrote:
> This series fixes a sysfs warning caused by isci not being able to cope with
> recursive sysfs path removals which are in place since commit bcdde7e
> ("sysfs: make __sysfs_remove_dir() recursive").

Thanks for fixing these.

> The mvsas, aic94xx and pm8001 and hisi_sas patches have been compile tested
> only hence they have no callstack of the affected path in their changelogs.
> 
> I'm not sure whether to mark this patches as stable or not. I tend to say no
> here, although we've seen complaints/bug reports on lkml and the scsi list.

Given that the failures aren't critical or all that common (only
happens on controller removal), I agree that not cc'ing stable is the
right call here.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-03-29 15:23 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-29  9:41 [PATCH 0/2] Fix sysfs recursive removal splats in isci Johannes Thumshirn
2017-03-29  9:41 ` [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister Johannes Thumshirn
2017-03-29 11:15   ` John Garry
2017-03-29 11:29     ` Johannes Thumshirn
2017-03-29 11:53       ` John Garry
2017-03-29 12:26         ` Johannes Thumshirn
2017-03-29 12:36           ` Jinpu Wang
2017-03-29 12:47             ` Johannes Thumshirn
2017-03-29 12:51               ` Johannes Thumshirn
2017-03-29  9:41 ` [PATCH 2/2] scsi: isci: remove the SAS host after the SCSI host Johannes Thumshirn
2017-03-29 10:17   ` Hannes Reinecke
2017-03-29  9:41 ` [PATCH 3/6] aic94xx: " Johannes Thumshirn
2017-03-29 10:17   ` Hannes Reinecke
2017-03-29  9:41 ` [PATCH 4/6] scsi: hisi_sas: " Johannes Thumshirn
2017-03-29 10:17   ` Hannes Reinecke
2017-03-29  9:41 ` [PATCH 5/6] mvsas: " Johannes Thumshirn
2017-03-29 10:18   ` Hannes Reinecke
2017-03-29  9:41 ` [PATCH 6/6] scsi: pm8001: " Johannes Thumshirn
2017-03-29 10:18   ` Hannes Reinecke
2017-03-29 10:27   ` Jinpu Wang
2017-03-29 10:30     ` Hannes Reinecke
2017-03-29 11:37 ` [PATCH 0/2] Fix sysfs recursive removal splats in isci James Bottomley
2017-03-29 15:23 ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.