All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading
@ 2020-02-05 21:44 mwilck
  2020-02-05 21:44 ` [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped mwilck
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: mwilck @ 2020-02-05 21:44 UTC (permalink / raw)
  To: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke
  Cc: Bart Van Assche, Martin Wilck, Daniel Wagner, James Bottomley,
	linux-scsi

From: Martin Wilck <mwilck@suse.com>

Hello Martin, Himanshu, all,

here is v2 of the little series I first submitted on Nov 29, 2019.

Changes wrt v2:
 - Added patch 3 to set the UNLOADING flag before waiting for sessions
   to end (Roman)
 - Use test_and_set_bit() for UNLOADING (Hannes)

Martin Wilck (3):
  scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  scsi: qla2xxx: don't shut down firmware before closing sessions
  scsi: qla2xxx: set UNLOADING before waiting for session deletion

 drivers/scsi/qla2xxx/qla_mbx.c |  3 +++
 drivers/scsi/qla2xxx/qla_os.c  | 39 +++++++++++++++++-----------------
 2 files changed, 22 insertions(+), 20 deletions(-)

-- 
2.25.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-02-05 21:44 [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading mwilck
@ 2020-02-05 21:44 ` mwilck
  2020-02-06 12:33   ` Daniel Wagner
  2020-03-24 23:51   ` Arun Easi
  2020-02-05 21:44 ` [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions mwilck
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 24+ messages in thread
From: mwilck @ 2020-02-05 21:44 UTC (permalink / raw)
  To: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke
  Cc: Bart Van Assche, Martin Wilck, Daniel Wagner, James Bottomley,
	linux-scsi

From: Martin Wilck <mwilck@suse.com>

Since commit 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip"),
it is possible that FC commands are scheduled after the adapter firmware
has been shut down. IO sent to the firmware in this situation hangs
indefinitely. Avoid this for the LOGO code path that is typically taken
when adapters are shut down.

Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
---
 drivers/scsi/qla2xxx/qla_mbx.c | 3 +++
 drivers/scsi/qla2xxx/qla_os.c  | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index 9e09964..53129f2 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -2644,6 +2644,9 @@ qla24xx_fabric_logout(scsi_qla_host_t *vha, uint16_t loop_id, uint8_t domain,
 	ql_dbg(ql_dbg_mbx + ql_dbg_verbose, vha, 0x106d,
 	    "Entered %s.\n", __func__);
 
+	if (!ha->flags.fw_started)
+		return QLA_FUNCTION_FAILED;
+
 	lg = dma_pool_zalloc(ha->s_dma_pool, GFP_KERNEL, &lg_dma);
 	if (lg == NULL) {
 		ql_log(ql_log_warn, vha, 0x106e,
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index b520a98..2dafb46 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4878,6 +4878,9 @@ qla2x00_post_work(struct scsi_qla_host *vha, struct qla_work_evt *e)
 	unsigned long flags;
 	bool q = false;
 
+	if (!vha->hw->flags.fw_started)
+		return QLA_FUNCTION_FAILED;
+
 	spin_lock_irqsave(&vha->work_lock, flags);
 	list_add_tail(&e->list, &vha->work_list);
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions
  2020-02-05 21:44 [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading mwilck
  2020-02-05 21:44 ` [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped mwilck
@ 2020-02-05 21:44 ` mwilck
  2020-02-06 12:42   ` Daniel Wagner
                     ` (2 more replies)
  2020-02-05 21:44 ` [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
  2020-03-09 16:44 ` [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading Martin Wilck
  3 siblings, 3 replies; 24+ messages in thread
From: mwilck @ 2020-02-05 21:44 UTC (permalink / raw)
  To: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke
  Cc: Bart Van Assche, Martin Wilck, Daniel Wagner, James Bottomley,
	linux-scsi

From: Martin Wilck <mwilck@suse.com>

Since 45235022da99, the firmware is shut down early in the controller
shutdown process. This causes commands sent to the firmware (such as LOGO)
to hang forever. Eventually one or more timeouts will be triggered.
Move the stopping of the firmware until after sessions have terminated.

Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/qla2xxx/qla_os.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 2dafb46..e81080d 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3720,6 +3720,16 @@ qla2x00_remove_one(struct pci_dev *pdev)
 	}
 	qla2x00_wait_for_hba_ready(base_vha);
 
+	qla2x00_wait_for_sess_deletion(base_vha);
+
+	/*
+	 * if UNLOAD flag is already set, then continue unload,
+	 * where it was set first.
+	 */
+	if (test_bit(UNLOADING, &base_vha->dpc_flags))
+		return;
+
+	set_bit(UNLOADING, &base_vha->dpc_flags);
 	if (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) ||
 	    IS_QLA28XX(ha)) {
 		if (ha->flags.fw_started)
@@ -3736,17 +3746,6 @@ qla2x00_remove_one(struct pci_dev *pdev)
 		qla2x00_try_to_stop_firmware(base_vha);
 	}
 
-	qla2x00_wait_for_sess_deletion(base_vha);
-
-	/*
-	 * if UNLOAD flag is already set, then continue unload,
-	 * where it was set first.
-	 */
-	if (test_bit(UNLOADING, &base_vha->dpc_flags))
-		return;
-
-	set_bit(UNLOADING, &base_vha->dpc_flags);
-
 	qla_nvme_delete(base_vha);
 
 	dma_free_coherent(&ha->pdev->dev,
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion
  2020-02-05 21:44 [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading mwilck
  2020-02-05 21:44 ` [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped mwilck
  2020-02-05 21:44 ` [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions mwilck
@ 2020-02-05 21:44 ` mwilck
  2020-02-06 12:46   ` Daniel Wagner
                     ` (2 more replies)
  2020-03-09 16:44 ` [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading Martin Wilck
  3 siblings, 3 replies; 24+ messages in thread
From: mwilck @ 2020-02-05 21:44 UTC (permalink / raw)
  To: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke
  Cc: Bart Van Assche, Martin Wilck, Daniel Wagner, James Bottomley,
	linux-scsi

From: Martin Wilck <mwilck@suse.com>

The purpose of the UNLOADING flag is to avoid port login procedures
to continue when a controller is in the process of shutting down.
It makes sense to set this flag before starting session teardown.
The only operations that must be able to continue are LOGO, PRLO,
and the like.

Furthermore, use atomic test_and_set_bit() to avoid the shutdown
being run multiple times in parallel. In qla2x00_disable_board_on_pci_error(),
the test for UNLOADING is postponed until after the check for an already
disabled PCI board.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/qla2xxx/qla_os.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index e81080d..8329f95 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3720,16 +3720,15 @@ qla2x00_remove_one(struct pci_dev *pdev)
 	}
 	qla2x00_wait_for_hba_ready(base_vha);
 
-	qla2x00_wait_for_sess_deletion(base_vha);
-
 	/*
-	 * if UNLOAD flag is already set, then continue unload,
+	 * if UNLOADING flag is already set, then continue unload,
 	 * where it was set first.
 	 */
-	if (test_bit(UNLOADING, &base_vha->dpc_flags))
+	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
 		return;
 
-	set_bit(UNLOADING, &base_vha->dpc_flags);
+	qla2x00_wait_for_sess_deletion(base_vha);
+
 	if (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) ||
 	    IS_QLA28XX(ha)) {
 		if (ha->flags.fw_started)
@@ -6046,13 +6045,6 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
 	struct pci_dev *pdev = ha->pdev;
 	scsi_qla_host_t *base_vha = pci_get_drvdata(ha->pdev);
 
-	/*
-	 * if UNLOAD flag is already set, then continue unload,
-	 * where it was set first.
-	 */
-	if (test_bit(UNLOADING, &base_vha->dpc_flags))
-		return;
-
 	ql_log(ql_log_warn, base_vha, 0x015b,
 	    "Disabling adapter.\n");
 
@@ -6063,9 +6055,14 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
 		return;
 	}
 
-	qla2x00_wait_for_sess_deletion(base_vha);
+	/*
+	 * if UNLOADING flag is already set, then continue unload,
+	 * where it was set first.
+	 */
+	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
+		return;
 
-	set_bit(UNLOADING, &base_vha->dpc_flags);
+	qla2x00_wait_for_sess_deletion(base_vha);
 
 	qla2x00_delete_all_vps(ha, base_vha);
 
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-02-05 21:44 ` [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped mwilck
@ 2020-02-06 12:33   ` Daniel Wagner
  2020-02-06 12:50     ` Martin Wilck
  2020-03-24 23:51   ` Arun Easi
  1 sibling, 1 reply; 24+ messages in thread
From: Daniel Wagner @ 2020-02-06 12:33 UTC (permalink / raw)
  To: mwilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche,
	James Bottomley, linux-scsi

Hi Martin,

On Wed, Feb 05, 2020 at 10:44:20PM +0100, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Since commit 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip"),
> it is possible that FC commands are scheduled after the adapter firmware
> has been shut down. IO sent to the firmware in this situation hangs
> indefinitely. Avoid this for the LOGO code path that is typically taken
> when adapters are shut down.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>

Just to understand it correctly: 45235022da99 ("scsi: qla2xxx: Fix
driver unload by shutting down chip") is saying FW is not able to
shutdown propably and therefore we kill it first and still try to do
some cleanup.

Are you sure you got all the necessary places fixed up?

I tend to say we should add

	if (!ha->flags.fw_started)
		return QLA_FUNCTION_FAILED;

do qla2x00_mailbox_command() and handle the errors one by one. Just an
idea.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions
  2020-02-05 21:44 ` [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions mwilck
@ 2020-02-06 12:42   ` Daniel Wagner
  2020-03-13 14:55   ` Roman Bolshakov
  2020-03-25  0:36   ` Arun Easi
  2 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2020-02-06 12:42 UTC (permalink / raw)
  To: mwilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche,
	James Bottomley, linux-scsi

On Wed, Feb 05, 2020 at 10:44:21PM +0100, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Since 45235022da99, the firmware is shut down early in the controller
> shutdown process. This causes commands sent to the firmware (such as LOGO)
> to hang forever. Eventually one or more timeouts will be triggered.
> Move the stopping of the firmware until after sessions have terminated.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Signed-off-by: Martin Wilck <mwilck@suse.com>

Reviewed-by: Daniel Wagner <dwagner@suse.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion
  2020-02-05 21:44 ` [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
@ 2020-02-06 12:46   ` Daniel Wagner
  2020-03-13 15:36   ` Roman Bolshakov
  2020-03-25  0:38   ` Arun Easi
  2 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2020-02-06 12:46 UTC (permalink / raw)
  To: mwilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche,
	James Bottomley, linux-scsi

On Wed, Feb 05, 2020 at 10:44:22PM +0100, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> The purpose of the UNLOADING flag is to avoid port login procedures
> to continue when a controller is in the process of shutting down.
> It makes sense to set this flag before starting session teardown.
> The only operations that must be able to continue are LOGO, PRLO,
> and the like.
> 
> Furthermore, use atomic test_and_set_bit() to avoid the shutdown
> being run multiple times in parallel. In qla2x00_disable_board_on_pci_error(),
> the test for UNLOADING is postponed until after the check for an already
> disabled PCI board.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>

Reviewed-by: Daniel Wagner <dwagner@suse.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-02-06 12:33   ` Daniel Wagner
@ 2020-02-06 12:50     ` Martin Wilck
  2020-02-06 13:03       ` Daniel Wagner
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-02-06 12:50 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche,
	James Bottomley, linux-scsi

On Thu, 2020-02-06 at 13:33 +0100, Daniel Wagner wrote:
> Hi Martin,
> 
> On Wed, Feb 05, 2020 at 10:44:20PM +0100, mwilck@suse.com wrote:
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > Since commit 45235022da99 ("scsi: qla2xxx: Fix driver unload by
> > shutting down chip"),
> > it is possible that FC commands are scheduled after the adapter
> > firmware
> > has been shut down. IO sent to the firmware in this situation hangs
> > indefinitely. Avoid this for the LOGO code path that is typically
> > taken
> > when adapters are shut down.
> > 
> > Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting
> > down chip")
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
> 
> Just to understand it correctly: 45235022da99 ("scsi: qla2xxx: Fix
> driver unload by shutting down chip") is saying FW is not able to
> shutdown propably and therefore we kill it first and still try to do
> some cleanup.

Yes, that's what I observed. Mailbox access hangs in this case, as the
mailbox simply won't execute the command and the expected status change
will not happen.

> Are you sure you got all the necessary places fixed up?
> 
> I tend to say we should add
> 
> 	if (!ha->flags.fw_started)
> 		return QLA_FUNCTION_FAILED;
> 
> do qla2x00_mailbox_command() and handle the errors one by one. Just
> an
> idea.

I had that idea too. I even tried it out. IIRC it's not that easy. Some
commands need to be sent to the mailbox even in shut-down state
(MBC_EXECUTE_FIRMWARE, for example?). I admit I didn't dare going down
the path you're suggesting, I thought it had quite a potential to cause
regressions.

Did you mean this as a negative review, or rather an additional
suggestion?

Thanks
Martin



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-02-06 12:50     ` Martin Wilck
@ 2020-02-06 13:03       ` Daniel Wagner
  0 siblings, 0 replies; 24+ messages in thread
From: Daniel Wagner @ 2020-02-06 13:03 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche,
	James Bottomley, linux-scsi

On Thu, Feb 06, 2020 at 01:50:32PM +0100, Martin Wilck wrote:
> On Thu, 2020-02-06 at 13:33 +0100, Daniel Wagner wrote:
> > Are you sure you got all the necessary places fixed up?
> > 
> > I tend to say we should add
> > 
> > 	if (!ha->flags.fw_started)
> > 		return QLA_FUNCTION_FAILED;
> > 
> > do qla2x00_mailbox_command() and handle the errors one by one. Just
> > an
> > idea.
> 
> I had that idea too. I even tried it out. IIRC it's not that easy. Some
> commands need to be sent to the mailbox even in shut-down state
> (MBC_EXECUTE_FIRMWARE, for example?). I admit I didn't dare going down
> the path you're suggesting, I thought it had quite a potential to cause
> regressions.

Yes, this certainly causing regressions. This patch is using the
blacklist approach which is probably introducing less regression.

But I think for finding all the right spots the white listing approach
is better. Maybe only during development phase. Just an idea.

> Did you mean this as a negative review, or rather an additional
> suggestion?

This patch clearly improves things. So yes, I approve.

Reviewed-by: Daniel Wagner <dwagner@suse.de>

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading
  2020-02-05 21:44 [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading mwilck
                   ` (2 preceding siblings ...)
  2020-02-05 21:44 ` [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
@ 2020-03-09 16:44 ` Martin Wilck
  2020-03-11  2:44   ` Martin K. Petersen
  3 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-03-09 16:44 UTC (permalink / raw)
  To: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke
  Cc: Bart Van Assche, Daniel Wagner, James Bottomley, linux-scsi

Hello Martin, Himanshu,

On Wed, 2020-02-05 at 22:44 +0100, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Hello Martin, Himanshu, all,
> 
> here is v2 of the little series I first submitted on Nov 29, 2019.
> 
> Changes wrt v2:
>  - Added patch 3 to set the UNLOADING flag before waiting for
> sessions
>    to end (Roman)
>  - Use test_and_set_bit() for UNLOADING (Hannes)
> 
> Martin Wilck (3):
>   scsi: qla2xxx: avoid sending mailbox commands if firmware is
> stopped
>   scsi: qla2xxx: don't shut down firmware before closing sessions
>   scsi: qla2xxx: set UNLOADING before waiting for session deletion
> 
>  drivers/scsi/qla2xxx/qla_mbx.c |  3 +++
>  drivers/scsi/qla2xxx/qla_os.c  | 39 +++++++++++++++++---------------
> --
>  2 files changed, 22 insertions(+), 20 deletions(-)
> 

what about this series? Will it be merged for 5.7? It got positive
reviews from Daniel, and no negative ones so far, and it still applies
cleanly to 5.7/scsi-queue.

Thanks,
Martin W.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading
  2020-03-09 16:44 ` [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading Martin Wilck
@ 2020-03-11  2:44   ` Martin K. Petersen
  0 siblings, 0 replies; 24+ messages in thread
From: Martin K. Petersen @ 2020-03-11  2:44 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi


Martin,

> what about this series? Will it be merged for 5.7? It got positive
> reviews from Daniel, and no negative ones so far, and it still applies
> cleanly to 5.7/scsi-queue.

Still waiting for feedback from Marvell on these changes (and your other
qla2xxx patch).

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions
  2020-02-05 21:44 ` [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions mwilck
  2020-02-06 12:42   ` Daniel Wagner
@ 2020-03-13 14:55   ` Roman Bolshakov
  2020-03-25  0:36   ` Arun Easi
  2 siblings, 0 replies; 24+ messages in thread
From: Roman Bolshakov @ 2020-03-13 14:55 UTC (permalink / raw)
  To: mwilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Hannes Reinecke, Bart Van Assche, Daniel Wagner, James Bottomley,
	linux-scsi

On Wed, Feb 05, 2020 at 10:44:21PM +0100, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Since 45235022da99, the firmware is shut down early in the controller
> shutdown process. This causes commands sent to the firmware (such as LOGO)
> to hang forever. Eventually one or more timeouts will be triggered.
> Move the stopping of the firmware until after sessions have terminated.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  drivers/scsi/qla2xxx/qla_os.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 

Hi Martin,

Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>

BTW, I think qla_nvme_delete() should be moved up to split the function into
two parts:
 * session/port cleanup
 * chip shutdown

Although invocation of the function after chip shutdown has no functional
implications at the moment.

Best regards,
Roman

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion
  2020-02-05 21:44 ` [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
  2020-02-06 12:46   ` Daniel Wagner
@ 2020-03-13 15:36   ` Roman Bolshakov
  2020-03-25  0:38   ` Arun Easi
  2 siblings, 0 replies; 24+ messages in thread
From: Roman Bolshakov @ 2020-03-13 15:36 UTC (permalink / raw)
  To: mwilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Hannes Reinecke, Bart Van Assche, Daniel Wagner, James Bottomley,
	linux-scsi

On Wed, Feb 05, 2020 at 10:44:22PM +0100, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> The purpose of the UNLOADING flag is to avoid port login procedures
> to continue when a controller is in the process of shutting down.
> It makes sense to set this flag before starting session teardown.
> The only operations that must be able to continue are LOGO, PRLO,
> and the like.
> 
> Furthermore, use atomic test_and_set_bit() to avoid the shutdown
> being run multiple times in parallel. In qla2x00_disable_board_on_pci_error(),
> the test for UNLOADING is postponed until after the check for an already
> disabled PCI board.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  drivers/scsi/qla2xxx/qla_os.c | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 

Hi Martin,

Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>

Thanks,
Roman

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-02-05 21:44 ` [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped mwilck
  2020-02-06 12:33   ` Daniel Wagner
@ 2020-03-24 23:51   ` Arun Easi
  2020-03-25 16:28     ` Martin Wilck
  1 sibling, 1 reply; 24+ messages in thread
From: Arun Easi @ 2020-03-24 23:51 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:

> From: Martin Wilck <mwilck@suse.com>
> 
> Since commit 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip"),
> it is possible that FC commands are scheduled after the adapter firmware
> has been shut down. IO sent to the firmware in this situation hangs
> indefinitely. Avoid this for the LOGO code path that is typically taken
> when adapters are shut down.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
> ---
>  drivers/scsi/qla2xxx/qla_mbx.c | 3 +++
>  drivers/scsi/qla2xxx/qla_os.c  | 3 +++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
> index 9e09964..53129f2 100644
> --- a/drivers/scsi/qla2xxx/qla_mbx.c
> +++ b/drivers/scsi/qla2xxx/qla_mbx.c
> @@ -2644,6 +2644,9 @@ qla24xx_fabric_logout(scsi_qla_host_t *vha, uint16_t loop_id, uint8_t domain,
>  	ql_dbg(ql_dbg_mbx + ql_dbg_verbose, vha, 0x106d,
>  	    "Entered %s.\n", __func__);
>  
> +	if (!ha->flags.fw_started)
> +		return QLA_FUNCTION_FAILED;
> +

Ok.

>  	lg = dma_pool_zalloc(ha->s_dma_pool, GFP_KERNEL, &lg_dma);
>  	if (lg == NULL) {
>  		ql_log(ql_log_warn, vha, 0x106e,
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index b520a98..2dafb46 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -4878,6 +4878,9 @@ qla2x00_post_work(struct scsi_qla_host *vha, struct qla_work_evt *e)
>  	unsigned long flags;
>  	bool q = false;
>  
> +	if (!vha->hw->flags.fw_started)
> +		return QLA_FUNCTION_FAILED;
> +

I'd probably not do it here; rather in qla2x00_get_sp() 
/qla2x00_start_sp()/ qla2xxx_get_qpair_sp() time. Not all works are for 
posting to firmware (QLA_EVT_IDC_ACK, QLA_EVT_UNMAP etc.).

Regards,
-Arun

>  	spin_lock_irqsave(&vha->work_lock, flags);
>  	list_add_tail(&e->list, &vha->work_list);
>  
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions
  2020-02-05 21:44 ` [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions mwilck
  2020-02-06 12:42   ` Daniel Wagner
  2020-03-13 14:55   ` Roman Bolshakov
@ 2020-03-25  0:36   ` Arun Easi
  2020-03-25 15:34     ` Martin Wilck
  2 siblings, 1 reply; 24+ messages in thread
From: Arun Easi @ 2020-03-25  0:36 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Quinn Tran, Roman Bolshakov, Hannes Reinecke,
	Bart Van Assche, Daniel Wagner, James Bottomley, linux-scsi

On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:

> From: Martin Wilck <mwilck@suse.com>
> 
> Since 45235022da99, the firmware is shut down early in the controller
> shutdown process. This causes commands sent to the firmware (such as LOGO)
> to hang forever. Eventually one or more timeouts will be triggered.
> Move the stopping of the firmware until after sessions have terminated.
> 
> Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting down chip")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  drivers/scsi/qla2xxx/qla_os.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index 2dafb46..e81080d 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -3720,6 +3720,16 @@ qla2x00_remove_one(struct pci_dev *pdev)
>  	}
>  	qla2x00_wait_for_hba_ready(base_vha);
>  
> +	qla2x00_wait_for_sess_deletion(base_vha);
> +
> +	/*
> +	 * if UNLOAD flag is already set, then continue unload,
> +	 * where it was set first.
> +	 */
> +	if (test_bit(UNLOADING, &base_vha->dpc_flags))
> +		return;
> +
> +	set_bit(UNLOADING, &base_vha->dpc_flags);
>  	if (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) ||
>  	    IS_QLA28XX(ha)) {
>  		if (ha->flags.fw_started)
> @@ -3736,17 +3746,6 @@ qla2x00_remove_one(struct pci_dev *pdev)
>  		qla2x00_try_to_stop_firmware(base_vha);
>  	}
>  
> -	qla2x00_wait_for_sess_deletion(base_vha);
> -
> -	/*
> -	 * if UNLOAD flag is already set, then continue unload,
> -	 * where it was set first.
> -	 */
> -	if (test_bit(UNLOADING, &base_vha->dpc_flags))
> -		return;
> -
> -	set_bit(UNLOADING, &base_vha->dpc_flags);
> -
>  	qla_nvme_delete(base_vha);
>  
>  	dma_free_coherent(&ha->pdev->dev,
> 

NAK.

The fcport deletion was done after chip reset to minimize interference and 
further action on fcports. We should not be sending out logouts during 
unload (driver just implicitly logs out). If you experience any hangs, 
please let us know.

Regards,
-Arun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion
  2020-02-05 21:44 ` [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
  2020-02-06 12:46   ` Daniel Wagner
  2020-03-13 15:36   ` Roman Bolshakov
@ 2020-03-25  0:38   ` Arun Easi
  2 siblings, 0 replies; 24+ messages in thread
From: Arun Easi @ 2020-03-25  0:38 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:

> From: Martin Wilck <mwilck@suse.com>
> 
> The purpose of the UNLOADING flag is to avoid port login procedures
> to continue when a controller is in the process of shutting down.
> It makes sense to set this flag before starting session teardown.
> The only operations that must be able to continue are LOGO, PRLO,
> and the like.
> 
> Furthermore, use atomic test_and_set_bit() to avoid the shutdown
> being run multiple times in parallel. In qla2x00_disable_board_on_pci_error(),
> the test for UNLOADING is postponed until after the check for an already
> disabled PCI board.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  drivers/scsi/qla2xxx/qla_os.c | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index e81080d..8329f95 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -3720,16 +3720,15 @@ qla2x00_remove_one(struct pci_dev *pdev)
>  	}
>  	qla2x00_wait_for_hba_ready(base_vha);
>  
> -	qla2x00_wait_for_sess_deletion(base_vha);
> -
>  	/*
> -	 * if UNLOAD flag is already set, then continue unload,
> +	 * if UNLOADING flag is already set, then continue unload,
>  	 * where it was set first.
>  	 */
> -	if (test_bit(UNLOADING, &base_vha->dpc_flags))
> +	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
>  		return;
>  
> -	set_bit(UNLOADING, &base_vha->dpc_flags);
> +	qla2x00_wait_for_sess_deletion(base_vha);
> +
>  	if (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) ||
>  	    IS_QLA28XX(ha)) {
>  		if (ha->flags.fw_started)
> @@ -6046,13 +6045,6 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
>  	struct pci_dev *pdev = ha->pdev;
>  	scsi_qla_host_t *base_vha = pci_get_drvdata(ha->pdev);
>  
> -	/*
> -	 * if UNLOAD flag is already set, then continue unload,
> -	 * where it was set first.
> -	 */
> -	if (test_bit(UNLOADING, &base_vha->dpc_flags))
> -		return;
> -
>  	ql_log(ql_log_warn, base_vha, 0x015b,
>  	    "Disabling adapter.\n");
>  
> @@ -6063,9 +6055,14 @@ qla2x00_disable_board_on_pci_error(struct work_struct *work)
>  		return;
>  	}
>  
> -	qla2x00_wait_for_sess_deletion(base_vha);
> +	/*
> +	 * if UNLOADING flag is already set, then continue unload,
> +	 * where it was set first.
> +	 */
> +	if (test_and_set_bit(UNLOADING, &base_vha->dpc_flags))
> +		return;
>  
> -	set_bit(UNLOADING, &base_vha->dpc_flags);
> +	qla2x00_wait_for_sess_deletion(base_vha);
>  
>  	qla2x00_delete_all_vps(ha, base_vha);
>  
> 

This was done on top of PATCH 2/3, please resend dropping 2. The 
test_and_set_bit() part looks ok.

Regards,
-Arun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions
  2020-03-25  0:36   ` Arun Easi
@ 2020-03-25 15:34     ` Martin Wilck
  2020-03-25 21:55       ` Arun Easi
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-03-25 15:34 UTC (permalink / raw)
  To: Arun Easi
  Cc: Martin K. Petersen, Quinn Tran, Roman Bolshakov, Hannes Reinecke,
	Bart Van Assche, Daniel Wagner, James Bottomley, linux-scsi

On Tue, 2020-03-24 at 17:36 -0700, Arun Easi wrote:
> On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:
> 
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > Since 45235022da99, the firmware is shut down early in the
> > controller
> > shutdown process. This causes commands sent to the firmware (such
> > as LOGO)
> > to hang forever. Eventually one or more timeouts will be triggered.
> > Move the stopping of the firmware until after sessions have
> > terminated.
> > 
> > Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting
> > down chip")
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  drivers/scsi/qla2xxx/qla_os.c | 21 ++++++++++-----------
> >  1 file changed, 10 insertions(+), 11 deletions(-)
> > 
> > 

...

> NAK.
> 
> The fcport deletion was done after chip reset to minimize
> interference and 
> further action on fcports. We should not be sending out logouts
> during 
> unload (driver just implicitly logs out). If you experience any
> hangs, 
> please let us know.

What about target mode? AFAIK target ports need to send explicit LOGO.

Regards
Martin



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-03-24 23:51   ` Arun Easi
@ 2020-03-25 16:28     ` Martin Wilck
  2020-03-25 17:20       ` Martin Wilck
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-03-25 16:28 UTC (permalink / raw)
  To: Arun Easi
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Tue, 2020-03-24 at 16:51 -0700, Arun Easi wrote:
> On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:
> 
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > Since commit 45235022da99 ("scsi: qla2xxx: Fix driver unload by
> > shutting down chip"),
> > it is possible that FC commands are scheduled after the adapter
> > firmware
> > has been shut down. IO sent to the firmware in this situation hangs
> > indefinitely. Avoid this for the LOGO code path that is typically
> > taken
> > when adapters are shut down.
> > 
> > Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting
> > down chip")
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
> > ---
> >  drivers/scsi/qla2xxx/qla_mbx.c | 3 +++
> >  drivers/scsi/qla2xxx/qla_os.c  | 3 +++
> >  2 files changed, 6 insertions(+)
> > 
> > 
> > --- a/drivers/scsi/qla2xxx/qla_os.c
> > +++ b/drivers/scsi/qla2xxx/qla_os.c
> > @@ -4878,6 +4878,9 @@ qla2x00_post_work(struct scsi_qla_host *vha,
> > struct qla_work_evt *e)
> >  	unsigned long flags;
> >  	bool q = false;
> >  
> > +	if (!vha->hw->flags.fw_started)
> > +		return QLA_FUNCTION_FAILED;
> > +
> 
> I'd probably not do it here; rather in qla2x00_get_sp() 
> /qla2x00_start_sp()/ qla2xxx_get_qpair_sp() time. Not all works are
> for 
> posting to firmware (QLA_EVT_IDC_ACK, QLA_EVT_UNMAP etc.).

qla81xx_idc_ack() calls qla2x00_mailbox_command(), which should be
avoided IMO. But I'll review the various cases and re-post the patch.

Thanks,
Martin



> 
> Regards,
> -Arun
> 
> >  	spin_lock_irqsave(&vha->work_lock, flags);
> >  	list_add_tail(&e->list, &vha->work_list);
> >  
> > 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-03-25 16:28     ` Martin Wilck
@ 2020-03-25 17:20       ` Martin Wilck
  2020-03-25 18:21         ` Martin Wilck
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-03-25 17:20 UTC (permalink / raw)
  To: Arun Easi
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 2020-03-25 at 17:28 +0100, Martin Wilck wrote:
> On Tue, 2020-03-24 at 16:51 -0700, Arun Easi wrote:
> > On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:
> > 
> > > From: Martin Wilck <mwilck@suse.com>
> > > 
> > > Since commit 45235022da99 ("scsi: qla2xxx: Fix driver unload by
> > > shutting down chip"),
> > > it is possible that FC commands are scheduled after the adapter
> > > firmware
> > > has been shut down. IO sent to the firmware in this situation
> > > hangs
> > > indefinitely. Avoid this for the LOGO code path that is typically
> > > taken
> > > when adapters are shut down.
> > > 
> > > Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by
> > > shutting
> > > down chip")
> > > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > > Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
> > > ---
> > >  drivers/scsi/qla2xxx/qla_mbx.c | 3 +++
> > >  drivers/scsi/qla2xxx/qla_os.c  | 3 +++
> > >  2 files changed, 6 insertions(+)
> > > 
> > > [...]
> > > --- a/drivers/scsi/qla2xxx/qla_os.c
> > > +++ b/drivers/scsi/qla2xxx/qla_os.c
> > > @@ -4878,6 +4878,9 @@ qla2x00_post_work(struct scsi_qla_host
> > > *vha,
> > > struct qla_work_evt *e)
> > >  	unsigned long flags;
> > >  	bool q = false;
> > >  
> > > +	if (!vha->hw->flags.fw_started)
> > > +		return QLA_FUNCTION_FAILED;
> > > +
> > 
> > I'd probably not do it here; rather in qla2x00_get_sp() 
> > /qla2x00_start_sp()/ qla2xxx_get_qpair_sp() time. Not all works are
> > for 
> > posting to firmware (QLA_EVT_IDC_ACK, QLA_EVT_UNMAP etc.).
> 
> qla81xx_idc_ack() calls qla2x00_mailbox_command(), which should be
> avoided IMO. But I'll review the various cases and re-post the patch.

Thinking about it once more, the approach is racy. 

qla2x00_try_stop_firmware() can be called any time, and it sets
fw_started = 0 *after* actually stopping the firmware. Even if we check
fw_started, the firwmare might be stopped between our check and our
actual mailbox / FW register access, and we'd end up hanging.

At least the check should be as close as possible to the actual FW
access, e.g. in qla2x00_mailbox_command(), or before writing to the
request queue registers in qla2x00_start_sp() etc.

Perhaps the (!fw_started) condition should be treated like
ABORT_ISP_ACTIVE in qla2x00_mailbox_command, i.e. execute only if
is_rom_cmd() returns true?

I can re-post, but I feel this should really be done by someone who
knows exactly how the firmware operates, IOW Marvell staff.

Regards
Martin



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-03-25 17:20       ` Martin Wilck
@ 2020-03-25 18:21         ` Martin Wilck
  2020-03-25 22:13           ` Arun Easi
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-03-25 18:21 UTC (permalink / raw)
  To: Arun Easi
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 2020-03-25 at 18:20 +0100, Martin Wilck wrote:
> Perhaps the (!fw_started) condition should be treated like
> ABORT_ISP_ACTIVE in qla2x00_mailbox_command, i.e. execute only if
> is_rom_cmd() returns true?

No, this won't be sufficient, as e.g. MBC_IOCB_COMMAND_A64 is in
rom_cmds[], but has been found to hang (this is why I had the hunk in
qla24xx_fabric_logout()). The list of mailbox commands
that must be passed on when the FW is stopped has to be shorter than
rom_cmds[].

Better suggestions welcome.

Martin



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions
  2020-03-25 15:34     ` Martin Wilck
@ 2020-03-25 21:55       ` Arun Easi
  0 siblings, 0 replies; 24+ messages in thread
From: Arun Easi @ 2020-03-25 21:55 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Quinn Tran, Roman Bolshakov, Hannes Reinecke,
	Bart Van Assche, Daniel Wagner, James Bottomley, linux-scsi

On Wed, 25 Mar 2020, 8:34am, Martin Wilck wrote:

> On Tue, 2020-03-24 at 17:36 -0700, Arun Easi wrote:
> > On Wed, 5 Feb 2020, 1:44pm, mwilck@suse.com wrote:
> > 
> > > From: Martin Wilck <mwilck@suse.com>
> > > 
> > > Since 45235022da99, the firmware is shut down early in the
> > > controller
> > > shutdown process. This causes commands sent to the firmware (such
> > > as LOGO)
> > > to hang forever. Eventually one or more timeouts will be triggered.
> > > Move the stopping of the firmware until after sessions have
> > > terminated.
> > > 
> > > Fixes: 45235022da99 ("scsi: qla2xxx: Fix driver unload by shutting
> > > down chip")
> > > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > > ---
> > >  drivers/scsi/qla2xxx/qla_os.c | 21 ++++++++++-----------
> > >  1 file changed, 10 insertions(+), 11 deletions(-)
> > > 
> > > 
> 
> ...
> 
> > NAK.
> > 
> > The fcport deletion was done after chip reset to minimize interference 
> > and further action on fcports. We should not be sending out logouts 
> > during unload (driver just implicitly logs out). If you experience any 
> > hangs, please let us know.
> 
> What about target mode? AFAIK target ports need to send explicit LOGO.
> 

I do not see a hard requirement cited in spec, correct me if you see 
otherwise. Other initiators should see a RSCN and see this device has 
gone away.

Regards,
-Arun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-03-25 18:21         ` Martin Wilck
@ 2020-03-25 22:13           ` Arun Easi
  2020-03-25 22:41             ` Martin Wilck
  0 siblings, 1 reply; 24+ messages in thread
From: Arun Easi @ 2020-03-25 22:13 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 25 Mar 2020, 11:21am, Martin Wilck wrote:

> On Wed, 2020-03-25 at 18:20 +0100, Martin Wilck wrote:
> > Perhaps the (!fw_started) condition should be treated like
> > ABORT_ISP_ACTIVE in qla2x00_mailbox_command, i.e. execute only if
> > is_rom_cmd() returns true?
> 
> No, this won't be sufficient, as e.g. MBC_IOCB_COMMAND_A64 is in
> rom_cmds[], but has been found to hang (this is why I had the hunk in
> qla24xx_fabric_logout()). The list of mailbox commands
> that must be passed on when the FW is stopped has to be shorter than
> rom_cmds[].
> 

With your patch 2/3, where UNLOADING is set prior to the reset of chip, in 
place this issue should be largely addressed. Basically, paths that send 
out request to wire check UNLOADING bit (if any path is missing that, 
should add the check) before sending it out.

Now with UNLOADING set (with your patch 2/3), chip reset, and all 
outstanding command's completion called (qla2x00_abort_isp_cleanup) I see 
less chance of anything being sent out. If you see any issue with your 
patches 1 & 2 (addressing my comments) applied, let me know and we can 
tackle then. How about that?

Regards,
-Arun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-03-25 22:13           ` Arun Easi
@ 2020-03-25 22:41             ` Martin Wilck
  2020-03-25 23:35               ` [EXT] " Arun Easi
  0 siblings, 1 reply; 24+ messages in thread
From: Martin Wilck @ 2020-03-25 22:41 UTC (permalink / raw)
  To: Arun Easi
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 2020-03-25 at 15:13 -0700, Arun Easi wrote:
> On Wed, 25 Mar 2020, 11:21am, Martin Wilck wrote:
> 
> > On Wed, 2020-03-25 at 18:20 +0100, Martin Wilck wrote:
> > > Perhaps the (!fw_started) condition should be treated like
> > > ABORT_ISP_ACTIVE in qla2x00_mailbox_command, i.e. execute only if
> > > is_rom_cmd() returns true?
> > 
> > No, this won't be sufficient, as e.g. MBC_IOCB_COMMAND_A64 is in
> > rom_cmds[], but has been found to hang (this is why I had the hunk
> > in
> > qla24xx_fabric_logout()). The list of mailbox commands
> > that must be passed on when the FW is stopped has to be shorter
> > than
> > rom_cmds[].
> > 
> 
> With your patch 2/3, where UNLOADING is set prior to the reset of
> chip, in 
> place this issue should be largely addressed. Basically, paths that
> send 
> out request to wire check UNLOADING bit (if any path is missing
> that, 
> should add the check) before sending it out.
> 
> Now with UNLOADING set (with your patch 2/3), chip reset, and all 
> outstanding command's completion called (qla2x00_abort_isp_cleanup) I
> see 
> less chance of anything being sent out. If you see any issue with
> your 
> patches 1 & 2 (addressing my comments) applied, let me know and we
> can 
> tackle then. How about that?

It sounds like a plan. Although it means that I just wasted time trying
to figure out which mailbox commands need to be processed even if the
firmware is down :-)

Thanks,
Martin



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [EXT] Re: [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped
  2020-03-25 22:41             ` Martin Wilck
@ 2020-03-25 23:35               ` Arun Easi
  0 siblings, 0 replies; 24+ messages in thread
From: Arun Easi @ 2020-03-25 23:35 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Himanshu Madhani, Quinn Tran,
	Roman Bolshakov, Hannes Reinecke, Bart Van Assche, Daniel Wagner,
	James Bottomley, linux-scsi

On Wed, 25 Mar 2020, 3:41pm, Martin Wilck wrote:

>
> ----------------------------------------------------------------------
> On Wed, 2020-03-25 at 15:13 -0700, Arun Easi wrote:
> > On Wed, 25 Mar 2020, 11:21am, Martin Wilck wrote:
> > 
> > > On Wed, 2020-03-25 at 18:20 +0100, Martin Wilck wrote:
> > > > Perhaps the (!fw_started) condition should be treated like
> > > > ABORT_ISP_ACTIVE in qla2x00_mailbox_command, i.e. execute only if
> > > > is_rom_cmd() returns true?
> > > 
> > > No, this won't be sufficient, as e.g. MBC_IOCB_COMMAND_A64 is in
> > > rom_cmds[], but has been found to hang (this is why I had the hunk
> > > in
> > > qla24xx_fabric_logout()). The list of mailbox commands
> > > that must be passed on when the FW is stopped has to be shorter
> > > than
> > > rom_cmds[].
> > > 
> > 
> > With your patch 2/3, where UNLOADING is set prior to the reset of 
> > chip, in place this issue should be largely addressed. Basically, 
> > paths that send out request to wire check UNLOADING bit (if any path 
> > is missing that, should add the check) before sending it out.
> > 
> > Now with UNLOADING set (with your patch 2/3), chip reset, and all 
> > outstanding command's completion called (qla2x00_abort_isp_cleanup) I 
> > see less chance of anything being sent out. If you see any issue with 
> > your patches 1 & 2 (addressing my comments) applied, let me know and 
> > we can tackle then. How about that?
> 
> It sounds like a plan. Although it means that I just wasted time trying
> to figure out which mailbox commands need to be processed even if the
> firmware is down :-)
> 

I hope it was not too much time. On the plus side, you know the mailbox 
path very well now. :)

Regards,
-Arun

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-03-25 23:35 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-05 21:44 [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading mwilck
2020-02-05 21:44 ` [PATCH v2 1/3] scsi: qla2xxx: avoid sending mailbox commands if firmware is stopped mwilck
2020-02-06 12:33   ` Daniel Wagner
2020-02-06 12:50     ` Martin Wilck
2020-02-06 13:03       ` Daniel Wagner
2020-03-24 23:51   ` Arun Easi
2020-03-25 16:28     ` Martin Wilck
2020-03-25 17:20       ` Martin Wilck
2020-03-25 18:21         ` Martin Wilck
2020-03-25 22:13           ` Arun Easi
2020-03-25 22:41             ` Martin Wilck
2020-03-25 23:35               ` [EXT] " Arun Easi
2020-02-05 21:44 ` [PATCH v2 2/3] scsi: qla2xxx: don't shut down firmware before closing sessions mwilck
2020-02-06 12:42   ` Daniel Wagner
2020-03-13 14:55   ` Roman Bolshakov
2020-03-25  0:36   ` Arun Easi
2020-03-25 15:34     ` Martin Wilck
2020-03-25 21:55       ` Arun Easi
2020-02-05 21:44 ` [PATCH v2 3/3] scsi: qla2xxx: set UNLOADING before waiting for session deletion mwilck
2020-02-06 12:46   ` Daniel Wagner
2020-03-13 15:36   ` Roman Bolshakov
2020-03-25  0:38   ` Arun Easi
2020-03-09 16:44 ` [PATCH v2 0/3] scsi: qla2xxx: fixes for driver unloading Martin Wilck
2020-03-11  2:44   ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.