stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/4] scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd
       [not found] <20190912003919.8488-1-r.bolshakov@yadro.com>
@ 2019-09-12  0:39 ` Roman Bolshakov
  2019-09-12  0:39 ` [PATCH 2/4] scsi: qla2xxx: Initialize free_work before flushing it Roman Bolshakov
  2019-09-12  0:39 ` [PATCH 4/4] scsi: qla2xxx: Change discovery state before PLOGI Roman Bolshakov
  2 siblings, 0 replies; 3+ messages in thread
From: Roman Bolshakov @ 2019-09-12  0:39 UTC (permalink / raw)
  To: linux-scsi
  Cc: Roman Bolshakov, Quinn Tran, Himanshu Madhani, Bart Van Assche, stable

If ABTS cannot be completed in target mode, the driver attempts to free
related management command and crashes:

  NIP [d000000019181ee8] tcm_qla2xxx_free_mcmd+0x40/0x80 [tcm_qla2xxx]
  LR [d00000001dc1e6f8] qlt_response_pkt+0x190/0xa10 [qla2xxx]
  Call Trace:
  [c000003fff27bb50] [c000003fff27bc10] 0xc000003fff27bc10 (unreliable)
  [c000003fff27bb70] [d00000001dc1e6f8] qlt_response_pkt+0x190/0xa10 [qla2xxx]
  [c000003fff27bc10] [d00000001dbc2be0] qla24xx_process_response_queue+0x5d8/0xbd0 [qla2xxx]
  [c000003fff27bd50] [d00000001dbc632c] qla24xx_msix_rsp_q+0x64/0x150 [qla2xxx]
  [c000003fff27bde0] [c000000000187200] __handle_irq_event_percpu+0x90/0x310
  [c000003fff27bea0] [c0000000001874b8] handle_irq_event_percpu+0x38/0x90
  [c000003fff27bee0] [c000000000187574] handle_irq_event+0x64/0xb0
  [c000003fff27bf10] [c00000000018cd38] handle_fasteoi_irq+0xe8/0x280
  [c000003fff27bf40] [c000000000185ccc] generic_handle_irq+0x4c/0x70
  [c000003fff27bf60] [c000000000016cec] __do_irq+0x7c/0x1d0
  [c000003fff27bf90] [c00000000002a530] call_do_irq+0x14/0x24
  [c00000207d2cba90] [c000000000016edc] do_IRQ+0x9c/0x130
  [c00000207d2cbae0] [c000000000008bf4] hardware_interrupt_common+0x114/0x120
  --- interrupt: 501 at arch_local_irq_restore+0x74/0x90
      LR = arch_local_irq_restore+0x74/0x90
  [c00000207d2cbdd0] [c0000000001c64fc] tick_broadcast_oneshot_control+0x4c/0x60 (unreliable)
  [c00000207d2cbdf0] [c0000000007ac840] cpuidle_enter_state+0xf0/0x450
  [c00000207d2cbe50] [c00000000016b81c] call_cpuidle+0x4c/0x90
  [c00000207d2cbe70] [c00000000016bc30] do_idle+0x2b0/0x330
  [c00000207d2cbec0] [c00000000016beec] cpu_startup_entry+0x3c/0x50
  [c00000207d2cbef0] [c00000000004a06c] start_secondary+0x63c/0x670
  [c00000207d2cbf90] [c00000000000aa6c] start_secondary_prolog+0x10/0x14

The crash can be triggered by ACL deletion when there's active I/O.

During ACL deletion, qla2xxx performs implicit LOGO that's invisible for
the initiator. Only the driver and firmware are aware of the logout.
Therefore the initiator continues to send SCSI commands and the target
always responds with SAM STATUS BUSY as it can't find the session.

The command times out after a while and initiator invokes ABORT TASK TMF
for the command. The TMF is mapped to ABTS-LS in FCP. The target can't
find session for S_ID originating ABTS-LS so it never allocates mcmd.
And since N_Port handle was deleted after LOGO, it is no longer valid
and ABTS Response IOCB is returned from firmware with status 31. Then
free_mcmd is invoked on NULL pointer and the kernel crashes.

[ 7734.578642] qla2xxx [0000:00:0c.0]-e837:6: ABTS_RECV_24XX: instance 0
[ 7734.578644] qla2xxx [0000:00:0c.0]-f811:6: qla_target(0): task abort (s_id=1:2:0, tag=1209504, param=0)
[ 7734.578645] find_sess_by_s_id: 0x010200
[ 7734.578645] Unable to locate s_id: 0x010200
[ 7734.578646] qla2xxx [0000:00:0c.0]-f812:6: qla_target(0): task abort for non-existent session
[ 7734.578648] qla2xxx [0000:00:0c.0]-e806:6: Sending task mgmt ABTS response (ha=c0000000d5819000, atio=c0000000d3fd4700, status=4
[ 7734.578730] qla2xxx [0000:00:0c.0]-e838:6: ABTS_RESP_24XX: compl_status 31
[ 7734.578732] qla2xxx [0000:00:0c.0]-e863:6: qla_target(0): ABTS_RESP_24XX failed 31 (subcode 19:a)
[ 7734.578740] Unable to handle kernel paging request for data at address 0x00000200

Fixes: 6b0431d6fa20b ("scsi: qla2xxx: Fix out of order Termination and ABTS response")
Cc: Quinn Tran <qtran@marvell.com>
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: stable@vger.kernel.org
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
---
 drivers/scsi/qla2xxx/tcm_qla2xxx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 042a24314edc..bab2073c1f72 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -246,6 +246,8 @@ static void tcm_qla2xxx_complete_mcmd(struct work_struct *work)
  */
 static void tcm_qla2xxx_free_mcmd(struct qla_tgt_mgmt_cmd *mcmd)
 {
+	if (!mcmd)
+		return;
 	INIT_WORK(&mcmd->free_work, tcm_qla2xxx_complete_mcmd);
 	queue_work(tcm_qla2xxx_free_wq, &mcmd->free_work);
 }
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/4] scsi: qla2xxx: Initialize free_work before flushing it
       [not found] <20190912003919.8488-1-r.bolshakov@yadro.com>
  2019-09-12  0:39 ` [PATCH 1/4] scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd Roman Bolshakov
@ 2019-09-12  0:39 ` Roman Bolshakov
  2019-09-12  0:39 ` [PATCH 4/4] scsi: qla2xxx: Change discovery state before PLOGI Roman Bolshakov
  2 siblings, 0 replies; 3+ messages in thread
From: Roman Bolshakov @ 2019-09-12  0:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Roman Bolshakov, Quinn Tran, Himanshu Madhani, stable

Target creation triggers a new BUG_ON introduced in in 4d43d395fed12
("workqueue: Try to catch flush_work() without INIT_WORK().").
The BUG_ON reveals an attempt to flush free_work in qla24xx_do_nack_work
before it's initialized in qlt_unreg_sess:

  WARNING: CPU: 7 PID: 211 at kernel/workqueue.c:3031 __flush_work.isra.38+0x40/0x2e0
  CPU: 7 PID: 211 Comm: kworker/7:1 Kdump: loaded Tainted: G            E     5.3.0-rc7-vanilla+ #2
  Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
  NIP:  c000000000159620 LR: c0080000009d91b0 CTR: c0000000001598c0
  REGS: c000000005f3f730 TRAP: 0700   Tainted: G            E      (5.3.0-rc7-vanilla+)
  MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002222  XER: 00000000
  CFAR: c0000000001598d0 IRQMASK: 0
  GPR00: c0080000009d91b0 c000000005f3f9c0 c000000001670a00 c0000003f8655ca8
  GPR04: c0000003f8655c00 000000000000ffff 0000000000000011 ffffffffffffffff
  GPR08: c008000000949228 0000000000000000 0000000000000001 c0080000009e7780
  GPR12: 0000000000002200 c00000003fff6200 c000000000161bc8 0000000000000004
  GPR16: c0000003f9d68280 0000000002000000 0000000000000005 0000000000000003
  GPR20: 0000000000000002 000000000000ffff 0000000000000000 fffffffffffffef7
  GPR24: c000000004f73848 c000000004f73838 c000000004f73f28 c000000005f3fb60
  GPR28: c000000004f73e48 c000000004f73c80 c000000004f73818 c0000003f9d68280
  NIP [c000000000159620] __flush_work.isra.38+0x40/0x2e0
  LR [c0080000009d91b0] qla24xx_do_nack_work+0x88/0x180 [qla2xxx]
  Call Trace:
  [c000000005f3f9c0] [c000000000159644] __flush_work.isra.38+0x64/0x2e0 (unreliable)
  [c000000005f3fa50] [c0080000009d91a0] qla24xx_do_nack_work+0x78/0x180 [qla2xxx]
  [c000000005f3fae0] [c0080000009496ec] qla2x00_do_work+0x604/0xb90 [qla2xxx]
  [c000000005f3fc40] [c008000000949cd8] qla2x00_iocb_work_fn+0x60/0xe0 [qla2xxx]
  [c000000005f3fc80] [c000000000157bb8] process_one_work+0x2c8/0x5b0
  [c000000005f3fd10] [c000000000157f28] worker_thread+0x88/0x660
  [c000000005f3fdb0] [c000000000161d64] kthread+0x1a4/0x1b0
  [c000000005f3fe20] [c00000000000b960] ret_from_kernel_thread+0x5c/0x7c
  Instruction dump:
  3d22001d 892966b1 7d908026 91810008 f821ff71 69290001 0b090000 2e290000
  40920200 e9230018 7d2a0074 794ad182 <0b0a0000> 2fa90000 419e01e8 7c0802a6
  ---[ end trace 5ccf335d4f90fcb8 ]---

Fixes: 1021f0bc2f3d6 ("scsi: qla2xxx: allow session delete to finish before create.")
Cc: Quinn Tran <qtran@marvell.com>
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: stable@vger.kernel.org
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
---
 drivers/scsi/qla2xxx/qla_init.c   | 1 +
 drivers/scsi/qla2xxx/qla_target.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index cb08004ef3f8..c8d89912d044 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -4803,6 +4803,7 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vha, gfp_t flags)
 	}
 
 	INIT_WORK(&fcport->del_work, qla24xx_delete_sess_fn);
+	INIT_WORK(&fcport->free_work, qlt_free_session_done);
 	INIT_WORK(&fcport->reg_work, qla_register_fcport_fn);
 	INIT_LIST_HEAD(&fcport->gnl_entry);
 	INIT_LIST_HEAD(&fcport->list);
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index 0ffda6171614..bd1d30524de5 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1160,7 +1160,6 @@ void qlt_unreg_sess(struct fc_port *sess)
 	sess->last_rscn_gen = sess->rscn_gen;
 	sess->last_login_gen = sess->login_gen;
 
-	INIT_WORK(&sess->free_work, qlt_free_session_done);
 	schedule_work(&sess->free_work);
 }
 EXPORT_SYMBOL(qlt_unreg_sess);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 4/4] scsi: qla2xxx: Change discovery state before PLOGI
       [not found] <20190912003919.8488-1-r.bolshakov@yadro.com>
  2019-09-12  0:39 ` [PATCH 1/4] scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd Roman Bolshakov
  2019-09-12  0:39 ` [PATCH 2/4] scsi: qla2xxx: Initialize free_work before flushing it Roman Bolshakov
@ 2019-09-12  0:39 ` Roman Bolshakov
  2 siblings, 0 replies; 3+ messages in thread
From: Roman Bolshakov @ 2019-09-12  0:39 UTC (permalink / raw)
  To: linux-scsi; +Cc: Roman Bolshakov, Quinn Tran, Himanshu Madhani, stable

When a port sends PLOGI, discovery state should be changed to login
pending, otherwise RELOGIN_NEEDED bit is set in
qla24xx_handle_plogi_done_event(). RELOGIN_NEEDED triggers another
PLOGI, and it never goes out of the loop until login timer expires.

Fixes: 8777e4314d397 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine")
Fixes: 8b5292bcfcacf ("scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag")
Cc: Quinn Tran <qtran@marvell.com>
Cc: Himanshu Madhani <hmadhani@marvell.com>
Cc: stable@vger.kernel.org
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
---
 drivers/scsi/qla2xxx/qla_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index c8d89912d044..e4857ef0e5c4 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -516,6 +516,7 @@ static int qla_post_els_plogi_work(struct scsi_qla_host *vha, fc_port_t *fcport)
 
 	e->u.fcport.fcport = fcport;
 	fcport->flags |= FCF_ASYNC_ACTIVE;
+	fcport->disc_state = DSC_LOGIN_PEND;
 	return qla2x00_post_work(vha, e);
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-09-12  0:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190912003919.8488-1-r.bolshakov@yadro.com>
2019-09-12  0:39 ` [PATCH 1/4] scsi: qla2xxx: Ignore NULL pointer in tcm_qla2xxx_free_mcmd Roman Bolshakov
2019-09-12  0:39 ` [PATCH 2/4] scsi: qla2xxx: Initialize free_work before flushing it Roman Bolshakov
2019-09-12  0:39 ` [PATCH 4/4] scsi: qla2xxx: Change discovery state before PLOGI Roman Bolshakov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).