All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/13] scsi: libicsi and qedi tmf fixes
@ 2021-04-13 23:06 Mike Christie
  2021-04-13 23:06 ` [PATCH 01/13] scsi: iscsi: add task completion helper Mike Christie
                   ` (12 more replies)
  0 siblings, 13 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

The following patches can apply to Linus's tree or Martin's 5.13 staging
branch.

The patches fix tmf bugs in the qedi driver and libiscsi issues that are
common to all offload drivers like qedi.

I've dropped the RFC tag becuase Manish has reviewed the qedi bits and
I've tested with drivers like ib_iser that work slightly different from
qedi.

V2:
- Dropped patch that reverted the in-kernel conn failure handling. I
posted a full patchset to fix that separately.
- Modfied the tmf vs cmd queueing patch. The RFC patch only supported
qedi and offload drivers. iscsi_tcp/cxgbgi do not need it.
- Added patch for another issue found where cmds can still be queued to the
driver while it does ep_disconnect.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 01/13] scsi: iscsi: add task completion helper
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling Mike Christie
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

This adds a helper to detect if a cmd has completed but not yet
freed.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 include/scsi/libiscsi.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 02f966e9358f..8c6d358a8abc 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -145,6 +145,13 @@ static inline void* iscsi_next_hdr(struct iscsi_task *task)
 	return (void*)task->hdr + task->hdr_len;
 }
 
+static inline bool iscsi_task_is_completed(struct iscsi_task *task)
+{
+	return task->state == ISCSI_TASK_COMPLETED ||
+	       task->state == ISCSI_TASK_ABRT_TMF ||
+	       task->state == ISCSI_TASK_ABRT_SESS_RECOV;
+}
+
 /* Connection's states */
 enum {
 	ISCSI_CONN_INITIAL_STAGE,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
  2021-04-13 23:06 ` [PATCH 01/13] scsi: iscsi: add task completion helper Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-14  4:17     ` kernel test robot
  2021-04-14  4:17     ` kernel test robot
  2021-04-13 23:06 ` [PATCH 03/13] scsi: iscsi: prevent cmds from queueing to driver during ep_disconnect Mike Christie
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we are handling a sg io reset there is a small window where cmds can
sneak into iscsi_queuecommand even though tmf_in_progress is set. This
seems to not be a problem for normal drivers because they know exactly
what was sent to the fw. For libiscsi this is a problem because it knows
what it sent to the driver but not what the driver sent to the fw.
When we go to cleanup then libiscsi might cleanup the sneaky cmd but
the driver/fw may not have seen the sneaky cmd and it's left running
in fw.

This has libiscsi just stop queueing cmds when it sends the tmf down
to the driver. Both then know they see the same set of cmds.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 104 +++++++++++++++++++++++++++-------------
 1 file changed, 72 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 04633e5157e9..3ff440d37a36 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1669,29 +1669,11 @@ enum {
 	FAILURE_SESSION_NOT_READY,
 };
 
-int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
+static bool iscsi_eh_running(struct iscsi_conn *conn, struct scsi_cmnd *sc,
+			     int *reason)
 {
-	struct iscsi_cls_session *cls_session;
-	struct iscsi_host *ihost;
-	int reason = 0;
-	struct iscsi_session *session;
-	struct iscsi_conn *conn;
-	struct iscsi_task *task = NULL;
-
-	sc->result = 0;
-	sc->SCp.ptr = NULL;
-
-	ihost = shost_priv(host);
-
-	cls_session = starget_to_session(scsi_target(sc->device));
-	session = cls_session->dd_data;
-	spin_lock_bh(&session->frwd_lock);
-
-	reason = iscsi_session_chkready(cls_session);
-	if (reason) {
-		sc->result = reason;
-		goto fault;
-	}
+	struct iscsi_session *session = conn->session;
+	struct iscsi_tm *tmf;
 
 	if (session->state != ISCSI_STATE_LOGGED_IN) {
 		/*
@@ -1707,31 +1689,92 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
 			 * state is bad, allowing completion to happen
 			 */
 			if (unlikely(system_state != SYSTEM_RUNNING)) {
-				reason = FAILURE_SESSION_FAILED;
+				*reason = FAILURE_SESSION_FAILED;
 				sc->result = DID_NO_CONNECT << 16;
 				break;
 			}
 			fallthrough;
 		case ISCSI_STATE_IN_RECOVERY:
-			reason = FAILURE_SESSION_IN_RECOVERY;
+			*reason = FAILURE_SESSION_IN_RECOVERY;
 			sc->result = DID_IMM_RETRY << 16;
 			break;
 		case ISCSI_STATE_LOGGING_OUT:
-			reason = FAILURE_SESSION_LOGGING_OUT;
+			*reason = FAILURE_SESSION_LOGGING_OUT;
 			sc->result = DID_IMM_RETRY << 16;
 			break;
 		case ISCSI_STATE_RECOVERY_FAILED:
-			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
+			*reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
 			sc->result = DID_TRANSPORT_FAILFAST << 16;
 			break;
 		case ISCSI_STATE_TERMINATE:
-			reason = FAILURE_SESSION_TERMINATE;
+			*reason = FAILURE_SESSION_TERMINATE;
 			sc->result = DID_NO_CONNECT << 16;
 			break;
 		default:
-			reason = FAILURE_SESSION_FREED;
+			*reason = FAILURE_SESSION_FREED;
 			sc->result = DID_NO_CONNECT << 16;
 		}
+		goto eh_running;
+	}
+
+	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
+		*reason = FAILURE_SESSION_IN_RECOVERY;
+		sc->result = DID_REQUEUE << 16;
+		goto eh_running;
+	}
+
+	/*
+	 * For sg resets make sure libiscsi, the drivers and their fw see the
+	 * same cmds. Once we get a TMF that can affect multiple drivers stop
+	 * queueing.
+	 */
+	if (conn->tmf_state != TMF_INITIAL) {
+		tmf = &conn->tmhdr;
+
+		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
+		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
+			if (sc->device->lun != scsilun_to_int(&tmf->lun))
+				return 0;
+
+			ISCSI_DBG_EH(conn->session,
+				     "Requeue cmd sent during LU RESET processing.\n");
+			sc->result = DID_REQUEUE << 16;
+			goto eh_running;
+		case ISCSI_TM_FUNC_TARGET_WARM_RESET:
+			ISCSI_DBG_EH(conn->session,
+				     "Requeue cmd sent during TARGET RESET processing.\n");
+			sc->result = DID_REQUEUE << 16;
+			goto eh_running;
+		}
+	}
+
+	return false;
+
+eh_running:
+	return true;
+}
+
+int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
+{
+	struct iscsi_cls_session *cls_session;
+	struct iscsi_host *ihost;
+	int reason = 0;
+	struct iscsi_session *session;
+	struct iscsi_conn *conn;
+	struct iscsi_task *task = NULL;
+
+	sc->result = 0;
+	sc->SCp.ptr = NULL;
+
+	ihost = shost_priv(host);
+
+	cls_session = starget_to_session(scsi_target(sc->device));
+	session = cls_session->dd_data;
+	spin_lock_bh(&session->frwd_lock);
+
+	reason = iscsi_session_chkready(cls_session);
+	if (reason) {
+		sc->result = reason;
 		goto fault;
 	}
 
@@ -1742,11 +1785,8 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
 		goto fault;
 	}
 
-	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
-		reason = FAILURE_SESSION_IN_RECOVERY;
-		sc->result = DID_REQUEUE << 16;
+	if (iscsi_eh_running(conn, sc, &reason))
 		goto fault;
-	}
 
 	if (iscsi_check_cmdsn_window_closed(conn)) {
 		reason = FAILURE_WINDOW_CLOSED;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 03/13] scsi: iscsi: prevent cmds from queueing to driver during ep_disconnect
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
  2021-04-13 23:06 ` [PATCH 01/13] scsi: iscsi: add task completion helper Mike Christie
  2021-04-13 23:06 ` [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 04/13] scsi: qedi: fix null ref during abort handling Mike Christie
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

When we added iser and all thes offload drivers I goofed and didn't add
a unbind_conn nl cmd to undo the bind_conn. So during ep_disconnect we
have been doing iscsi_suspend_tx/queue to stop new IO but depending on
the driver we can still get IO from:

1. __iscsi_conn_send_pdu for TMFs and nops if we haven't called
iscsi_conn_failure before ep_disconnect.
2. Userspace did a ep_disconnect during shutdown before we saw a
logout command and userspace didn't do an session unbind event.

This patch fixes the issue by adding a helper which drivers implementing
ep_disconnect can use to make sure new IO is not queued to them after
calling it and until we do a new conn_bind.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c |  2 +
 drivers/scsi/be2iscsi/be_iscsi.c         |  2 +-
 drivers/scsi/bnx2i/bnx2i_iscsi.c         |  2 +-
 drivers/scsi/cxgbi/libcxgbi.c            |  3 +-
 drivers/scsi/libiscsi.c                  | 56 +++++++++++++++++++++---
 drivers/scsi/qedi/qedi_iscsi.c           |  2 +-
 drivers/scsi/qla4xxx/ql4_def.h           |  1 +
 drivers/scsi/qla4xxx/ql4_os.c            |  2 +
 include/scsi/libiscsi.h                  |  1 +
 9 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 8fcaa1136f2c..3089502116a5 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -899,6 +899,8 @@ iscsi_iser_ep_disconnect(struct iscsi_endpoint *ep)
 
 	iser_info("ep %p iser conn %p\n", ep, iser_conn);
 
+	iscsi_ep_prep_disconnect(iser_conn->iscsi_conn);
+
 	mutex_lock(&iser_conn->state_mutex);
 	iser_conn_terminate(iser_conn);
 
diff --git a/drivers/scsi/be2iscsi/be_iscsi.c b/drivers/scsi/be2iscsi/be_iscsi.c
index a13c203ef7a9..a3b5f7f6ccc8 100644
--- a/drivers/scsi/be2iscsi/be_iscsi.c
+++ b/drivers/scsi/be2iscsi/be_iscsi.c
@@ -1314,7 +1314,7 @@ void beiscsi_ep_disconnect(struct iscsi_endpoint *ep)
 
 	if (beiscsi_ep->conn) {
 		beiscsi_conn = beiscsi_ep->conn;
-		iscsi_suspend_queue(beiscsi_conn->conn);
+		iscsi_ep_prep_disconnect(beiscsi_conn->conn);
 	}
 
 	if (!beiscsi_hba_is_online(phba)) {
diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index 1e6d8f62ea3c..5cbca9657c35 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -2129,7 +2129,7 @@ static void bnx2i_ep_disconnect(struct iscsi_endpoint *ep)
 	if (bnx2i_ep->conn) {
 		bnx2i_conn = bnx2i_ep->conn;
 		conn = bnx2i_conn->cls_conn->dd_data;
-		iscsi_suspend_queue(conn);
+		iscsi_ep_prep_disconnect(conn);
 	}
 	hba = bnx2i_ep->hba;
 
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index f078b3c4e083..dd7b41a092f1 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -2968,7 +2968,8 @@ void cxgbi_ep_disconnect(struct iscsi_endpoint *ep)
 		ep, cep, cconn, csk, csk->state, csk->flags);
 
 	if (cconn && cconn->iconn) {
-		iscsi_suspend_tx(cconn->iconn);
+		iscsi_ep_prep_disconnect(cconn->iconn);
+
 		write_lock_bh(&csk->callback_lock);
 		cep->csk->user_data = NULL;
 		cconn->cep = NULL;
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 3ff440d37a36..3a37628e4024 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1387,22 +1387,28 @@ void iscsi_session_failure(struct iscsi_session *session,
 }
 EXPORT_SYMBOL_GPL(iscsi_session_failure);
 
-void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
+static void iscsi_set_conn_failed(struct iscsi_conn *conn)
 {
 	struct iscsi_session *session = conn->session;
 
-	spin_lock_bh(&session->frwd_lock);
-	if (session->state == ISCSI_STATE_FAILED) {
-		spin_unlock_bh(&session->frwd_lock);
+	if (session->state == ISCSI_STATE_FAILED)
 		return;
-	}
 
 	if (conn->stop_stage == 0)
 		session->state = ISCSI_STATE_FAILED;
-	spin_unlock_bh(&session->frwd_lock);
 
 	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx);
 	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx);
+}
+
+void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
+{
+	struct iscsi_session *session = conn->session;
+
+	spin_lock_bh(&session->frwd_lock);
+	iscsi_set_conn_failed(conn);
+	spin_unlock_bh(&session->frwd_lock);
+
 	iscsi_conn_error_event(conn->cls_conn, err);
 }
 EXPORT_SYMBOL_GPL(iscsi_conn_failure);
@@ -2220,6 +2226,44 @@ static void iscsi_check_transport_timeouts(struct timer_list *t)
 	spin_unlock(&session->frwd_lock);
 }
 
+/*
+ * iscsi_ep_prep_disconnect - prepare the conn for a ep disconnect
+ * @conn: iscsi conn ep is bound to.
+ *
+ * This must be called by drivers implementing the ep_disconnect callout.
+ */
+void iscsi_ep_prep_disconnect(struct iscsi_conn *conn)
+{
+	struct iscsi_session *session;
+
+	/* Check if bound for the driver */
+	if (!conn)
+		return;
+
+	session = conn->session;
+	/*
+	 * Wait for iscsi_eh calls to exit. We don't wait for the tmf to
+	 * complete or timeout. The caller just wants to know what's running
+	 * is everything that needs to be cleaned up, and no cmds will be
+	 * queued.
+	 */
+	mutex_lock(&session->eh_mutex);
+	spin_lock_bh(&session->frwd_lock);
+
+	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx);
+	/*
+	 * if logout timed out before userspace could even send a PDU the
+	 * state might still be in ISCSI_STATE_LOGGED_IN and allowing new cmds
+	 * and TMFs.
+	 */
+	if (session->state == ISCSI_STATE_LOGGED_IN)
+		iscsi_set_conn_failed(conn);
+
+	spin_unlock_bh(&session->frwd_lock);
+	mutex_unlock(&session->eh_mutex);
+}
+EXPORT_SYMBOL_GPL(iscsi_ep_prep_disconnect);
+
 static void iscsi_prep_abort_task_pdu(struct iscsi_task *task,
 				      struct iscsi_tm *hdr)
 {
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 08c05403cd72..8ed1852627e7 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -1008,7 +1008,7 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 	if (qedi_ep->conn) {
 		qedi_conn = qedi_ep->conn;
 		conn = qedi_conn->cls_conn->dd_data;
-		iscsi_suspend_queue(conn);
+		iscsi_ep_prep_disconnect(conn);
 		abrt_conn = qedi_conn->abrt_conn;
 
 		while (count--)	{
diff --git a/drivers/scsi/qla4xxx/ql4_def.h b/drivers/scsi/qla4xxx/ql4_def.h
index 031569c496e5..4ca7f4228b4b 100644
--- a/drivers/scsi/qla4xxx/ql4_def.h
+++ b/drivers/scsi/qla4xxx/ql4_def.h
@@ -838,6 +838,7 @@ struct ql4_task_data {
 struct qla_endpoint {
 	struct Scsi_Host *host;
 	struct sockaddr_storage dst_addr;
+	struct iscsi_conn *conn;
 };
 
 struct qla_conn {
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 7bd9a4a04ad5..5977fe403086 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -1772,6 +1772,7 @@ static void qla4xxx_ep_disconnect(struct iscsi_endpoint *ep)
 	ha = to_qla_host(qla_ep->host);
 	DEBUG2(ql4_printk(KERN_INFO, ha, "%s: host: %ld\n", __func__,
 			  ha->host_no));
+	iscsi_ep_prep_disconnect(qla_ep->conn);
 	iscsi_destroy_endpoint(ep);
 }
 
@@ -3234,6 +3235,7 @@ static int qla4xxx_conn_bind(struct iscsi_cls_session *cls_session,
 	conn = cls_conn->dd_data;
 	qla_conn = conn->dd_data;
 	qla_conn->qla_ep = ep->dd_data;
+	qla_conn->qla_ep->conn = conn;
 	return 0;
 }
 
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 8c6d358a8abc..f409f521681d 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -441,6 +441,7 @@ extern int iscsi_conn_get_addr_param(struct sockaddr_storage *addr,
 extern void iscsi_suspend_tx(struct iscsi_conn *conn);
 extern void iscsi_suspend_queue(struct iscsi_conn *conn);
 extern void iscsi_conn_queue_work(struct iscsi_conn *conn);
+extern void iscsi_ep_prep_disconnect(struct iscsi_conn *conn);
 
 #define iscsi_conn_printk(prefix, _c, fmt, a...) \
 	iscsi_cls_conn_printk(prefix, ((struct iscsi_conn *)_c)->cls_conn, \
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 04/13] scsi: qedi: fix null ref during abort handling
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (2 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 03/13] scsi: iscsi: prevent cmds from queueing to driver during ep_disconnect Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race Mike Christie
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If qedi_process_cmd_cleanup_resp finds the cmd it frees the work and sets
list_tmf_work to NULL, so qedi_tmf_work should check if list_tmf_work is
non-NULL when it wants to force cleanup.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 440ddd2309f1..cf57b4e49700 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -1453,7 +1453,7 @@ static void qedi_tmf_work(struct work_struct *work)
 
 ldel_exit:
 	spin_lock_bh(&qedi_conn->tmf_work_lock);
-	if (!qedi_cmd->list_tmf_work) {
+	if (qedi_cmd->list_tmf_work) {
 		list_del_init(&list_work->list);
 		qedi_cmd->list_tmf_work = NULL;
 		kfree(list_work);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (3 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 04/13] scsi: qedi: fix null ref during abort handling Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-14 12:19     ` kernel test robot
  2021-04-15 14:21     ` kernel test robot
  2021-04-13 23:06 ` [PATCH 06/13] scsi: qedi: fix use after free during abort cleanup Mike Christie
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If the scsi cmd completes after qedi_tmf_work calls iscsi_itt_to_task
then the qedi qedi_cmd->task_id could be freed and used for another
cmd. If we then call qedi_iscsi_cleanup_task with that task_id we will
be cleaning up the wrong cmd.

This patch has us wait to release the task_id until the last put has been
done on the iscsi_task. Because libiscsi grabs a ref to the task when
sending the abort, we know that for the non abort timeout case that the
task_id we are referencing is for the cmd that was supposed to be aborted.

The next patch will fix the case where the abort timesout while we are
running qedi_tmf_work.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 13 -------------
 drivers/scsi/qedi/qedi_iscsi.c | 20 +++++++++++++++++---
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index cf57b4e49700..ad4357e4c15d 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -73,7 +73,6 @@ static void qedi_process_logout_resp(struct qedi_ctx *qedi,
 	spin_unlock(&qedi_conn->list_lock);
 
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr, NULL, 0);
 
 	spin_unlock(&session->back_lock);
@@ -138,7 +137,6 @@ static void qedi_process_text_resp(struct qedi_ctx *qedi,
 	spin_unlock(&qedi_conn->list_lock);
 
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr,
 			     qedi_conn->gen_pdu.resp_buf,
@@ -164,13 +162,11 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 	iscsi_block_session(session->cls_session);
 	rval = qedi_cleanup_all_io(qedi, qedi_conn, qedi_cmd->task, true);
 	if (rval) {
-		qedi_clear_task_idx(qedi, qedi_cmd->task_id);
 		iscsi_unblock_session(session->cls_session);
 		goto exit_tmf_resp;
 	}
 
 	iscsi_unblock_session(session->cls_session);
-	qedi_clear_task_idx(qedi, qedi_cmd->task_id);
 
 	spin_lock(&session->back_lock);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
@@ -245,8 +241,6 @@ static void qedi_process_tmf_resp(struct qedi_ctx *qedi,
 		goto unblock_sess;
 	}
 
-	qedi_clear_task_idx(qedi, qedi_cmd->task_id);
-
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
 	kfree(resp_hdr_ptr);
 
@@ -314,7 +308,6 @@ static void qedi_process_login_resp(struct qedi_ctx *qedi,
 		  "Freeing tid=0x%x for cid=0x%x\n",
 		  cmd->task_id, qedi_conn->iscsi_conn_id);
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 }
 
 static void qedi_get_rq_bdq_buf(struct qedi_ctx *qedi,
@@ -468,7 +461,6 @@ static int qedi_process_nopin_mesg(struct qedi_ctx *qedi,
 		}
 
 		spin_unlock(&qedi_conn->list_lock);
-		qedi_clear_task_idx(qedi, cmd->task_id);
 	}
 
 done:
@@ -673,7 +665,6 @@ static void qedi_scsi_completion(struct qedi_ctx *qedi,
 	if (qedi_io_tracing)
 		qedi_trace_io(qedi, task, cmd->task_id, QEDI_IO_TRACE_RSP);
 
-	qedi_clear_task_idx(qedi, cmd->task_id);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)hdr,
 			     conn->data, datalen);
 error:
@@ -730,7 +721,6 @@ static void qedi_process_nopin_local_cmpl(struct qedi_ctx *qedi,
 		  cqe->itid, cmd->task_id);
 
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 
 	spin_lock_bh(&session->back_lock);
 	__iscsi_put_task(task);
@@ -821,8 +811,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 			if (qedi_cmd->state == CLEANUP_WAIT_FAILED)
 				qedi_cmd->state = CLEANUP_RECV;
 
-			qedi_clear_task_idx(qedi_conn->qedi, rtid);
-
 			spin_lock(&qedi_conn->list_lock);
 			if (likely(dbg_cmd->io_cmd_in_list)) {
 				dbg_cmd->io_cmd_in_list = false;
@@ -856,7 +844,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
-		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
 
 	} else {
 		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 8ed1852627e7..9b794afbdead 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -772,7 +772,6 @@ static int qedi_mtask_xmit(struct iscsi_conn *conn, struct iscsi_task *task)
 	}
 
 	cmd->conn = conn->dd_data;
-	cmd->scsi_cmd = NULL;
 	return qedi_iscsi_send_generic_request(task);
 }
 
@@ -783,6 +782,10 @@ static int qedi_task_xmit(struct iscsi_task *task)
 	struct qedi_cmd *cmd = task->dd_data;
 	struct scsi_cmnd *sc = task->sc;
 
+	/* Clear now so in cleanup_task we know it didn't make it */
+	cmd->scsi_cmd = NULL;
+	cmd->task_id = -1;
+
 	if (test_bit(QEDI_IN_SHUTDOWN, &qedi_conn->qedi->flags))
 		return -ENODEV;
 
@@ -1383,13 +1386,24 @@ static umode_t qedi_attr_is_visible(int param_type, int param)
 
 static void qedi_cleanup_task(struct iscsi_task *task)
 {
-	if (!task->sc || task->state == ISCSI_TASK_PENDING) {
+	struct qedi_cmd *cmd;
+
+	if (task->state == ISCSI_TASK_PENDING) {
 		QEDI_INFO(NULL, QEDI_LOG_IO, "Returning ref_cnt=%d\n",
 			  refcount_read(&task->refcount));
 		return;
 	}
 
-	qedi_iscsi_unmap_sg_list(task->dd_data);
+	if (task->sc)
+		qedi_iscsi_unmap_sg_list(task->dd_data);
+
+	cmd = task->dd_data;
+	if (cmd->task_id != -1)
+		qedi_clear_task_idx(iscsi_host_priv(task->conn->session->host),
+				    cmd->task_id);
+
+	cmd->task_id = -1;
+	cmd->scsi_cmd = NULL;
 }
 
 struct iscsi_transport qedi_iscsi_transport = {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 06/13] scsi: qedi: fix use after free during abort cleanup
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (4 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 07/13] scsi: qedi: fix tmf tid allocation Mike Christie
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

This fixes two bugs:

1. The scsi cmd task could be completed and the abort could timeout
while we are running qedi_tmf_work so we need to get a ref to the task.

2. If qedi_tmf_work's qedi_wait_for_cleanup_request call times out then
we will also force the clean up of the qedi_work_map but
qedi_process_cmd_cleanup_resp could still be accessing the qedi_cmd
for the abort tmf. We can then race where qedi_process_cmd_cleanup_resp is
still accessing the mtask's qedi_cmd but libiscsi has escalated to session
level cleanup and is cleaning up the mtask while we are still accessing it.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 110 ++++++++++++++++++---------------
 drivers/scsi/qedi/qedi_iscsi.h |   1 +
 2 files changed, 61 insertions(+), 50 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index ad4357e4c15d..c5699421ec37 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -729,7 +729,6 @@ static void qedi_process_nopin_local_cmpl(struct qedi_ctx *qedi,
 
 static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 					  struct iscsi_cqe_solicited *cqe,
-					  struct iscsi_task *task,
 					  struct iscsi_conn *conn)
 {
 	struct qedi_work_map *work, *work_tmp;
@@ -742,7 +741,7 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 	u32 iscsi_cid;
 	struct qedi_conn *qedi_conn;
 	struct qedi_cmd *dbg_cmd;
-	struct iscsi_task *mtask;
+	struct iscsi_task *mtask, *task;
 	struct iscsi_tm *tmf_hdr = NULL;
 
 	iscsi_cid = cqe->conn_id;
@@ -768,6 +767,7 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 			}
 			found = 1;
 			mtask = qedi_cmd->task;
+			task = work->ctask;
 			tmf_hdr = (struct iscsi_tm *)mtask->hdr;
 			rtid = work->rtid;
 
@@ -776,52 +776,47 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 			qedi_cmd->list_tmf_work = NULL;
 		}
 	}
-	spin_unlock_bh(&qedi_conn->tmf_work_lock);
-
-	if (found) {
-		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-			  "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n",
-			  proto_itt, tmf_hdr->flags, qedi_conn->iscsi_conn_id);
-
-		if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_ABORT_TASK) {
-			spin_lock_bh(&conn->session->back_lock);
 
-			protoitt = build_itt(get_itt(tmf_hdr->rtt),
-					     conn->session->age);
-			task = iscsi_itt_to_task(conn, protoitt);
-
-			spin_unlock_bh(&conn->session->back_lock);
+	if (!found) {
+		spin_unlock_bh(&qedi_conn->tmf_work_lock);
+		goto check_cleanup_reqs;
+	}
 
-			if (!task) {
-				QEDI_NOTICE(&qedi->dbg_ctx,
-					    "IO task completed, tmf rtt=0x%x, cid=0x%x\n",
-					    get_itt(tmf_hdr->rtt),
-					    qedi_conn->iscsi_conn_id);
-				return;
-			}
+	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
+		  "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n",
+		  proto_itt, tmf_hdr->flags, qedi_conn->iscsi_conn_id);
+
+	spin_lock_bh(&conn->session->back_lock);
+	if (iscsi_task_is_completed(task)) {
+		QEDI_NOTICE(&qedi->dbg_ctx,
+			    "IO task completed, tmf rtt=0x%x, cid=0x%x\n",
+			   get_itt(tmf_hdr->rtt), qedi_conn->iscsi_conn_id);
+		goto unlock;
+	}
 
-			dbg_cmd = task->dd_data;
+	dbg_cmd = task->dd_data;
 
-			QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-				  "Abort tmf rtt=0x%x, i/o itt=0x%x, i/o tid=0x%x, cid=0x%x\n",
-				  get_itt(tmf_hdr->rtt), get_itt(task->itt),
-				  dbg_cmd->task_id, qedi_conn->iscsi_conn_id);
+	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
+		  "Abort tmf rtt=0x%x, i/o itt=0x%x, i/o tid=0x%x, cid=0x%x\n",
+		  get_itt(tmf_hdr->rtt), get_itt(task->itt), dbg_cmd->task_id,
+		  qedi_conn->iscsi_conn_id);
 
-			if (qedi_cmd->state == CLEANUP_WAIT_FAILED)
-				qedi_cmd->state = CLEANUP_RECV;
+	spin_lock(&qedi_conn->list_lock);
+	if (likely(dbg_cmd->io_cmd_in_list)) {
+		dbg_cmd->io_cmd_in_list = false;
+		list_del_init(&dbg_cmd->io_cmd);
+		qedi_conn->active_cmd_count--;
+	}
+	spin_unlock(&qedi_conn->list_lock);
+	qedi_cmd->state = CLEANUP_RECV;
+unlock:
+	spin_unlock_bh(&conn->session->back_lock);
+	spin_unlock_bh(&qedi_conn->tmf_work_lock);
+	wake_up_interruptible(&qedi_conn->wait_queue);
+	return;
 
-			spin_lock(&qedi_conn->list_lock);
-			if (likely(dbg_cmd->io_cmd_in_list)) {
-				dbg_cmd->io_cmd_in_list = false;
-				list_del_init(&dbg_cmd->io_cmd);
-				qedi_conn->active_cmd_count--;
-			}
-			spin_unlock(&qedi_conn->list_lock);
-			qedi_cmd->state = CLEANUP_RECV;
-			wake_up_interruptible(&qedi_conn->wait_queue);
-		}
-	} else if (qedi_conn->cmd_cleanup_req > 0) {
+check_cleanup_reqs:
+	if (qedi_conn->cmd_cleanup_req > 0) {
 		spin_lock_bh(&conn->session->back_lock);
 		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
 		protoitt = build_itt(ptmp_itt, conn->session->age);
@@ -844,6 +839,7 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
+		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
 
 	} else {
 		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
@@ -946,8 +942,7 @@ void qedi_fp_process_cqes(struct qedi_work *work)
 		goto exit_fp_process;
 	case ISCSI_CQE_TYPE_TASK_CLEANUP:
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM, "CleanUp CqE\n");
-		qedi_process_cmd_cleanup_resp(qedi, &cqe->cqe_solicited, task,
-					      conn);
+		qedi_process_cmd_cleanup_resp(qedi, &cqe->cqe_solicited, conn);
 		goto exit_fp_process;
 	default:
 		QEDI_ERR(&qedi->dbg_ctx, "Error cqe.\n");
@@ -1374,12 +1369,22 @@ static void qedi_tmf_work(struct work_struct *work)
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
 	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 
-	ctask = iscsi_itt_to_task(conn, tmf_hdr->rtt);
-	if (!ctask || !ctask->sc) {
+	spin_lock_bh(&conn->session->back_lock);
+	ctask = iscsi_itt_to_ctask(conn, tmf_hdr->rtt);
+	if (!ctask || iscsi_task_is_completed(ctask)) {
+		spin_unlock_bh(&conn->session->back_lock);
 		QEDI_ERR(&qedi->dbg_ctx, "Task already completed\n");
-		goto abort_ret;
+		goto clear_cleanup;
 	}
 
+	/*
+	 * libiscsi gets a ref before sending the abort, but if libiscsi times
+	 * it out then it could release it and the cmd could complete from
+	 * under us.
+	 */
+	__iscsi_get_task(ctask);
+	spin_unlock_bh(&conn->session->back_lock);
+
 	cmd = (struct qedi_cmd *)ctask->dd_data;
 	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
 		  "Abort tmf rtt=0x%x, cmd itt=0x%x, cmd tid=0x%x, cid=0x%x\n",
@@ -1389,19 +1394,20 @@ static void qedi_tmf_work(struct work_struct *work)
 	if (qedi_do_not_recover) {
 		QEDI_ERR(&qedi->dbg_ctx, "DONT SEND CLEANUP/ABORT %d\n",
 			 qedi_do_not_recover);
-		goto abort_ret;
+		goto put_task;
 	}
 
 	list_work = kzalloc(sizeof(*list_work), GFP_ATOMIC);
 	if (!list_work) {
 		QEDI_ERR(&qedi->dbg_ctx, "Memory allocation failed\n");
-		goto abort_ret;
+		goto put_task;
 	}
 
 	qedi_cmd->type = TYPEIO;
 	list_work->qedi_cmd = qedi_cmd;
 	list_work->rtid = cmd->task_id;
 	list_work->state = QEDI_WORK_SCHEDULED;
+	list_work->ctask = ctask;
 	qedi_cmd->list_tmf_work = list_work;
 
 	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
@@ -1434,7 +1440,9 @@ static void qedi_tmf_work(struct work_struct *work)
 	qedi_cmd->task_id = tid;
 	qedi_send_iscsi_tmf(qedi_conn, qedi_cmd->task);
 
-abort_ret:
+put_task:
+	iscsi_put_task(ctask);
+clear_cleanup:
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 	return;
 
@@ -1455,6 +1463,8 @@ static void qedi_tmf_work(struct work_struct *work)
 	}
 	spin_unlock(&qedi_conn->list_lock);
 
+	iscsi_put_task(ctask);
+
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 }
 
diff --git a/drivers/scsi/qedi/qedi_iscsi.h b/drivers/scsi/qedi/qedi_iscsi.h
index 39dc27c85e3c..68ef519f5480 100644
--- a/drivers/scsi/qedi/qedi_iscsi.h
+++ b/drivers/scsi/qedi/qedi_iscsi.h
@@ -212,6 +212,7 @@ struct qedi_cmd {
 struct qedi_work_map {
 	struct list_head list;
 	struct qedi_cmd *qedi_cmd;
+	struct iscsi_task *ctask;
 	int rtid;
 
 	int state;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 07/13] scsi: qedi: fix tmf tid allocation
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (5 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 06/13] scsi: qedi: fix use after free during abort cleanup Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 08/13] scsi: qedi: use GFP_NOIO for tmf allocation Mike Christie
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

qedi_iscsi_abort_work and qedi_tmf_work allocates a tid then calls
qedi_send_iscsi_tmf which also allcoates a tid. This removes the tid
allocation from the callers.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 76 ++++++++++------------------------
 drivers/scsi/qedi/qedi_gbl.h   |  3 +-
 drivers/scsi/qedi/qedi_iscsi.c |  2 +-
 3 files changed, 25 insertions(+), 56 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index c5699421ec37..542255c94d96 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -14,8 +14,8 @@
 #include "qedi_fw_iscsi.h"
 #include "qedi_fw_scsi.h"
 
-static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
-			       struct iscsi_task *mtask);
+static int send_iscsi_tmf(struct qedi_conn *qedi_conn,
+			  struct iscsi_task *mtask);
 
 void qedi_iscsi_unmap_sg_list(struct qedi_cmd *cmd)
 {
@@ -1350,7 +1350,7 @@ static int qedi_wait_for_cleanup_request(struct qedi_ctx *qedi,
 	return 0;
 }
 
-static void qedi_tmf_work(struct work_struct *work)
+static void qedi_abort_work(struct work_struct *work)
 {
 	struct qedi_cmd *qedi_cmd =
 		container_of(work, struct qedi_cmd, tmf_work);
@@ -1363,7 +1363,6 @@ static void qedi_tmf_work(struct work_struct *work)
 	struct iscsi_task *ctask;
 	struct iscsi_tm *tmf_hdr;
 	s16 rval = 0;
-	s16 tid = 0;
 
 	mtask = qedi_cmd->task;
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
@@ -1404,6 +1403,7 @@ static void qedi_tmf_work(struct work_struct *work)
 	}
 
 	qedi_cmd->type = TYPEIO;
+	qedi_cmd->state = CLEANUP_WAIT;
 	list_work->qedi_cmd = qedi_cmd;
 	list_work->rtid = cmd->task_id;
 	list_work->state = QEDI_WORK_SCHEDULED;
@@ -1430,15 +1430,7 @@ static void qedi_tmf_work(struct work_struct *work)
 		goto ldel_exit;
 	}
 
-	tid = qedi_get_task_idx(qedi);
-	if (tid == -1) {
-		QEDI_ERR(&qedi->dbg_ctx, "Invalid tid, cid=0x%x\n",
-			 qedi_conn->iscsi_conn_id);
-		goto ldel_exit;
-	}
-
-	qedi_cmd->task_id = tid;
-	qedi_send_iscsi_tmf(qedi_conn, qedi_cmd->task);
+	send_iscsi_tmf(qedi_conn, qedi_cmd->task);
 
 put_task:
 	iscsi_put_task(ctask);
@@ -1468,8 +1460,7 @@ static void qedi_tmf_work(struct work_struct *work)
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 }
 
-static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
-			       struct iscsi_task *mtask)
+static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 {
 	struct iscsi_tmf_request_hdr tmf_pdu_header;
 	struct iscsi_task_params task_params;
@@ -1484,7 +1475,6 @@ static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
 	u32 scsi_lun[2];
 	s16 tid = 0;
 	u16 sq_idx = 0;
-	int rval = 0;
 
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
 	qedi_cmd = (struct qedi_cmd *)mtask->dd_data;
@@ -1548,10 +1538,7 @@ static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
 	task_params.sqe = &ep->sq[sq_idx];
 
 	memset(task_params.sqe, 0, sizeof(struct iscsi_wqe));
-	rval = init_initiator_tmf_request_task(&task_params,
-					       &tmf_pdu_header);
-	if (rval)
-		return -1;
+	init_initiator_tmf_request_task(&task_params, &tmf_pdu_header);
 
 	spin_lock(&qedi_conn->list_lock);
 	list_add_tail(&qedi_cmd->io_cmd, &qedi_conn->active_cmd_list);
@@ -1563,47 +1550,30 @@ static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
 	return 0;
 }
 
-int qedi_iscsi_abort_work(struct qedi_conn *qedi_conn,
-			  struct iscsi_task *mtask)
+int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 {
+	struct iscsi_tm *tmf_hdr = (struct iscsi_tm *)mtask->hdr;
+	struct qedi_cmd *qedi_cmd = mtask->dd_data;
 	struct qedi_ctx *qedi = qedi_conn->qedi;
-	struct iscsi_tm *tmf_hdr;
-	struct qedi_cmd *qedi_cmd = (struct qedi_cmd *)mtask->dd_data;
-	s16 tid = 0;
+	int rc = 0;
 
-	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
-	qedi_cmd->task = mtask;
-
-	/* If abort task then schedule the work and return */
-	if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	    ISCSI_TM_FUNC_ABORT_TASK) {
-		qedi_cmd->state = CLEANUP_WAIT;
-		INIT_WORK(&qedi_cmd->tmf_work, qedi_tmf_work);
+	switch (tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) {
+	case ISCSI_TM_FUNC_ABORT_TASK:
+		INIT_WORK(&qedi_cmd->tmf_work, qedi_abort_work);
 		queue_work(qedi->tmf_thread, &qedi_cmd->tmf_work);
-
-	} else if (((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_LOGICAL_UNIT_RESET) ||
-		   ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_TARGET_WARM_RESET) ||
-		   ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_TARGET_COLD_RESET)) {
-		tid = qedi_get_task_idx(qedi);
-		if (tid == -1) {
-			QEDI_ERR(&qedi->dbg_ctx, "Invalid tid, cid=0x%x\n",
-				 qedi_conn->iscsi_conn_id);
-			return -1;
-		}
-		qedi_cmd->task_id = tid;
-
-		qedi_send_iscsi_tmf(qedi_conn, qedi_cmd->task);
-
-	} else {
+		break;
+	case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
+	case ISCSI_TM_FUNC_TARGET_WARM_RESET:
+	case ISCSI_TM_FUNC_TARGET_COLD_RESET:
+		rc = send_iscsi_tmf(qedi_conn, mtask);
+		break;
+	default:
 		QEDI_ERR(&qedi->dbg_ctx, "Invalid tmf, cid=0x%x\n",
 			 qedi_conn->iscsi_conn_id);
-		return -1;
+		return -EINVAL;
 	}
 
-	return 0;
+	return rc;
 }
 
 int qedi_send_iscsi_text(struct qedi_conn *qedi_conn,
diff --git a/drivers/scsi/qedi/qedi_gbl.h b/drivers/scsi/qedi/qedi_gbl.h
index 116645c08c71..fb44a282613e 100644
--- a/drivers/scsi/qedi/qedi_gbl.h
+++ b/drivers/scsi/qedi/qedi_gbl.h
@@ -31,8 +31,7 @@ int qedi_send_iscsi_login(struct qedi_conn *qedi_conn,
 			  struct iscsi_task *task);
 int qedi_send_iscsi_logout(struct qedi_conn *qedi_conn,
 			   struct iscsi_task *task);
-int qedi_iscsi_abort_work(struct qedi_conn *qedi_conn,
-			  struct iscsi_task *mtask);
+int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask);
 int qedi_send_iscsi_text(struct qedi_conn *qedi_conn,
 			 struct iscsi_task *task);
 int qedi_send_iscsi_nopout(struct qedi_conn *qedi_conn,
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 9b794afbdead..b06ebbb3ed39 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -742,7 +742,7 @@ static int qedi_iscsi_send_generic_request(struct iscsi_task *task)
 		rc = qedi_send_iscsi_logout(qedi_conn, task);
 		break;
 	case ISCSI_OP_SCSI_TMFUNC:
-		rc = qedi_iscsi_abort_work(qedi_conn, task);
+		rc = qedi_send_iscsi_tmf(qedi_conn, task);
 		break;
 	case ISCSI_OP_TEXT:
 		rc = qedi_send_iscsi_text(qedi_conn, task);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 08/13] scsi: qedi: use GFP_NOIO for tmf allocation
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (6 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 07/13] scsi: qedi: fix tmf tid allocation Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 09/13] scsi: qedi: fix session block/unblock abuse during tmf handling Mike Christie
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

We run from a workqueue with no locks held so use GFP_NOIO.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 542255c94d96..84402e827d42 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -1396,7 +1396,7 @@ static void qedi_abort_work(struct work_struct *work)
 		goto put_task;
 	}
 
-	list_work = kzalloc(sizeof(*list_work), GFP_ATOMIC);
+	list_work = kzalloc(sizeof(*list_work), GFP_NOIO);
 	if (!list_work) {
 		QEDI_ERR(&qedi->dbg_ctx, "Memory allocation failed\n");
 		goto put_task;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 09/13] scsi: qedi: fix session block/unblock abuse during tmf handling
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (7 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 08/13] scsi: qedi: use GFP_NOIO for tmf allocation Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 10/13] scsi: qedi: fix session block/unblock abuse during cleanup Mike Christie
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

Drivers shouldn't be calling block/unblock session for tmf handling
because the functions can change the session state from under libiscsi.
We now block the session for the drivers during tmf processing, so we can
remove these calls.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 84402e827d42..f13f8af6d931 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -159,14 +159,9 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 	resp_hdr_ptr =  (struct iscsi_tm_rsp *)qedi_cmd->tmf_resp_buf;
 
-	iscsi_block_session(session->cls_session);
 	rval = qedi_cleanup_all_io(qedi, qedi_conn, qedi_cmd->task, true);
-	if (rval) {
-		iscsi_unblock_session(session->cls_session);
+	if (rval)
 		goto exit_tmf_resp;
-	}
-
-	iscsi_unblock_session(session->cls_session);
 
 	spin_lock(&session->back_lock);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 10/13] scsi: qedi: fix session block/unblock abuse during cleanup
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (8 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 09/13] scsi: qedi: fix session block/unblock abuse during tmf handling Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 11/13] scsi: qedi: pass send_iscsi_tmf task to abort Mike Christie
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

Drivers shouldn't be calling block/unblock session for cmd cleanup
because the functions can change the session state from under libiscsi.
This adds a new a driver level bit so it can block all IO the host
while it drains the card.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi.h       |  1 +
 drivers/scsi/qedi/qedi_iscsi.c | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qedi/qedi.h b/drivers/scsi/qedi/qedi.h
index c342defc3f52..ce199a7a16b8 100644
--- a/drivers/scsi/qedi/qedi.h
+++ b/drivers/scsi/qedi/qedi.h
@@ -284,6 +284,7 @@ struct qedi_ctx {
 #define QEDI_IN_RECOVERY	5
 #define QEDI_IN_OFFLINE		6
 #define QEDI_IN_SHUTDOWN	7
+#define QEDI_BLOCK_IO		8
 
 	u8 mac[ETH_ALEN];
 	u32 src_ip[4];
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index b06ebbb3ed39..9a2d9a29fc01 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -330,12 +330,22 @@ qedi_conn_create(struct iscsi_cls_session *cls_session, uint32_t cid)
 
 void qedi_mark_device_missing(struct iscsi_cls_session *cls_session)
 {
-	iscsi_block_session(cls_session);
+	struct iscsi_session *session = cls_session->dd_data;
+	struct qedi_conn *qedi_conn = session->leadconn->dd_data;
+
+	spin_lock_bh(&session->frwd_lock);
+	set_bit(QEDI_BLOCK_IO, &qedi_conn->qedi->flags);
+	spin_unlock_bh(&session->frwd_lock);
 }
 
 void qedi_mark_device_available(struct iscsi_cls_session *cls_session)
 {
-	iscsi_unblock_session(cls_session);
+	struct iscsi_session *session = cls_session->dd_data;
+	struct qedi_conn *qedi_conn = session->leadconn->dd_data;
+
+	spin_lock_bh(&session->frwd_lock);
+	clear_bit(QEDI_BLOCK_IO, &qedi_conn->qedi->flags);
+	spin_unlock_bh(&session->frwd_lock);
 }
 
 static int qedi_bind_conn_to_iscsi_cid(struct qedi_ctx *qedi,
@@ -789,6 +799,9 @@ static int qedi_task_xmit(struct iscsi_task *task)
 	if (test_bit(QEDI_IN_SHUTDOWN, &qedi_conn->qedi->flags))
 		return -ENODEV;
 
+	if (test_bit(QEDI_BLOCK_IO, &qedi_conn->qedi->flags))
+		return -EACCES;
+
 	cmd->state = 0;
 	cmd->task = NULL;
 	cmd->use_slowpath = false;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 11/13] scsi: qedi: pass send_iscsi_tmf task to abort
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (9 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 10/13] scsi: qedi: fix session block/unblock abuse during cleanup Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 12/13] scsi: qedi: make sure tmf works are done before disconnect Mike Christie
  2021-04-13 23:06 ` [PATCH 13/13] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

qedi_abort_work knows what task to abort so just pass it to send_iscsi_tmf.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index f13f8af6d931..475cb7823cf1 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -15,7 +15,7 @@
 #include "qedi_fw_scsi.h"
 
 static int send_iscsi_tmf(struct qedi_conn *qedi_conn,
-			  struct iscsi_task *mtask);
+			  struct iscsi_task *mtask, struct iscsi_task *ctask);
 
 void qedi_iscsi_unmap_sg_list(struct qedi_cmd *cmd)
 {
@@ -1425,7 +1425,7 @@ static void qedi_abort_work(struct work_struct *work)
 		goto ldel_exit;
 	}
 
-	send_iscsi_tmf(qedi_conn, qedi_cmd->task);
+	send_iscsi_tmf(qedi_conn, qedi_cmd->task, ctask);
 
 put_task:
 	iscsi_put_task(ctask);
@@ -1455,14 +1455,13 @@ static void qedi_abort_work(struct work_struct *work)
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 }
 
-static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
+static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask,
+			  struct iscsi_task *ctask)
 {
 	struct iscsi_tmf_request_hdr tmf_pdu_header;
 	struct iscsi_task_params task_params;
 	struct qedi_ctx *qedi = qedi_conn->qedi;
 	struct e4_iscsi_task_context *fw_task_ctx;
-	struct iscsi_conn *conn = qedi_conn->cls_conn->dd_data;
-	struct iscsi_task *ctask;
 	struct iscsi_tm *tmf_hdr;
 	struct qedi_cmd *qedi_cmd;
 	struct qedi_cmd *cmd;
@@ -1502,12 +1501,6 @@ static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 
 	if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
 	     ISCSI_TM_FUNC_ABORT_TASK) {
-		ctask = iscsi_itt_to_task(conn, tmf_hdr->rtt);
-		if (!ctask || !ctask->sc) {
-			QEDI_ERR(&qedi->dbg_ctx,
-				 "Could not get reference task\n");
-			return 0;
-		}
 		cmd = (struct qedi_cmd *)ctask->dd_data;
 		tmf_pdu_header.rtt =
 				qedi_set_itt(cmd->task_id,
@@ -1560,7 +1553,7 @@ int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 	case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
 	case ISCSI_TM_FUNC_TARGET_WARM_RESET:
 	case ISCSI_TM_FUNC_TARGET_COLD_RESET:
-		rc = send_iscsi_tmf(qedi_conn, mtask);
+		rc = send_iscsi_tmf(qedi_conn, mtask, NULL);
 		break;
 	default:
 		QEDI_ERR(&qedi->dbg_ctx, "Invalid tmf, cid=0x%x\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 12/13] scsi: qedi: make sure tmf works are done before disconnect
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (10 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 11/13] scsi: qedi: pass send_iscsi_tmf task to abort Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  2021-04-13 23:06 ` [PATCH 13/13] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

We need to make sure that abort and reset completion work has completed
before ep_disconnect returns. After ep_disconnect we can't manipulate
cmds because libiscsi will call conn_stop and take onwership.

We are trying to make sure abort work and reset completion work has
completed before we do the cmd clean up in ep_disconnect. The problem is
that:

1. the work function sets the QEDI_CONN_FW_CLEANUP bit, so if the work
was still pending we would not see the bit set. We need to do this before
the work is queued.

2. If we had multiple works queued then we could break from the loop in
qedi_ep_disconnect early because when abort work 1 completes it could
clear QEDI_CONN_FW_CLEANUP. qedi_ep_disconnect could then see that before
work 2 has run.

3. A tmf reset completion work could run after ep_disconnect starts
cleaning up cmds via qedi_clearsq. ep_disconnect's call to qedi_clearsq
-> qedi_cleanup_all_io would might think it's done cleaning up cmds,
but the reset completion work could still be running. We then return
from ep_disconnect while still doing cleanup.

This replaces the bit with a counter and adds a bool to prevent new
works from starting from the completion path once a ep_disconnect starts.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 46 +++++++++++++++++++++-------------
 drivers/scsi/qedi/qedi_iscsi.c | 23 +++++++++++------
 drivers/scsi/qedi/qedi_iscsi.h |  4 +--
 3 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 475cb7823cf1..13dd06915d74 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -156,7 +156,6 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 	struct iscsi_tm_rsp *resp_hdr_ptr;
 	int rval = 0;
 
-	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 	resp_hdr_ptr =  (struct iscsi_tm_rsp *)qedi_cmd->tmf_resp_buf;
 
 	rval = qedi_cleanup_all_io(qedi, qedi_conn, qedi_cmd->task, true);
@@ -169,7 +168,10 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 
 exit_tmf_resp:
 	kfree(resp_hdr_ptr);
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
+
+	spin_lock(&qedi_conn->tmf_work_lock);
+	qedi_conn->fw_cleanup_works--;
+	spin_unlock(&qedi_conn->tmf_work_lock);
 }
 
 static void qedi_process_tmf_resp(struct qedi_ctx *qedi,
@@ -225,16 +227,25 @@ static void qedi_process_tmf_resp(struct qedi_ctx *qedi,
 	}
 	spin_unlock(&qedi_conn->list_lock);
 
-	if (((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	      ISCSI_TM_FUNC_LOGICAL_UNIT_RESET) ||
-	    ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	      ISCSI_TM_FUNC_TARGET_WARM_RESET) ||
-	    ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	      ISCSI_TM_FUNC_TARGET_COLD_RESET)) {
+	spin_lock(&qedi_conn->tmf_work_lock);
+	switch (tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) {
+	case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
+	case ISCSI_TM_FUNC_TARGET_WARM_RESET:
+	case ISCSI_TM_FUNC_TARGET_COLD_RESET:
+		if (qedi_conn->ep_disconnect_starting) {
+			/* Session is down so ep_disconnect will clean up */
+			spin_unlock(&qedi_conn->tmf_work_lock);
+			goto unblock_sess;
+		}
+
+		qedi_conn->fw_cleanup_works++;
+		spin_unlock(&qedi_conn->tmf_work_lock);
+
 		INIT_WORK(&qedi_cmd->tmf_work, qedi_tmf_resp_work);
 		queue_work(qedi->tmf_thread, &qedi_cmd->tmf_work);
 		goto unblock_sess;
 	}
+	spin_unlock(&qedi_conn->tmf_work_lock);
 
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
 	kfree(resp_hdr_ptr);
@@ -1361,7 +1372,6 @@ static void qedi_abort_work(struct work_struct *work)
 
 	mtask = qedi_cmd->task;
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
-	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 
 	spin_lock_bh(&conn->session->back_lock);
 	ctask = iscsi_itt_to_ctask(conn, tmf_hdr->rtt);
@@ -1427,11 +1437,7 @@ static void qedi_abort_work(struct work_struct *work)
 
 	send_iscsi_tmf(qedi_conn, qedi_cmd->task, ctask);
 
-put_task:
-	iscsi_put_task(ctask);
-clear_cleanup:
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
-	return;
+	goto put_task;
 
 ldel_exit:
 	spin_lock_bh(&qedi_conn->tmf_work_lock);
@@ -1449,10 +1455,12 @@ static void qedi_abort_work(struct work_struct *work)
 		qedi_conn->active_cmd_count--;
 	}
 	spin_unlock(&qedi_conn->list_lock);
-
+put_task:
 	iscsi_put_task(ctask);
-
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
+clear_cleanup:
+	spin_lock(&qedi_conn->tmf_work_lock);
+	qedi_conn->fw_cleanup_works--;
+	spin_unlock(&qedi_conn->tmf_work_lock);
 }
 
 static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask,
@@ -1547,6 +1555,10 @@ int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 
 	switch (tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) {
 	case ISCSI_TM_FUNC_ABORT_TASK:
+		spin_lock(&qedi_conn->tmf_work_lock);
+		qedi_conn->fw_cleanup_works++;
+		spin_unlock(&qedi_conn->tmf_work_lock);
+
 		INIT_WORK(&qedi_cmd->tmf_work, qedi_abort_work);
 		queue_work(qedi->tmf_thread, &qedi_cmd->tmf_work);
 		break;
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 9a2d9a29fc01..17f1cbb376a4 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -592,7 +592,11 @@ static int qedi_conn_start(struct iscsi_cls_conn *cls_conn)
 		goto start_err;
 	}
 
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
+	spin_lock(&qedi_conn->tmf_work_lock);
+	qedi_conn->fw_cleanup_works = 0;
+	qedi_conn->ep_disconnect_starting = false;
+	spin_unlock(&qedi_conn->tmf_work_lock);
+
 	qedi_conn->abrt_conn = 0;
 
 	rval = iscsi_conn_start(cls_conn);
@@ -1009,7 +1013,6 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 	int ret = 0;
 	int wait_delay;
 	int abrt_conn = 0;
-	int count = 10;
 
 	wait_delay = 60 * HZ + DEF_MAX_RT_TIME;
 	qedi_ep = ep->dd_data;
@@ -1027,13 +1030,19 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 		iscsi_ep_prep_disconnect(conn);
 		abrt_conn = qedi_conn->abrt_conn;
 
-		while (count--)	{
-			if (!test_bit(QEDI_CONN_FW_CLEANUP,
-				      &qedi_conn->flags)) {
-				break;
-			}
+		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
+			  "cid=0x%x qedi_ep=%p waiting for %d tmfs\n",
+			  qedi_ep->iscsi_cid, qedi_ep,
+			  qedi_conn->fw_cleanup_works);
+
+		spin_lock(&qedi_conn->tmf_work_lock);
+		qedi_conn->ep_disconnect_starting = true;
+		while (qedi_conn->fw_cleanup_works > 0) {
+			spin_unlock(&qedi_conn->tmf_work_lock);
 			msleep(1000);
+			spin_lock(&qedi_conn->tmf_work_lock);
 		}
+		spin_unlock(&qedi_conn->tmf_work_lock);
 
 		if (test_bit(QEDI_IN_RECOVERY, &qedi->flags)) {
 			if (qedi_do_not_recover) {
diff --git a/drivers/scsi/qedi/qedi_iscsi.h b/drivers/scsi/qedi/qedi_iscsi.h
index 68ef519f5480..758735209e15 100644
--- a/drivers/scsi/qedi/qedi_iscsi.h
+++ b/drivers/scsi/qedi/qedi_iscsi.h
@@ -169,8 +169,8 @@ struct qedi_conn {
 	struct list_head tmf_work_list;
 	wait_queue_head_t wait_queue;
 	spinlock_t tmf_work_lock;	/* tmf work lock */
-	unsigned long flags;
-#define QEDI_CONN_FW_CLEANUP	1
+	bool ep_disconnect_starting;
+	int fw_cleanup_works;
 };
 
 struct qedi_cmd {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 13/13] scsi: qedi: always wake up if cmd_cleanup_req is set
  2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
                   ` (11 preceding siblings ...)
  2021-04-13 23:06 ` [PATCH 12/13] scsi: qedi: make sure tmf works are done before disconnect Mike Christie
@ 2021-04-13 23:06 ` Mike Christie
  12 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-13 23:06 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we got a response then we should always wake up the conn. For both
the cmd_cleanup_req == 0 or cmd_cleanup_req > 0, we shouldn't dig into
iscsi_itt_to_task because we don't know what the upper layers are doing.

We can also remove the qedi_clear_task_idx call here because once we
signal success libiscsi will loop over the affected commands and end
up calling the cleanup_task callout which will release it.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 31 ++++---------------------------
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 13dd06915d74..13d1250951a6 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -739,7 +739,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 {
 	struct qedi_work_map *work, *work_tmp;
 	u32 proto_itt = cqe->itid;
-	u32 ptmp_itt = 0;
 	itt_t protoitt = 0;
 	int found = 0;
 	struct qedi_cmd *qedi_cmd = NULL;
@@ -823,37 +822,15 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 
 check_cleanup_reqs:
 	if (qedi_conn->cmd_cleanup_req > 0) {
-		spin_lock_bh(&conn->session->back_lock);
-		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
-		protoitt = build_itt(ptmp_itt, conn->session->age);
-		task = iscsi_itt_to_task(conn, protoitt);
-		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-			  "cleanup io itid=0x%x, protoitt=0x%x, cmd_cleanup_cmpl=%d, cid=0x%x\n",
-			  cqe->itid, protoitt, qedi_conn->cmd_cleanup_cmpl,
-			  qedi_conn->iscsi_conn_id);
-
-		spin_unlock_bh(&conn->session->back_lock);
-		if (!task) {
-			QEDI_NOTICE(&qedi->dbg_ctx,
-				    "task is null, itid=0x%x, cid=0x%x\n",
-				    cqe->itid, qedi_conn->iscsi_conn_id);
-			return;
-		}
-		qedi_conn->cmd_cleanup_cmpl++;
-		wake_up(&qedi_conn->wait_queue);
-
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
-		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
-
+		qedi_conn->cmd_cleanup_cmpl++;
+		wake_up(&qedi_conn->wait_queue);
 	} else {
-		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
-		protoitt = build_itt(ptmp_itt, conn->session->age);
-		task = iscsi_itt_to_task(conn, protoitt);
 		QEDI_ERR(&qedi->dbg_ctx,
-			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x, task=%p\n",
-			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id, task);
+			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x\n",
+			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id);
 	}
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling
  2021-04-13 23:06 ` [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling Mike Christie
@ 2021-04-14  4:17     ` kernel test robot
  2021-04-14  4:17     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-14  4:17 UTC (permalink / raw)
  To: Mike Christie, lduncan, martin.petersen, mrangankar, svernekar,
	linux-scsi, jejb
  Cc: kbuild-all, Mike Christie

[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: i386-randconfig-c001-20210413 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


cocci warnings: (new ones prefixed by >>)
>> drivers/scsi/libiscsi.c:1737:11-12: WARNING: return of 0/1 in function 'iscsi_eh_running' with return type bool

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34921 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling
@ 2021-04-14  4:17     ` kernel test robot
  0 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-14  4:17 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1135 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: i386-randconfig-c001-20210413 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


cocci warnings: (new ones prefixed by >>)
>> drivers/scsi/libiscsi.c:1737:11-12: WARNING: return of 0/1 in function 'iscsi_eh_running' with return type bool

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 34921 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] scsi: iscsi: fix boolreturn.cocci warnings
  2021-04-13 23:06 ` [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling Mike Christie
@ 2021-04-14  4:17     ` kernel test robot
  2021-04-14  4:17     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-14  4:17 UTC (permalink / raw)
  To: Mike Christie, lduncan, martin.petersen, mrangankar, svernekar,
	linux-scsi, jejb
  Cc: kbuild-all, Mike Christie

From: kernel test robot <lkp@intel.com>

drivers/scsi/libiscsi.c:1737:11-12: WARNING: return of 0/1 in function 'iscsi_eh_running' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

CC: Mike Christie <michael.christie@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
---

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next

 libiscsi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1734,7 +1734,7 @@ static bool iscsi_eh_running(struct iscs
 		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
 		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
 			if (sc->device->lun != scsilun_to_int(&tmf->lun))
-				return 0;
+				return false;
 
 			ISCSI_DBG_EH(conn->session,
 				     "Requeue cmd sent during LU RESET processing.\n");

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] scsi: iscsi: fix boolreturn.cocci warnings
@ 2021-04-14  4:17     ` kernel test robot
  0 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-14  4:17 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1120 bytes --]

From: kernel test robot <lkp@intel.com>

drivers/scsi/libiscsi.c:1737:11-12: WARNING: return of 0/1 in function 'iscsi_eh_running' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

CC: Mike Christie <michael.christie@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
---

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next

 libiscsi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1734,7 +1734,7 @@ static bool iscsi_eh_running(struct iscs
 		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
 		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
 			if (sc->device->lun != scsilun_to_int(&tmf->lun))
-				return 0;
+				return false;
 
 			ISCSI_DBG_EH(conn->session,
 				     "Requeue cmd sent during LU RESET processing.\n");

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
  2021-04-13 23:06 ` [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race Mike Christie
@ 2021-04-14 12:19     ` kernel test robot
  2021-04-15 14:21     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-14 12:19 UTC (permalink / raw)
  To: Mike Christie, lduncan, martin.petersen, mrangankar, svernekar,
	linux-scsi, jejb
  Cc: kbuild-all, Mike Christie

[-- Attachment #1: Type: text/plain, Size: 1722 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-randconfig-m001-20210414 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

smatch warnings:
drivers/scsi/qedi/qedi_iscsi.c:1401 qedi_cleanup_task() warn: always true condition '(cmd->task_id != -1) => (0-u16max != (-1))'

vim +1401 drivers/scsi/qedi/qedi_iscsi.c

  1386	
  1387	static void qedi_cleanup_task(struct iscsi_task *task)
  1388	{
  1389		struct qedi_cmd *cmd;
  1390	
  1391		if (task->state == ISCSI_TASK_PENDING) {
  1392			QEDI_INFO(NULL, QEDI_LOG_IO, "Returning ref_cnt=%d\n",
  1393				  refcount_read(&task->refcount));
  1394			return;
  1395		}
  1396	
  1397		if (task->sc)
  1398			qedi_iscsi_unmap_sg_list(task->dd_data);
  1399	
  1400		cmd = task->dd_data;
> 1401		if (cmd->task_id != -1)
  1402			qedi_clear_task_idx(iscsi_host_priv(task->conn->session->host),
  1403					    cmd->task_id);
  1404	
  1405		cmd->task_id = -1;
  1406		cmd->scsi_cmd = NULL;
  1407	}
  1408	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34076 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
@ 2021-04-14 12:19     ` kernel test robot
  0 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-14 12:19 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1774 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-randconfig-m001-20210414 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

smatch warnings:
drivers/scsi/qedi/qedi_iscsi.c:1401 qedi_cleanup_task() warn: always true condition '(cmd->task_id != -1) => (0-u16max != (-1))'

vim +1401 drivers/scsi/qedi/qedi_iscsi.c

  1386	
  1387	static void qedi_cleanup_task(struct iscsi_task *task)
  1388	{
  1389		struct qedi_cmd *cmd;
  1390	
  1391		if (task->state == ISCSI_TASK_PENDING) {
  1392			QEDI_INFO(NULL, QEDI_LOG_IO, "Returning ref_cnt=%d\n",
  1393				  refcount_read(&task->refcount));
  1394			return;
  1395		}
  1396	
  1397		if (task->sc)
  1398			qedi_iscsi_unmap_sg_list(task->dd_data);
  1399	
  1400		cmd = task->dd_data;
> 1401		if (cmd->task_id != -1)
  1402			qedi_clear_task_idx(iscsi_host_priv(task->conn->session->host),
  1403					    cmd->task_id);
  1404	
  1405		cmd->task_id = -1;
  1406		cmd->scsi_cmd = NULL;
  1407	}
  1408	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 34076 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
  2021-04-14 12:19     ` kernel test robot
@ 2021-04-14 14:58       ` Mike Christie
  -1 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-14 14:58 UTC (permalink / raw)
  To: kernel test robot, lduncan, martin.petersen, mrangankar,
	svernekar, linux-scsi, jejb
  Cc: kbuild-all

On 4/14/21 7:19 AM, kernel test robot wrote:
> Hi Mike,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on mkp-scsi/for-next]
> [also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210413]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://urldefense.com/v3/__https://git-scm.com/docs/git-format-patch__;!!GqivPVa7Brio!OZEBNc0F9LSpphInzwL5oSUljGf2ZBn4C0J0SckH9hGW1t0C6b_iBOEFrpR9CJQaMAkU$ ]
> 
> url:    https://urldefense.com/v3/__https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516__;!!GqivPVa7Brio!OZEBNc0F9LSpphInzwL5oSUljGf2ZBn4C0J0SckH9hGW1t0C6b_iBOEFrpR9CG2g2-Ve$ 
> base:   https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git__;!!GqivPVa7Brio!OZEBNc0F9LSpphInzwL5oSUljGf2ZBn4C0J0SckH9hGW1t0C6b_iBOEFrpR9CGv0P6EJ$  for-next
> config: x86_64-randconfig-m001-20210414 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> smatch warnings:
> drivers/scsi/qedi/qedi_iscsi.c:1401 qedi_cleanup_task() warn: always true condition '(cmd->task_id != -1) => (0-u16max != (-1))'
> 

That and the qedi_task_xmit one were supposed to be U16_MAXs.

Will resend.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
@ 2021-04-14 14:58       ` Mike Christie
  0 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-14 14:58 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1466 bytes --]

On 4/14/21 7:19 AM, kernel test robot wrote:
> Hi Mike,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on mkp-scsi/for-next]
> [also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210413]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://urldefense.com/v3/__https://git-scm.com/docs/git-format-patch__;!!GqivPVa7Brio!OZEBNc0F9LSpphInzwL5oSUljGf2ZBn4C0J0SckH9hGW1t0C6b_iBOEFrpR9CJQaMAkU$ ]
> 
> url:    https://urldefense.com/v3/__https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516__;!!GqivPVa7Brio!OZEBNc0F9LSpphInzwL5oSUljGf2ZBn4C0J0SckH9hGW1t0C6b_iBOEFrpR9CG2g2-Ve$ 
> base:   https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git__;!!GqivPVa7Brio!OZEBNc0F9LSpphInzwL5oSUljGf2ZBn4C0J0SckH9hGW1t0C6b_iBOEFrpR9CGv0P6EJ$  for-next
> config: x86_64-randconfig-m001-20210414 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> smatch warnings:
> drivers/scsi/qedi/qedi_iscsi.c:1401 qedi_cleanup_task() warn: always true condition '(cmd->task_id != -1) => (0-u16max != (-1))'
> 

That and the qedi_task_xmit one were supposed to be U16_MAXs.

Will resend.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
  2021-04-13 23:06 ` [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race Mike Christie
@ 2021-04-15 14:21     ` kernel test robot
  2021-04-15 14:21     ` kernel test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-15 14:21 UTC (permalink / raw)
  To: Mike Christie, lduncan, martin.petersen, mrangankar, svernekar,
	linux-scsi, jejb
  Cc: kbuild-all, clang-built-linux, Mike Christie

[-- Attachment #1: Type: text/plain, Size: 2737 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210415]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-randconfig-a015-20210415 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 6a18cc23efad410db48a3ccfc233d215de7d4cb9)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/8f3fc14a13ee50e57e645238aad4a5177e097c10
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
        git checkout 8f3fc14a13ee50e57e645238aad4a5177e097c10
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/scsi/qedi/qedi_iscsi.c:1401:19: warning: result of comparison of constant -1 with expression of type 'u16' (aka 'unsigned short') is always true [-Wtautological-constant-out-of-range-compare]
           if (cmd->task_id != -1)
               ~~~~~~~~~~~~ ^  ~~
   1 warning generated.


vim +1401 drivers/scsi/qedi/qedi_iscsi.c

  1386	
  1387	static void qedi_cleanup_task(struct iscsi_task *task)
  1388	{
  1389		struct qedi_cmd *cmd;
  1390	
  1391		if (task->state == ISCSI_TASK_PENDING) {
  1392			QEDI_INFO(NULL, QEDI_LOG_IO, "Returning ref_cnt=%d\n",
  1393				  refcount_read(&task->refcount));
  1394			return;
  1395		}
  1396	
  1397		if (task->sc)
  1398			qedi_iscsi_unmap_sg_list(task->dd_data);
  1399	
  1400		cmd = task->dd_data;
> 1401		if (cmd->task_id != -1)
  1402			qedi_clear_task_idx(iscsi_host_priv(task->conn->session->host),
  1403					    cmd->task_id);
  1404	
  1405		cmd->task_id = -1;
  1406		cmd->scsi_cmd = NULL;
  1407	}
  1408	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36370 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race
@ 2021-04-15 14:21     ` kernel test robot
  0 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2021-04-15 14:21 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2805 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on scsi/for-next rdma/for-next v5.12-rc7 next-20210415]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-randconfig-a015-20210415 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 6a18cc23efad410db48a3ccfc233d215de7d4cb9)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/8f3fc14a13ee50e57e645238aad4a5177e097c10
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mike-Christie/scsi-libicsi-and-qedi-tmf-fixes/20210414-072516
        git checkout 8f3fc14a13ee50e57e645238aad4a5177e097c10
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/scsi/qedi/qedi_iscsi.c:1401:19: warning: result of comparison of constant -1 with expression of type 'u16' (aka 'unsigned short') is always true [-Wtautological-constant-out-of-range-compare]
           if (cmd->task_id != -1)
               ~~~~~~~~~~~~ ^  ~~
   1 warning generated.


vim +1401 drivers/scsi/qedi/qedi_iscsi.c

  1386	
  1387	static void qedi_cleanup_task(struct iscsi_task *task)
  1388	{
  1389		struct qedi_cmd *cmd;
  1390	
  1391		if (task->state == ISCSI_TASK_PENDING) {
  1392			QEDI_INFO(NULL, QEDI_LOG_IO, "Returning ref_cnt=%d\n",
  1393				  refcount_read(&task->refcount));
  1394			return;
  1395		}
  1396	
  1397		if (task->sc)
  1398			qedi_iscsi_unmap_sg_list(task->dd_data);
  1399	
  1400		cmd = task->dd_data;
> 1401		if (cmd->task_id != -1)
  1402			qedi_clear_task_idx(iscsi_host_priv(task->conn->session->host),
  1403					    cmd->task_id);
  1404	
  1405		cmd->task_id = -1;
  1406		cmd->scsi_cmd = NULL;
  1407	}
  1408	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 36370 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 13/13] scsi: qedi: always wake up if cmd_cleanup_req is set
  2021-04-10 18:40 [RFC PATCH 0/13]: qedi tmf fixes Mike Christie
@ 2021-04-10 18:40 ` Mike Christie
  0 siblings, 0 replies; 25+ messages in thread
From: Mike Christie @ 2021-04-10 18:40 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we got a response then we should always wake up the conn. For both
the cmd_cleanup_req == 0 or cmd_cleanup_req > 0, we shouldn't dig into
iscsi_itt_to_task because we don't know what the upper layers are doing.

We can also remove the qedi_clear_task_idx call here because once we
signal success libiscsi will loop over the affected commands and end
up calling the cleanup_task callout which will release it.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/qedi/qedi_fw.c | 31 ++++---------------------------
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 13dd06915d74..13d1250951a6 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -739,7 +739,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 {
 	struct qedi_work_map *work, *work_tmp;
 	u32 proto_itt = cqe->itid;
-	u32 ptmp_itt = 0;
 	itt_t protoitt = 0;
 	int found = 0;
 	struct qedi_cmd *qedi_cmd = NULL;
@@ -823,37 +822,15 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 
 check_cleanup_reqs:
 	if (qedi_conn->cmd_cleanup_req > 0) {
-		spin_lock_bh(&conn->session->back_lock);
-		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
-		protoitt = build_itt(ptmp_itt, conn->session->age);
-		task = iscsi_itt_to_task(conn, protoitt);
-		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-			  "cleanup io itid=0x%x, protoitt=0x%x, cmd_cleanup_cmpl=%d, cid=0x%x\n",
-			  cqe->itid, protoitt, qedi_conn->cmd_cleanup_cmpl,
-			  qedi_conn->iscsi_conn_id);
-
-		spin_unlock_bh(&conn->session->back_lock);
-		if (!task) {
-			QEDI_NOTICE(&qedi->dbg_ctx,
-				    "task is null, itid=0x%x, cid=0x%x\n",
-				    cqe->itid, qedi_conn->iscsi_conn_id);
-			return;
-		}
-		qedi_conn->cmd_cleanup_cmpl++;
-		wake_up(&qedi_conn->wait_queue);
-
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
-		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
-
+		qedi_conn->cmd_cleanup_cmpl++;
+		wake_up(&qedi_conn->wait_queue);
 	} else {
-		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
-		protoitt = build_itt(ptmp_itt, conn->session->age);
-		task = iscsi_itt_to_task(conn, protoitt);
 		QEDI_ERR(&qedi->dbg_ctx,
-			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x, task=%p\n",
-			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id, task);
+			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x\n",
+			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id);
 	}
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2021-04-15 14:21 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 23:06 [PATCH 00/13] scsi: libicsi and qedi tmf fixes Mike Christie
2021-04-13 23:06 ` [PATCH 01/13] scsi: iscsi: add task completion helper Mike Christie
2021-04-13 23:06 ` [PATCH 02/13] scsi: iscsi: fix lun/target reset cmd cleanup handling Mike Christie
2021-04-14  4:17   ` kernel test robot
2021-04-14  4:17     ` kernel test robot
2021-04-14  4:17   ` [PATCH] scsi: iscsi: fix boolreturn.cocci warnings kernel test robot
2021-04-14  4:17     ` kernel test robot
2021-04-13 23:06 ` [PATCH 03/13] scsi: iscsi: prevent cmds from queueing to driver during ep_disconnect Mike Christie
2021-04-13 23:06 ` [PATCH 04/13] scsi: qedi: fix null ref during abort handling Mike Christie
2021-04-13 23:06 ` [PATCH 05/13] scsi: qedi: fix abort vs cmd re-use race Mike Christie
2021-04-14 12:19   ` kernel test robot
2021-04-14 12:19     ` kernel test robot
2021-04-14 14:58     ` Mike Christie
2021-04-14 14:58       ` Mike Christie
2021-04-15 14:21   ` kernel test robot
2021-04-15 14:21     ` kernel test robot
2021-04-13 23:06 ` [PATCH 06/13] scsi: qedi: fix use after free during abort cleanup Mike Christie
2021-04-13 23:06 ` [PATCH 07/13] scsi: qedi: fix tmf tid allocation Mike Christie
2021-04-13 23:06 ` [PATCH 08/13] scsi: qedi: use GFP_NOIO for tmf allocation Mike Christie
2021-04-13 23:06 ` [PATCH 09/13] scsi: qedi: fix session block/unblock abuse during tmf handling Mike Christie
2021-04-13 23:06 ` [PATCH 10/13] scsi: qedi: fix session block/unblock abuse during cleanup Mike Christie
2021-04-13 23:06 ` [PATCH 11/13] scsi: qedi: pass send_iscsi_tmf task to abort Mike Christie
2021-04-13 23:06 ` [PATCH 12/13] scsi: qedi: make sure tmf works are done before disconnect Mike Christie
2021-04-13 23:06 ` [PATCH 13/13] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie
  -- strict thread matches above, loose matches on Subject: below --
2021-04-10 18:40 [RFC PATCH 0/13]: qedi tmf fixes Mike Christie
2021-04-10 18:40 ` [PATCH 13/13] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.