All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/17] libicsi and qedi TMF fixes
@ 2021-04-16  2:04 Mike Christie
  2021-04-16  2:04 ` [PATCH v3 01/17] scsi: iscsi: add task completion helper Mike Christie
                   ` (16 more replies)
  0 siblings, 17 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

The following patches were made over Martin's 5.12 scsi-fixes branch
which has this fix: 73603e995a37 ("scsi: iscsi: Fix iSCSI cls conn state")
that has conflicts with a patch in this set.

The patches fix TMF bugs in the qedi driver, libiscsi EH issues that are
common to all offload drivers like qedi and some fixes for cases where
userspace is not doing an unbind target nl cmd and we are doing TMFs
during session termination.

V3:
- Fix u16 initialization and test.
- Fix bool return value use.
- Added patches for cases where EH is running then userspace terminates
the connection without removing the target first.
- Made patch that stops IO during ep_disconnect more driver friendly
by handling if the ep is bound or not.

V2:
- Dropped patch that reverted the in-kernel conn failure handling. I
posted a full patchset to fix that separately.
- Modfied the tmf vs cmd queueing patch. The RFC patch only supported
qedi and offload drivers. iscsi_tcp/cxgbgi do not need it.
- Added patch for another issue found where cmds can still be queued to the
driver while it does ep_disconnect.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v3 01/17] scsi: iscsi: add task completion helper
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16 21:56   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup Mike Christie
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

This adds a helper to detect if a cmd has completed but not yet freed.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 include/scsi/libiscsi.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 02f966e9358f..8c6d358a8abc 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -145,6 +145,13 @@ static inline void* iscsi_next_hdr(struct iscsi_task *task)
 	return (void*)task->hdr + task->hdr_len;
 }
 
+static inline bool iscsi_task_is_completed(struct iscsi_task *task)
+{
+	return task->state == ISCSI_TASK_COMPLETED ||
+	       task->state == ISCSI_TASK_ABRT_TMF ||
+	       task->state == ISCSI_TASK_ABRT_SESS_RECOV;
+}
+
 /* Connection's states */
 enum {
 	ISCSI_CONN_INITIAL_STAGE,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
  2021-04-16  2:04 ` [PATCH v3 01/17] scsi: iscsi: add task completion helper Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-17 17:22   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 03/17] scsi: iscsi: stop queueing during ep_disconnect Mike Christie
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we are handling a sg io reset there is a small window where cmds can
sneak into iscsi_queuecommand even though libiscsi has sent a TMF to the
driver. This does seems to not be a problem for normal drivers because they
know exactly what was sent to the FW. For libiscsi this is a problem
because it knows what it has sent to the driver but not what the driver
sent to the FW. When we go to cleanup the running tasks, libiscsi might
cleanup the sneaky cmd but the driver/FQ may not have seen the sneaky cmd
and it's left running in FW.

This has libiscsi just stop queueing cmds when it sends the TMF down to the
driver. Both then know they see the same set of cmds.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 104 +++++++++++++++++++++++++++-------------
 1 file changed, 72 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 4834219497ee..aa5ceaffc697 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1669,29 +1669,11 @@ enum {
 	FAILURE_SESSION_NOT_READY,
 };
 
-int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
+static bool iscsi_eh_running(struct iscsi_conn *conn, struct scsi_cmnd *sc,
+			     int *reason)
 {
-	struct iscsi_cls_session *cls_session;
-	struct iscsi_host *ihost;
-	int reason = 0;
-	struct iscsi_session *session;
-	struct iscsi_conn *conn;
-	struct iscsi_task *task = NULL;
-
-	sc->result = 0;
-	sc->SCp.ptr = NULL;
-
-	ihost = shost_priv(host);
-
-	cls_session = starget_to_session(scsi_target(sc->device));
-	session = cls_session->dd_data;
-	spin_lock_bh(&session->frwd_lock);
-
-	reason = iscsi_session_chkready(cls_session);
-	if (reason) {
-		sc->result = reason;
-		goto fault;
-	}
+	struct iscsi_session *session = conn->session;
+	struct iscsi_tm *tmf;
 
 	if (session->state != ISCSI_STATE_LOGGED_IN) {
 		/*
@@ -1707,31 +1689,92 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
 			 * state is bad, allowing completion to happen
 			 */
 			if (unlikely(system_state != SYSTEM_RUNNING)) {
-				reason = FAILURE_SESSION_FAILED;
+				*reason = FAILURE_SESSION_FAILED;
 				sc->result = DID_NO_CONNECT << 16;
 				break;
 			}
 			fallthrough;
 		case ISCSI_STATE_IN_RECOVERY:
-			reason = FAILURE_SESSION_IN_RECOVERY;
+			*reason = FAILURE_SESSION_IN_RECOVERY;
 			sc->result = DID_IMM_RETRY << 16;
 			break;
 		case ISCSI_STATE_LOGGING_OUT:
-			reason = FAILURE_SESSION_LOGGING_OUT;
+			*reason = FAILURE_SESSION_LOGGING_OUT;
 			sc->result = DID_IMM_RETRY << 16;
 			break;
 		case ISCSI_STATE_RECOVERY_FAILED:
-			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
+			*reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
 			sc->result = DID_TRANSPORT_FAILFAST << 16;
 			break;
 		case ISCSI_STATE_TERMINATE:
-			reason = FAILURE_SESSION_TERMINATE;
+			*reason = FAILURE_SESSION_TERMINATE;
 			sc->result = DID_NO_CONNECT << 16;
 			break;
 		default:
-			reason = FAILURE_SESSION_FREED;
+			*reason = FAILURE_SESSION_FREED;
 			sc->result = DID_NO_CONNECT << 16;
 		}
+		goto eh_running;
+	}
+
+	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
+		*reason = FAILURE_SESSION_IN_RECOVERY;
+		sc->result = DID_REQUEUE << 16;
+		goto eh_running;
+	}
+
+	/*
+	 * For sg resets make sure libiscsi, the drivers and their fw see the
+	 * same cmds. Once we get a TMF that can affect multiple cmds stop
+	 * queueing.
+	 */
+	if (conn->tmf_state != TMF_INITIAL) {
+		tmf = &conn->tmhdr;
+
+		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
+		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
+			if (sc->device->lun != scsilun_to_int(&tmf->lun))
+				break;
+
+			ISCSI_DBG_EH(conn->session,
+				     "Requeue cmd sent during LU RESET processing.\n");
+			sc->result = DID_REQUEUE << 16;
+			goto eh_running;
+		case ISCSI_TM_FUNC_TARGET_WARM_RESET:
+			ISCSI_DBG_EH(conn->session,
+				     "Requeue cmd sent during TARGET RESET processing.\n");
+			sc->result = DID_REQUEUE << 16;
+			goto eh_running;
+		}
+	}
+
+	return false;
+
+eh_running:
+	return true;
+}
+
+int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
+{
+	struct iscsi_cls_session *cls_session;
+	struct iscsi_host *ihost;
+	int reason = 0;
+	struct iscsi_session *session;
+	struct iscsi_conn *conn;
+	struct iscsi_task *task = NULL;
+
+	sc->result = 0;
+	sc->SCp.ptr = NULL;
+
+	ihost = shost_priv(host);
+
+	cls_session = starget_to_session(scsi_target(sc->device));
+	session = cls_session->dd_data;
+	spin_lock_bh(&session->frwd_lock);
+
+	reason = iscsi_session_chkready(cls_session);
+	if (reason) {
+		sc->result = reason;
 		goto fault;
 	}
 
@@ -1742,11 +1785,8 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
 		goto fault;
 	}
 
-	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
-		reason = FAILURE_SESSION_IN_RECOVERY;
-		sc->result = DID_REQUEUE << 16;
+	if (iscsi_eh_running(conn, sc, &reason))
 		goto fault;
-	}
 
 	if (iscsi_check_cmdsn_window_closed(conn)) {
 		reason = FAILURE_WINDOW_CLOSED;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 03/17] scsi: iscsi: stop queueing during ep_disconnect
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
  2021-04-16  2:04 ` [PATCH v3 01/17] scsi: iscsi: add task completion helper Mike Christie
  2021-04-16  2:04 ` [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-20 14:28   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 04/17] scsi: iscsi: drop suspend calls from ep_disconnect Mike Christie
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

During ep_disconnect we have been doing iscsi_suspend_tx/queue to block
new IO but every driver except cxgbi and iscsi_tcp can still get IO from
__iscsi_conn_send_pdu if we haven't called iscsi_conn_failure before
ep_disconnect. This could happen if we were terminating the session, and
the logout timedout before it was even sent to libiscsi.

This patch fixes the issue by adding a helper which reverses the bind_conn
call that allows new IO to be queued. Drivers implementing ep_disconnect
can use this to make sure new IO is not queued to them when handling the
disconnect.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c |  1 +
 drivers/scsi/be2iscsi/be_main.c          |  1 +
 drivers/scsi/bnx2i/bnx2i_iscsi.c         |  1 +
 drivers/scsi/cxgbi/cxgb3i/cxgb3i.c       |  1 +
 drivers/scsi/cxgbi/cxgb4i/cxgb4i.c       |  1 +
 drivers/scsi/libiscsi.c                  | 61 +++++++++++++++++++++---
 drivers/scsi/qedi/qedi_iscsi.c           |  1 +
 drivers/scsi/qla4xxx/ql4_os.c            |  1 +
 drivers/scsi/scsi_transport_iscsi.c      |  3 ++
 include/scsi/libiscsi.h                  |  1 +
 include/scsi/scsi_transport_iscsi.h      |  1 +
 11 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 8fcaa1136f2c..6baebcb6d14d 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -1002,6 +1002,7 @@ static struct iscsi_transport iscsi_iser_transport = {
 	/* connection management */
 	.create_conn            = iscsi_iser_conn_create,
 	.bind_conn              = iscsi_iser_conn_bind,
+	.unbind_conn		= iscsi_conn_unbind,
 	.destroy_conn           = iscsi_conn_teardown,
 	.attr_is_visible	= iser_attr_is_visible,
 	.set_param              = iscsi_iser_set_param,
diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index 90fcddb76f46..e9658a67d9da 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -5809,6 +5809,7 @@ struct iscsi_transport beiscsi_iscsi_transport = {
 	.destroy_session = beiscsi_session_destroy,
 	.create_conn = beiscsi_conn_create,
 	.bind_conn = beiscsi_conn_bind,
+	.unbind_conn = iscsi_conn_unbind,
 	.destroy_conn = iscsi_conn_teardown,
 	.attr_is_visible = beiscsi_attr_is_visible,
 	.set_iface_param = beiscsi_iface_set_param,
diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index 1e6d8f62ea3c..b6c1da46d582 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -2276,6 +2276,7 @@ struct iscsi_transport bnx2i_iscsi_transport = {
 	.destroy_session	= bnx2i_session_destroy,
 	.create_conn		= bnx2i_conn_create,
 	.bind_conn		= bnx2i_conn_bind,
+	.unbind_conn		= iscsi_conn_unbind,
 	.destroy_conn		= bnx2i_conn_destroy,
 	.attr_is_visible	= bnx2i_attr_is_visible,
 	.set_param		= iscsi_set_param,
diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
index 37d99357120f..edcd3fab6973 100644
--- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
+++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
@@ -117,6 +117,7 @@ static struct iscsi_transport cxgb3i_iscsi_transport = {
 	/* connection management */
 	.create_conn	= cxgbi_create_conn,
 	.bind_conn	= cxgbi_bind_conn,
+	.unbind_conn	= iscsi_conn_unbind,
 	.destroy_conn	= iscsi_tcp_conn_teardown,
 	.start_conn	= iscsi_conn_start,
 	.stop_conn	= iscsi_conn_stop,
diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
index 2c3491528d42..efb3e2b3398e 100644
--- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
+++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
@@ -134,6 +134,7 @@ static struct iscsi_transport cxgb4i_iscsi_transport = {
 	/* connection management */
 	.create_conn	= cxgbi_create_conn,
 	.bind_conn		= cxgbi_bind_conn,
+	.unbind_conn	= iscsi_conn_unbind,
 	.destroy_conn	= iscsi_tcp_conn_teardown,
 	.start_conn		= iscsi_conn_start,
 	.stop_conn		= iscsi_conn_stop,
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index aa5ceaffc697..ce3898fdb10f 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1387,22 +1387,28 @@ void iscsi_session_failure(struct iscsi_session *session,
 }
 EXPORT_SYMBOL_GPL(iscsi_session_failure);
 
-void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
+static void iscsi_set_conn_failed(struct iscsi_conn *conn)
 {
 	struct iscsi_session *session = conn->session;
 
-	spin_lock_bh(&session->frwd_lock);
-	if (session->state == ISCSI_STATE_FAILED) {
-		spin_unlock_bh(&session->frwd_lock);
+	if (session->state == ISCSI_STATE_FAILED)
 		return;
-	}
 
 	if (conn->stop_stage == 0)
 		session->state = ISCSI_STATE_FAILED;
-	spin_unlock_bh(&session->frwd_lock);
 
 	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx);
 	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx);
+}
+
+void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
+{
+	struct iscsi_session *session = conn->session;
+
+	spin_lock_bh(&session->frwd_lock);
+	iscsi_set_conn_failed(conn);
+	spin_unlock_bh(&session->frwd_lock);
+
 	iscsi_conn_error_event(conn->cls_conn, err);
 }
 EXPORT_SYMBOL_GPL(iscsi_conn_failure);
@@ -2220,6 +2226,49 @@ static void iscsi_check_transport_timeouts(struct timer_list *t)
 	spin_unlock(&session->frwd_lock);
 }
 
+/*
+ * iscsi_conn_unbind - prevent queueing to conn.
+ * @conn: iscsi conn ep is bound to.
+ *
+ * This must be called by drivers implementing the ep_disconnect callout.
+ * It disables queueing to the connection from libiscsi in preparation for
+ * an ep_disconnect call.
+ */
+void iscsi_conn_unbind(struct iscsi_cls_conn *cls_conn)
+{
+	struct iscsi_session *session;
+	struct iscsi_conn *conn;
+
+	if (!cls_conn)
+		return;
+
+	conn = cls_conn->dd_data;
+	session = conn->session;
+	/*
+	 * Wait for iscsi_eh calls to exit. We don't wait for the tmf to
+	 * complete or timeout. The caller just wants to know what's running
+	 * is everything that needs to be cleaned up, and no cmds will be
+	 * queued.
+	 */
+	mutex_lock(&session->eh_mutex);
+
+	iscsi_suspend_queue(conn);
+	iscsi_suspend_tx(conn);
+
+	spin_lock_bh(&session->frwd_lock);
+	/*
+	 * if logout timed out before userspace could even send a PDU the
+	 * state might still be in ISCSI_STATE_LOGGED_IN and allowing new cmds
+	 * and TMFs.
+	 */
+	if (session->state == ISCSI_STATE_LOGGED_IN)
+		iscsi_set_conn_failed(conn);
+
+	spin_unlock_bh(&session->frwd_lock);
+	mutex_unlock(&session->eh_mutex);
+}
+EXPORT_SYMBOL_GPL(iscsi_conn_unbind);
+
 static void iscsi_prep_abort_task_pdu(struct iscsi_task *task,
 				      struct iscsi_tm *hdr)
 {
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 08c05403cd72..ef16537c523c 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -1401,6 +1401,7 @@ struct iscsi_transport qedi_iscsi_transport = {
 	.destroy_session = qedi_session_destroy,
 	.create_conn = qedi_conn_create,
 	.bind_conn = qedi_conn_bind,
+	.unbind_conn = iscsi_conn_unbind,
 	.start_conn = qedi_conn_start,
 	.stop_conn = iscsi_conn_stop,
 	.destroy_conn = qedi_conn_destroy,
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 7bd9a4a04ad5..ff663cb330c2 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -259,6 +259,7 @@ static struct iscsi_transport qla4xxx_iscsi_transport = {
 	.start_conn             = qla4xxx_conn_start,
 	.create_conn            = qla4xxx_conn_create,
 	.bind_conn              = qla4xxx_conn_bind,
+	.unbind_conn		= iscsi_conn_unbind,
 	.stop_conn              = iscsi_conn_stop,
 	.destroy_conn           = qla4xxx_conn_destroy,
 	.set_param              = iscsi_set_param,
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 441f0152193f..833114c8e197 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2981,6 +2981,8 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport *transport,
 		conn->ep = NULL;
 		mutex_unlock(&conn->ep_mutex);
 		conn->state = ISCSI_CONN_FAILED;
+
+		transport->unbind_conn(conn);
 	}
 
 	transport->ep_disconnect(ep);
@@ -4656,6 +4658,7 @@ iscsi_register_transport(struct iscsi_transport *tt)
 	int err;
 
 	BUG_ON(!tt);
+	WARN_ON(tt->ep_disconnect && !tt->unbind_conn);
 
 	priv = iscsi_if_transport_lookup(tt);
 	if (priv)
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 8c6d358a8abc..ec6d508e7a4a 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -431,6 +431,7 @@ extern int iscsi_conn_start(struct iscsi_cls_conn *);
 extern void iscsi_conn_stop(struct iscsi_cls_conn *, int);
 extern int iscsi_conn_bind(struct iscsi_cls_session *, struct iscsi_cls_conn *,
 			   int);
+extern void iscsi_conn_unbind(struct iscsi_cls_conn *cls_conn);
 extern void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err);
 extern void iscsi_session_failure(struct iscsi_session *session,
 				  enum iscsi_err err);
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index fc5a39839b4b..afc61a23628d 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -82,6 +82,7 @@ struct iscsi_transport {
 	void (*destroy_session) (struct iscsi_cls_session *session);
 	struct iscsi_cls_conn *(*create_conn) (struct iscsi_cls_session *sess,
 				uint32_t cid);
+	void (*unbind_conn) (struct iscsi_cls_conn *conn);
 	int (*bind_conn) (struct iscsi_cls_session *session,
 			  struct iscsi_cls_conn *cls_conn,
 			  uint64_t transport_eph, int is_leading);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 04/17] scsi: iscsi: drop suspend calls from ep_disconnect
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (2 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 03/17] scsi: iscsi: stop queueing during ep_disconnect Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-20 14:29   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn Mike Christie
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

libiscsi will now suspend the send/tx queue for the drivers so we can drop
it from the drivers ep_disconnect.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/be2iscsi/be_iscsi.c | 6 ------
 drivers/scsi/bnx2i/bnx2i_iscsi.c | 6 +-----
 drivers/scsi/cxgbi/libcxgbi.c    | 1 -
 drivers/scsi/qedi/qedi_iscsi.c   | 8 ++++----
 4 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/be2iscsi/be_iscsi.c b/drivers/scsi/be2iscsi/be_iscsi.c
index a13c203ef7a9..a03d0ebc2312 100644
--- a/drivers/scsi/be2iscsi/be_iscsi.c
+++ b/drivers/scsi/be2iscsi/be_iscsi.c
@@ -1293,7 +1293,6 @@ static int beiscsi_conn_close(struct beiscsi_endpoint *beiscsi_ep)
 void beiscsi_ep_disconnect(struct iscsi_endpoint *ep)
 {
 	struct beiscsi_endpoint *beiscsi_ep;
-	struct beiscsi_conn *beiscsi_conn;
 	struct beiscsi_hba *phba;
 	uint16_t cri_index;
 
@@ -1312,11 +1311,6 @@ void beiscsi_ep_disconnect(struct iscsi_endpoint *ep)
 		return;
 	}
 
-	if (beiscsi_ep->conn) {
-		beiscsi_conn = beiscsi_ep->conn;
-		iscsi_suspend_queue(beiscsi_conn->conn);
-	}
-
 	if (!beiscsi_hba_is_online(phba)) {
 		beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_CONFIG,
 			    "BS_%d : HBA in error 0x%lx\n", phba->state);
diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index b6c1da46d582..9a4f4776a78a 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -2113,7 +2113,6 @@ static void bnx2i_ep_disconnect(struct iscsi_endpoint *ep)
 {
 	struct bnx2i_endpoint *bnx2i_ep;
 	struct bnx2i_conn *bnx2i_conn = NULL;
-	struct iscsi_conn *conn = NULL;
 	struct bnx2i_hba *hba;
 
 	bnx2i_ep = ep->dd_data;
@@ -2126,11 +2125,8 @@ static void bnx2i_ep_disconnect(struct iscsi_endpoint *ep)
 		!time_after(jiffies, bnx2i_ep->timestamp + (12 * HZ)))
 		msleep(250);
 
-	if (bnx2i_ep->conn) {
+	if (bnx2i_ep->conn)
 		bnx2i_conn = bnx2i_ep->conn;
-		conn = bnx2i_conn->cls_conn->dd_data;
-		iscsi_suspend_queue(conn);
-	}
 	hba = bnx2i_ep->hba;
 
 	mutex_lock(&hba->net_dev_lock);
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index f078b3c4e083..215dd0eb3f48 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -2968,7 +2968,6 @@ void cxgbi_ep_disconnect(struct iscsi_endpoint *ep)
 		ep, cep, cconn, csk, csk->state, csk->flags);
 
 	if (cconn && cconn->iconn) {
-		iscsi_suspend_tx(cconn->iconn);
 		write_lock_bh(&csk->callback_lock);
 		cep->csk->user_data = NULL;
 		cconn->cep = NULL;
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index ef16537c523c..30dc345b011c 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -988,7 +988,6 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 {
 	struct qedi_endpoint *qedi_ep;
 	struct qedi_conn *qedi_conn = NULL;
-	struct iscsi_conn *conn = NULL;
 	struct qedi_ctx *qedi;
 	int ret = 0;
 	int wait_delay;
@@ -1007,8 +1006,6 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 
 	if (qedi_ep->conn) {
 		qedi_conn = qedi_ep->conn;
-		conn = qedi_conn->cls_conn->dd_data;
-		iscsi_suspend_queue(conn);
 		abrt_conn = qedi_conn->abrt_conn;
 
 		while (count--)	{
@@ -1621,8 +1618,11 @@ void qedi_clear_session_ctx(struct iscsi_cls_session *cls_sess)
 	struct iscsi_conn *conn = session->leadconn;
 	struct qedi_conn *qedi_conn = conn->dd_data;
 
-	if (iscsi_is_session_online(cls_sess))
+	if (iscsi_is_session_online(cls_sess)) {
+		if (conn)
+			iscsi_suspend_queue(conn);
 		qedi_ep_disconnect(qedi_conn->iscsi_ep);
+	}
 
 	qedi_conn_destroy(qedi_conn->cls_conn);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (3 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 04/17] scsi: iscsi: drop suspend calls from ep_disconnect Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-22 15:02   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 06/17] scsi: iscsi: fix use conn use after free Mike Christie
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we haven't done an unbind target call, we can race during conn
destruction where iscsi_conn_teardown wakes up the eh/abort thread and its
still accessing a task while iscsi_conn_teardown is freeing the conn. This
patch has us wait for all threads to drop their refs to outstanding tasks
during conn destruction.

There is also an issue where we could be accessing the conn directly via
fields like conn->ehwait in the eh callbacks. The next patch will fix
those.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index ce3898fdb10f..ce6d04035c64 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -3120,6 +3120,24 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
 }
 EXPORT_SYMBOL_GPL(iscsi_conn_setup);
 
+static bool iscsi_session_has_tasks(struct iscsi_session *session)
+{
+	struct iscsi_task *task;
+	int i;
+
+	spin_lock_bh(&session->back_lock);
+	for (i = 0; i < session->cmds_max; i++) {
+		task = session->cmds[i];
+
+		if (task->sc) {
+			spin_unlock_bh(&session->back_lock);
+			return true;
+		}
+	}
+	spin_unlock_bh(&session->back_lock);
+	return false;
+}
+
 /**
  * iscsi_conn_teardown - teardown iscsi connection
  * @cls_conn: iscsi class connection
@@ -3144,7 +3162,17 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
 		session->state = ISCSI_STATE_TERMINATE;
 		wake_up(&conn->ehwait);
 	}
+
 	spin_unlock_bh(&session->frwd_lock);
+	mutex_unlock(&session->eh_mutex);
+	/*
+	 * If the caller didn't do a target unbind we could be exiting a
+	 * scsi-ml entry point that had a task ref. Wait on them here.
+	 */
+	while (iscsi_session_has_tasks(session))
+		msleep(50);
+
+	mutex_lock(&session->eh_mutex);
 
 	/* flush queued up work because we free the connection below */
 	iscsi_suspend_tx(conn);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 06/17] scsi: iscsi: fix use conn use after free
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (4 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-24 21:11   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 07/17] scsi: iscsi: move pool freeing Mike Christie
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we haven't done a unbind target call we can race where
iscsi_conn_teardown wakes up the eh/abort thread and then frees the conn
while those threads are still accessing the conn ehwait.

We can only do one TMF per session so this just moves the TMF fields from
the conn to the session. We can then rely on the
iscsi_session_teardown->iscsi_remove_session->__iscsi_unbind_session call
to remove the target and it's devices, and know after that point there is
no device or scsi-ml callout trying to access the session.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 123 +++++++++++++++++++---------------------
 include/scsi/libiscsi.h |  11 ++--
 2 files changed, 64 insertions(+), 70 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index ce6d04035c64..56b41d8fff02 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -230,11 +230,11 @@ static int iscsi_prep_ecdb_ahs(struct iscsi_task *task)
  */
 static int iscsi_check_tmf_restrictions(struct iscsi_task *task, int opcode)
 {
-	struct iscsi_conn *conn = task->conn;
-	struct iscsi_tm *tmf = &conn->tmhdr;
+	struct iscsi_session *session = task->conn->session;
+	struct iscsi_tm *tmf = &session->tmhdr;
 	u64 hdr_lun;
 
-	if (conn->tmf_state == TMF_INITIAL)
+	if (session->tmf_state == TMF_INITIAL)
 		return 0;
 
 	if ((tmf->opcode & ISCSI_OPCODE_MASK) != ISCSI_OP_SCSI_TMFUNC)
@@ -254,24 +254,19 @@ static int iscsi_check_tmf_restrictions(struct iscsi_task *task, int opcode)
 		 * Fail all SCSI cmd PDUs
 		 */
 		if (opcode != ISCSI_OP_SCSI_DATA_OUT) {
-			iscsi_conn_printk(KERN_INFO, conn,
-					  "task [op %x itt "
-					  "0x%x/0x%x] "
-					  "rejected.\n",
-					  opcode, task->itt,
-					  task->hdr_itt);
+			iscsi_session_printk(KERN_INFO, session,
+					"task [op %x itt 0x%x/0x%x] rejected.\n",
+					opcode, task->itt, task->hdr_itt);
 			return -EACCES;
 		}
 		/*
 		 * And also all data-out PDUs in response to R2T
 		 * if fast_abort is set.
 		 */
-		if (conn->session->fast_abort) {
-			iscsi_conn_printk(KERN_INFO, conn,
-					  "task [op %x itt "
-					  "0x%x/0x%x] fast abort.\n",
-					  opcode, task->itt,
-					  task->hdr_itt);
+		if (session->fast_abort) {
+			iscsi_session_printk(KERN_INFO, session,
+					"task [op %x itt 0x%x/0x%x] fast abort.\n",
+					opcode, task->itt, task->hdr_itt);
 			return -EACCES;
 		}
 		break;
@@ -284,7 +279,7 @@ static int iscsi_check_tmf_restrictions(struct iscsi_task *task, int opcode)
 		 */
 		if (opcode == ISCSI_OP_SCSI_DATA_OUT &&
 		    task->hdr_itt == tmf->rtt) {
-			ISCSI_DBG_SESSION(conn->session,
+			ISCSI_DBG_SESSION(session,
 					  "Preventing task %x/%x from sending "
 					  "data-out due to abort task in "
 					  "progress\n", task->itt,
@@ -936,20 +931,21 @@ iscsi_data_in_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
 static void iscsi_tmf_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
 {
 	struct iscsi_tm_rsp *tmf = (struct iscsi_tm_rsp *)hdr;
+	struct iscsi_session *session = conn->session;
 
 	conn->exp_statsn = be32_to_cpu(hdr->statsn) + 1;
 	conn->tmfrsp_pdus_cnt++;
 
-	if (conn->tmf_state != TMF_QUEUED)
+	if (session->tmf_state != TMF_QUEUED)
 		return;
 
 	if (tmf->response == ISCSI_TMF_RSP_COMPLETE)
-		conn->tmf_state = TMF_SUCCESS;
+		session->tmf_state = TMF_SUCCESS;
 	else if (tmf->response == ISCSI_TMF_RSP_NO_TASK)
-		conn->tmf_state = TMF_NOT_FOUND;
+		session->tmf_state = TMF_NOT_FOUND;
 	else
-		conn->tmf_state = TMF_FAILED;
-	wake_up(&conn->ehwait);
+		session->tmf_state = TMF_FAILED;
+	wake_up(&session->ehwait);
 }
 
 static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
@@ -1734,20 +1730,20 @@ static bool iscsi_eh_running(struct iscsi_conn *conn, struct scsi_cmnd *sc,
 	 * same cmds. Once we get a TMF that can affect multiple cmds stop
 	 * queueing.
 	 */
-	if (conn->tmf_state != TMF_INITIAL) {
-		tmf = &conn->tmhdr;
+	if (session->tmf_state != TMF_INITIAL) {
+		tmf = &session->tmhdr;
 
 		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
 		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
 			if (sc->device->lun != scsilun_to_int(&tmf->lun))
 				break;
 
-			ISCSI_DBG_EH(conn->session,
+			ISCSI_DBG_EH(session,
 				     "Requeue cmd sent during LU RESET processing.\n");
 			sc->result = DID_REQUEUE << 16;
 			goto eh_running;
 		case ISCSI_TM_FUNC_TARGET_WARM_RESET:
-			ISCSI_DBG_EH(conn->session,
+			ISCSI_DBG_EH(session,
 				     "Requeue cmd sent during TARGET RESET processing.\n");
 			sc->result = DID_REQUEUE << 16;
 			goto eh_running;
@@ -1866,15 +1862,14 @@ EXPORT_SYMBOL_GPL(iscsi_target_alloc);
 
 static void iscsi_tmf_timedout(struct timer_list *t)
 {
-	struct iscsi_conn *conn = from_timer(conn, t, tmf_timer);
-	struct iscsi_session *session = conn->session;
+	struct iscsi_session *session = from_timer(session, t, tmf_timer);
 
 	spin_lock(&session->frwd_lock);
-	if (conn->tmf_state == TMF_QUEUED) {
-		conn->tmf_state = TMF_TIMEDOUT;
+	if (session->tmf_state == TMF_QUEUED) {
+		session->tmf_state = TMF_TIMEDOUT;
 		ISCSI_DBG_EH(session, "tmf timedout\n");
 		/* unblock eh_abort() */
-		wake_up(&conn->ehwait);
+		wake_up(&session->ehwait);
 	}
 	spin_unlock(&session->frwd_lock);
 }
@@ -1897,8 +1892,8 @@ static int iscsi_exec_task_mgmt_fn(struct iscsi_conn *conn,
 		return -EPERM;
 	}
 	conn->tmfcmd_pdus_cnt++;
-	conn->tmf_timer.expires = timeout * HZ + jiffies;
-	add_timer(&conn->tmf_timer);
+	session->tmf_timer.expires = timeout * HZ + jiffies;
+	add_timer(&session->tmf_timer);
 	ISCSI_DBG_EH(session, "tmf set timeout\n");
 
 	spin_unlock_bh(&session->frwd_lock);
@@ -1912,12 +1907,12 @@ static int iscsi_exec_task_mgmt_fn(struct iscsi_conn *conn,
 	 * 3) session is terminated or restarted or userspace has
 	 * given up on recovery
 	 */
-	wait_event_interruptible(conn->ehwait, age != session->age ||
+	wait_event_interruptible(session->ehwait, age != session->age ||
 				 session->state != ISCSI_STATE_LOGGED_IN ||
-				 conn->tmf_state != TMF_QUEUED);
+				 session->tmf_state != TMF_QUEUED);
 	if (signal_pending(current))
 		flush_signals(current);
-	del_timer_sync(&conn->tmf_timer);
+	del_timer_sync(&session->tmf_timer);
 
 	mutex_lock(&session->eh_mutex);
 	spin_lock_bh(&session->frwd_lock);
@@ -2347,17 +2342,17 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
 	}
 
 	/* only have one tmf outstanding at a time */
-	if (conn->tmf_state != TMF_INITIAL)
+	if (session->tmf_state != TMF_INITIAL)
 		goto failed;
-	conn->tmf_state = TMF_QUEUED;
+	session->tmf_state = TMF_QUEUED;
 
-	hdr = &conn->tmhdr;
+	hdr = &session->tmhdr;
 	iscsi_prep_abort_task_pdu(task, hdr);
 
 	if (iscsi_exec_task_mgmt_fn(conn, hdr, age, session->abort_timeout))
 		goto failed;
 
-	switch (conn->tmf_state) {
+	switch (session->tmf_state) {
 	case TMF_SUCCESS:
 		spin_unlock_bh(&session->frwd_lock);
 		/*
@@ -2372,7 +2367,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
 		 */
 		spin_lock_bh(&session->frwd_lock);
 		fail_scsi_task(task, DID_ABORT);
-		conn->tmf_state = TMF_INITIAL;
+		session->tmf_state = TMF_INITIAL;
 		memset(hdr, 0, sizeof(*hdr));
 		spin_unlock_bh(&session->frwd_lock);
 		iscsi_start_tx(conn);
@@ -2383,7 +2378,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
 		goto failed_unlocked;
 	case TMF_NOT_FOUND:
 		if (!sc->SCp.ptr) {
-			conn->tmf_state = TMF_INITIAL;
+			session->tmf_state = TMF_INITIAL;
 			memset(hdr, 0, sizeof(*hdr));
 			/* task completed before tmf abort response */
 			ISCSI_DBG_EH(session, "sc completed while abort	in "
@@ -2392,7 +2387,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
 		}
 		fallthrough;
 	default:
-		conn->tmf_state = TMF_INITIAL;
+		session->tmf_state = TMF_INITIAL;
 		goto failed;
 	}
 
@@ -2451,11 +2446,11 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
 	conn = session->leadconn;
 
 	/* only have one tmf outstanding at a time */
-	if (conn->tmf_state != TMF_INITIAL)
+	if (session->tmf_state != TMF_INITIAL)
 		goto unlock;
-	conn->tmf_state = TMF_QUEUED;
+	session->tmf_state = TMF_QUEUED;
 
-	hdr = &conn->tmhdr;
+	hdr = &session->tmhdr;
 	iscsi_prep_lun_reset_pdu(sc, hdr);
 
 	if (iscsi_exec_task_mgmt_fn(conn, hdr, session->age,
@@ -2464,7 +2459,7 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
 		goto unlock;
 	}
 
-	switch (conn->tmf_state) {
+	switch (session->tmf_state) {
 	case TMF_SUCCESS:
 		break;
 	case TMF_TIMEDOUT:
@@ -2472,7 +2467,7 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
 		iscsi_conn_failure(conn, ISCSI_ERR_SCSI_EH_SESSION_RST);
 		goto done;
 	default:
-		conn->tmf_state = TMF_INITIAL;
+		session->tmf_state = TMF_INITIAL;
 		goto unlock;
 	}
 
@@ -2484,7 +2479,7 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
 	spin_lock_bh(&session->frwd_lock);
 	memset(hdr, 0, sizeof(*hdr));
 	fail_scsi_tasks(conn, sc->device->lun, DID_ERROR);
-	conn->tmf_state = TMF_INITIAL;
+	session->tmf_state = TMF_INITIAL;
 	spin_unlock_bh(&session->frwd_lock);
 
 	iscsi_start_tx(conn);
@@ -2507,8 +2502,7 @@ void iscsi_session_recovery_timedout(struct iscsi_cls_session *cls_session)
 	spin_lock_bh(&session->frwd_lock);
 	if (session->state != ISCSI_STATE_LOGGED_IN) {
 		session->state = ISCSI_STATE_RECOVERY_FAILED;
-		if (session->leadconn)
-			wake_up(&session->leadconn->ehwait);
+		wake_up(&session->ehwait);
 	}
 	spin_unlock_bh(&session->frwd_lock);
 }
@@ -2553,7 +2547,7 @@ int iscsi_eh_session_reset(struct scsi_cmnd *sc)
 	iscsi_conn_failure(conn, ISCSI_ERR_SCSI_EH_SESSION_RST);
 
 	ISCSI_DBG_EH(session, "wait for relogin\n");
-	wait_event_interruptible(conn->ehwait,
+	wait_event_interruptible(session->ehwait,
 				 session->state == ISCSI_STATE_TERMINATE ||
 				 session->state == ISCSI_STATE_LOGGED_IN ||
 				 session->state == ISCSI_STATE_RECOVERY_FAILED);
@@ -2614,11 +2608,11 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
 	conn = session->leadconn;
 
 	/* only have one tmf outstanding at a time */
-	if (conn->tmf_state != TMF_INITIAL)
+	if (session->tmf_state != TMF_INITIAL)
 		goto unlock;
-	conn->tmf_state = TMF_QUEUED;
+	session->tmf_state = TMF_QUEUED;
 
-	hdr = &conn->tmhdr;
+	hdr = &session->tmhdr;
 	iscsi_prep_tgt_reset_pdu(sc, hdr);
 
 	if (iscsi_exec_task_mgmt_fn(conn, hdr, session->age,
@@ -2627,7 +2621,7 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
 		goto unlock;
 	}
 
-	switch (conn->tmf_state) {
+	switch (session->tmf_state) {
 	case TMF_SUCCESS:
 		break;
 	case TMF_TIMEDOUT:
@@ -2635,7 +2629,7 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
 		iscsi_conn_failure(conn, ISCSI_ERR_SCSI_EH_SESSION_RST);
 		goto done;
 	default:
-		conn->tmf_state = TMF_INITIAL;
+		session->tmf_state = TMF_INITIAL;
 		goto unlock;
 	}
 
@@ -2647,7 +2641,7 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
 	spin_lock_bh(&session->frwd_lock);
 	memset(hdr, 0, sizeof(*hdr));
 	fail_scsi_tasks(conn, -1, DID_ERROR);
-	conn->tmf_state = TMF_INITIAL;
+	session->tmf_state = TMF_INITIAL;
 	spin_unlock_bh(&session->frwd_lock);
 
 	iscsi_start_tx(conn);
@@ -2977,7 +2971,10 @@ iscsi_session_setup(struct iscsi_transport *iscsit, struct Scsi_Host *shost,
 	session->tt = iscsit;
 	session->dd_data = cls_session->dd_data + sizeof(*session);
 
+	session->tmf_state = TMF_INITIAL;
+	timer_setup(&session->tmf_timer, iscsi_tmf_timedout, 0);
 	mutex_init(&session->eh_mutex);
+
 	spin_lock_init(&session->frwd_lock);
 	spin_lock_init(&session->back_lock);
 
@@ -3081,7 +3078,6 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
 	conn->c_stage = ISCSI_CONN_INITIAL_STAGE;
 	conn->id = conn_idx;
 	conn->exp_statsn = 0;
-	conn->tmf_state = TMF_INITIAL;
 
 	timer_setup(&conn->transport_timer, iscsi_check_transport_timeouts, 0);
 
@@ -3106,8 +3102,7 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
 		goto login_task_data_alloc_fail;
 	conn->login_task->data = conn->data = data;
 
-	timer_setup(&conn->tmf_timer, iscsi_tmf_timedout, 0);
-	init_waitqueue_head(&conn->ehwait);
+	init_waitqueue_head(&session->ehwait);
 
 	return cls_conn;
 
@@ -3160,7 +3155,7 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
 		 * leading connection? then give up on recovery.
 		 */
 		session->state = ISCSI_STATE_TERMINATE;
-		wake_up(&conn->ehwait);
+		wake_up(&session->ehwait);
 	}
 
 	spin_unlock_bh(&session->frwd_lock);
@@ -3245,7 +3240,7 @@ int iscsi_conn_start(struct iscsi_cls_conn *cls_conn)
 		 * commands after successful recovery
 		 */
 		conn->stop_stage = 0;
-		conn->tmf_state = TMF_INITIAL;
+		session->tmf_state = TMF_INITIAL;
 		session->age++;
 		if (session->age == 16)
 			session->age = 0;
@@ -3259,7 +3254,7 @@ int iscsi_conn_start(struct iscsi_cls_conn *cls_conn)
 	spin_unlock_bh(&session->frwd_lock);
 
 	iscsi_unblock_session(session->cls_session);
-	wake_up(&conn->ehwait);
+	wake_up(&session->ehwait);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(iscsi_conn_start);
@@ -3353,7 +3348,7 @@ void iscsi_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
 	spin_lock_bh(&session->frwd_lock);
 	fail_scsi_tasks(conn, -1, DID_TRANSPORT_DISRUPTED);
 	fail_mgmt_tasks(session, conn);
-	memset(&conn->tmhdr, 0, sizeof(conn->tmhdr));
+	memset(&session->tmhdr, 0, sizeof(session->tmhdr));
 	spin_unlock_bh(&session->frwd_lock);
 	mutex_unlock(&session->eh_mutex);
 }
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index ec6d508e7a4a..545dfefffe9b 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -202,12 +202,6 @@ struct iscsi_conn {
 	unsigned long		suspend_tx;	/* suspend Tx */
 	unsigned long		suspend_rx;	/* suspend Rx */
 
-	/* abort */
-	wait_queue_head_t	ehwait;		/* used in eh_abort() */
-	struct iscsi_tm		tmhdr;
-	struct timer_list	tmf_timer;
-	int			tmf_state;	/* see TMF_INITIAL, etc.*/
-
 	/* negotiated params */
 	unsigned		max_recv_dlength; /* initiator_max_recv_dsl*/
 	unsigned		max_xmit_dlength; /* target_max_recv_dsl */
@@ -277,6 +271,11 @@ struct iscsi_session {
 	 * and recv lock.
 	 */
 	struct mutex		eh_mutex;
+	/* abort */
+	wait_queue_head_t	ehwait;		/* used in eh_abort() */
+	struct iscsi_tm		tmhdr;
+	struct timer_list	tmf_timer;
+	int			tmf_state;	/* see TMF_INITIAL, etc.*/
 
 	/* iSCSI session-wide sequencing */
 	uint32_t		cmdsn;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 07/17] scsi: iscsi: move pool freeing
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (5 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 06/17] scsi: iscsi: fix use conn use after free Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-24 21:12   ` Lee Duncan
  2021-04-16  2:04 ` [PATCH v3 08/17] scsi: qedi: fix null ref during abort handling Mike Christie
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

This doesn't fix any bugs, but it makes more sense to free the pool after
we have removed the session. At that time we know nothing is touching any
of the session fields, because all devices have been removed and scans are
stopped.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 56b41d8fff02..b2970054558a 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -3025,10 +3025,9 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
 	struct module *owner = cls_session->transport->owner;
 	struct Scsi_Host *shost = session->host;
 
-	iscsi_pool_free(&session->cmdpool);
-
 	iscsi_remove_session(cls_session);
 
+	iscsi_pool_free(&session->cmdpool);
 	kfree(session->password);
 	kfree(session->password_in);
 	kfree(session->username);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 08/17] scsi: qedi: fix null ref during abort handling
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (6 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 07/17] scsi: iscsi: move pool freeing Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts Mike Christie
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If qedi_process_cmd_cleanup_resp finds the cmd it frees the work and sets
list_tmf_work to NULL, so qedi_tmf_work should check if list_tmf_work is
non-NULL when it wants to force cleanup.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 440ddd2309f1..cf57b4e49700 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -1453,7 +1453,7 @@ static void qedi_tmf_work(struct work_struct *work)
 
 ldel_exit:
 	spin_lock_bh(&qedi_conn->tmf_work_lock);
-	if (!qedi_cmd->list_tmf_work) {
+	if (qedi_cmd->list_tmf_work) {
 		list_del_init(&list_work->list);
 		qedi_cmd->list_tmf_work = NULL;
 		kfree(list_work);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (7 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 08/17] scsi: qedi: fix null ref during abort handling Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16 11:39     ` kernel test robot
  2021-04-16  2:04 ` [PATCH v3 10/17] scsi: qedi: fix use after free during abort cleanup Mike Christie
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If the SCSI cmd completes after qedi_tmf_work calls iscsi_itt_to_task then
the qedi qedi_cmd->task_id could be freed and used for another cmd. If we
then call qedi_iscsi_cleanup_task with that task_id we will be cleaning up
the wrong cmd.

This patch has us wait to release the task_id until the last put has been
done on the iscsi_task. Because libiscsi grabs a ref to the task when
sending the abort, we know that for the non abort timeout case that the
task_id we are referencing is for the cmd that was supposed to be aborted.

The next patch will fix the case where the abort timesout while we are
running qedi_tmf_work.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 13 -------------
 drivers/scsi/qedi/qedi_iscsi.c | 20 +++++++++++++++++---
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index cf57b4e49700..ad4357e4c15d 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -73,7 +73,6 @@ static void qedi_process_logout_resp(struct qedi_ctx *qedi,
 	spin_unlock(&qedi_conn->list_lock);
 
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr, NULL, 0);
 
 	spin_unlock(&session->back_lock);
@@ -138,7 +137,6 @@ static void qedi_process_text_resp(struct qedi_ctx *qedi,
 	spin_unlock(&qedi_conn->list_lock);
 
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr,
 			     qedi_conn->gen_pdu.resp_buf,
@@ -164,13 +162,11 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 	iscsi_block_session(session->cls_session);
 	rval = qedi_cleanup_all_io(qedi, qedi_conn, qedi_cmd->task, true);
 	if (rval) {
-		qedi_clear_task_idx(qedi, qedi_cmd->task_id);
 		iscsi_unblock_session(session->cls_session);
 		goto exit_tmf_resp;
 	}
 
 	iscsi_unblock_session(session->cls_session);
-	qedi_clear_task_idx(qedi, qedi_cmd->task_id);
 
 	spin_lock(&session->back_lock);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
@@ -245,8 +241,6 @@ static void qedi_process_tmf_resp(struct qedi_ctx *qedi,
 		goto unblock_sess;
 	}
 
-	qedi_clear_task_idx(qedi, qedi_cmd->task_id);
-
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
 	kfree(resp_hdr_ptr);
 
@@ -314,7 +308,6 @@ static void qedi_process_login_resp(struct qedi_ctx *qedi,
 		  "Freeing tid=0x%x for cid=0x%x\n",
 		  cmd->task_id, qedi_conn->iscsi_conn_id);
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 }
 
 static void qedi_get_rq_bdq_buf(struct qedi_ctx *qedi,
@@ -468,7 +461,6 @@ static int qedi_process_nopin_mesg(struct qedi_ctx *qedi,
 		}
 
 		spin_unlock(&qedi_conn->list_lock);
-		qedi_clear_task_idx(qedi, cmd->task_id);
 	}
 
 done:
@@ -673,7 +665,6 @@ static void qedi_scsi_completion(struct qedi_ctx *qedi,
 	if (qedi_io_tracing)
 		qedi_trace_io(qedi, task, cmd->task_id, QEDI_IO_TRACE_RSP);
 
-	qedi_clear_task_idx(qedi, cmd->task_id);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)hdr,
 			     conn->data, datalen);
 error:
@@ -730,7 +721,6 @@ static void qedi_process_nopin_local_cmpl(struct qedi_ctx *qedi,
 		  cqe->itid, cmd->task_id);
 
 	cmd->state = RESPONSE_RECEIVED;
-	qedi_clear_task_idx(qedi, cmd->task_id);
 
 	spin_lock_bh(&session->back_lock);
 	__iscsi_put_task(task);
@@ -821,8 +811,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 			if (qedi_cmd->state == CLEANUP_WAIT_FAILED)
 				qedi_cmd->state = CLEANUP_RECV;
 
-			qedi_clear_task_idx(qedi_conn->qedi, rtid);
-
 			spin_lock(&qedi_conn->list_lock);
 			if (likely(dbg_cmd->io_cmd_in_list)) {
 				dbg_cmd->io_cmd_in_list = false;
@@ -856,7 +844,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
-		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
 
 	} else {
 		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 30dc345b011c..416202bc4241 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -772,7 +772,6 @@ static int qedi_mtask_xmit(struct iscsi_conn *conn, struct iscsi_task *task)
 	}
 
 	cmd->conn = conn->dd_data;
-	cmd->scsi_cmd = NULL;
 	return qedi_iscsi_send_generic_request(task);
 }
 
@@ -783,6 +782,10 @@ static int qedi_task_xmit(struct iscsi_task *task)
 	struct qedi_cmd *cmd = task->dd_data;
 	struct scsi_cmnd *sc = task->sc;
 
+	/* Clear now so in cleanup_task we know it didn't make it */
+	cmd->scsi_cmd = NULL;
+	cmd->task_id = U16_MAX;
+
 	if (test_bit(QEDI_IN_SHUTDOWN, &qedi_conn->qedi->flags))
 		return -ENODEV;
 
@@ -1380,13 +1383,24 @@ static umode_t qedi_attr_is_visible(int param_type, int param)
 
 static void qedi_cleanup_task(struct iscsi_task *task)
 {
-	if (!task->sc || task->state == ISCSI_TASK_PENDING) {
+	struct qedi_cmd *cmd;
+
+	if (task->state == ISCSI_TASK_PENDING) {
 		QEDI_INFO(NULL, QEDI_LOG_IO, "Returning ref_cnt=%d\n",
 			  refcount_read(&task->refcount));
 		return;
 	}
 
-	qedi_iscsi_unmap_sg_list(task->dd_data);
+	if (task->sc)
+		qedi_iscsi_unmap_sg_list(task->dd_data);
+
+	cmd = task->dd_data;
+	if (cmd->task_id != U16_MAX)
+		qedi_clear_task_idx(iscsi_host_priv(task->conn->session->host),
+				    cmd->task_id);
+
+	cmd->task_id = U16_MAX;
+	cmd->scsi_cmd = NULL;
 }
 
 struct iscsi_transport qedi_iscsi_transport = {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 10/17] scsi: qedi: fix use after free during abort cleanup
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (8 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 11/17] scsi: qedi: fix TMF tid allocation Mike Christie
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

This fixes two bugs:

1. The scsi cmd task could be completed and the abort could timeout while
we are running qedi_tmf_work so we need to get a ref to the task.

2. If qedi_tmf_work's qedi_wait_for_cleanup_request call times out we will
also force the clean up of the qedi_work_map but
qedi_process_cmd_cleanup_resp could still be accessing the qedi_cmd for the
abort TMF. We can then race where qedi_process_cmd_cleanup_resp is still
accessing the mtask's qedi_cmd but libiscsi has escalated to session level
cleanup and is cleaning up the mtask while we are still accessing it.

To fix this issue we extend where we hold the tmf_work_lock and back_lock
so the qedi_process_cmd_cleanup_resp access is serialized with the cleanup
done in qedi_tmf_work and any completion handling for the iscsi_task.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 110 ++++++++++++++++++---------------
 drivers/scsi/qedi/qedi_iscsi.h |   1 +
 2 files changed, 61 insertions(+), 50 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index ad4357e4c15d..c5699421ec37 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -729,7 +729,6 @@ static void qedi_process_nopin_local_cmpl(struct qedi_ctx *qedi,
 
 static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 					  struct iscsi_cqe_solicited *cqe,
-					  struct iscsi_task *task,
 					  struct iscsi_conn *conn)
 {
 	struct qedi_work_map *work, *work_tmp;
@@ -742,7 +741,7 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 	u32 iscsi_cid;
 	struct qedi_conn *qedi_conn;
 	struct qedi_cmd *dbg_cmd;
-	struct iscsi_task *mtask;
+	struct iscsi_task *mtask, *task;
 	struct iscsi_tm *tmf_hdr = NULL;
 
 	iscsi_cid = cqe->conn_id;
@@ -768,6 +767,7 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 			}
 			found = 1;
 			mtask = qedi_cmd->task;
+			task = work->ctask;
 			tmf_hdr = (struct iscsi_tm *)mtask->hdr;
 			rtid = work->rtid;
 
@@ -776,52 +776,47 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 			qedi_cmd->list_tmf_work = NULL;
 		}
 	}
-	spin_unlock_bh(&qedi_conn->tmf_work_lock);
-
-	if (found) {
-		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-			  "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n",
-			  proto_itt, tmf_hdr->flags, qedi_conn->iscsi_conn_id);
-
-		if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_ABORT_TASK) {
-			spin_lock_bh(&conn->session->back_lock);
 
-			protoitt = build_itt(get_itt(tmf_hdr->rtt),
-					     conn->session->age);
-			task = iscsi_itt_to_task(conn, protoitt);
-
-			spin_unlock_bh(&conn->session->back_lock);
+	if (!found) {
+		spin_unlock_bh(&qedi_conn->tmf_work_lock);
+		goto check_cleanup_reqs;
+	}
 
-			if (!task) {
-				QEDI_NOTICE(&qedi->dbg_ctx,
-					    "IO task completed, tmf rtt=0x%x, cid=0x%x\n",
-					    get_itt(tmf_hdr->rtt),
-					    qedi_conn->iscsi_conn_id);
-				return;
-			}
+	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
+		  "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n",
+		  proto_itt, tmf_hdr->flags, qedi_conn->iscsi_conn_id);
+
+	spin_lock_bh(&conn->session->back_lock);
+	if (iscsi_task_is_completed(task)) {
+		QEDI_NOTICE(&qedi->dbg_ctx,
+			    "IO task completed, tmf rtt=0x%x, cid=0x%x\n",
+			   get_itt(tmf_hdr->rtt), qedi_conn->iscsi_conn_id);
+		goto unlock;
+	}
 
-			dbg_cmd = task->dd_data;
+	dbg_cmd = task->dd_data;
 
-			QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-				  "Abort tmf rtt=0x%x, i/o itt=0x%x, i/o tid=0x%x, cid=0x%x\n",
-				  get_itt(tmf_hdr->rtt), get_itt(task->itt),
-				  dbg_cmd->task_id, qedi_conn->iscsi_conn_id);
+	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
+		  "Abort tmf rtt=0x%x, i/o itt=0x%x, i/o tid=0x%x, cid=0x%x\n",
+		  get_itt(tmf_hdr->rtt), get_itt(task->itt), dbg_cmd->task_id,
+		  qedi_conn->iscsi_conn_id);
 
-			if (qedi_cmd->state == CLEANUP_WAIT_FAILED)
-				qedi_cmd->state = CLEANUP_RECV;
+	spin_lock(&qedi_conn->list_lock);
+	if (likely(dbg_cmd->io_cmd_in_list)) {
+		dbg_cmd->io_cmd_in_list = false;
+		list_del_init(&dbg_cmd->io_cmd);
+		qedi_conn->active_cmd_count--;
+	}
+	spin_unlock(&qedi_conn->list_lock);
+	qedi_cmd->state = CLEANUP_RECV;
+unlock:
+	spin_unlock_bh(&conn->session->back_lock);
+	spin_unlock_bh(&qedi_conn->tmf_work_lock);
+	wake_up_interruptible(&qedi_conn->wait_queue);
+	return;
 
-			spin_lock(&qedi_conn->list_lock);
-			if (likely(dbg_cmd->io_cmd_in_list)) {
-				dbg_cmd->io_cmd_in_list = false;
-				list_del_init(&dbg_cmd->io_cmd);
-				qedi_conn->active_cmd_count--;
-			}
-			spin_unlock(&qedi_conn->list_lock);
-			qedi_cmd->state = CLEANUP_RECV;
-			wake_up_interruptible(&qedi_conn->wait_queue);
-		}
-	} else if (qedi_conn->cmd_cleanup_req > 0) {
+check_cleanup_reqs:
+	if (qedi_conn->cmd_cleanup_req > 0) {
 		spin_lock_bh(&conn->session->back_lock);
 		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
 		protoitt = build_itt(ptmp_itt, conn->session->age);
@@ -844,6 +839,7 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
+		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
 
 	} else {
 		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
@@ -946,8 +942,7 @@ void qedi_fp_process_cqes(struct qedi_work *work)
 		goto exit_fp_process;
 	case ISCSI_CQE_TYPE_TASK_CLEANUP:
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM, "CleanUp CqE\n");
-		qedi_process_cmd_cleanup_resp(qedi, &cqe->cqe_solicited, task,
-					      conn);
+		qedi_process_cmd_cleanup_resp(qedi, &cqe->cqe_solicited, conn);
 		goto exit_fp_process;
 	default:
 		QEDI_ERR(&qedi->dbg_ctx, "Error cqe.\n");
@@ -1374,12 +1369,22 @@ static void qedi_tmf_work(struct work_struct *work)
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
 	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 
-	ctask = iscsi_itt_to_task(conn, tmf_hdr->rtt);
-	if (!ctask || !ctask->sc) {
+	spin_lock_bh(&conn->session->back_lock);
+	ctask = iscsi_itt_to_ctask(conn, tmf_hdr->rtt);
+	if (!ctask || iscsi_task_is_completed(ctask)) {
+		spin_unlock_bh(&conn->session->back_lock);
 		QEDI_ERR(&qedi->dbg_ctx, "Task already completed\n");
-		goto abort_ret;
+		goto clear_cleanup;
 	}
 
+	/*
+	 * libiscsi gets a ref before sending the abort, but if libiscsi times
+	 * it out then it could release it and the cmd could complete from
+	 * under us.
+	 */
+	__iscsi_get_task(ctask);
+	spin_unlock_bh(&conn->session->back_lock);
+
 	cmd = (struct qedi_cmd *)ctask->dd_data;
 	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
 		  "Abort tmf rtt=0x%x, cmd itt=0x%x, cmd tid=0x%x, cid=0x%x\n",
@@ -1389,19 +1394,20 @@ static void qedi_tmf_work(struct work_struct *work)
 	if (qedi_do_not_recover) {
 		QEDI_ERR(&qedi->dbg_ctx, "DONT SEND CLEANUP/ABORT %d\n",
 			 qedi_do_not_recover);
-		goto abort_ret;
+		goto put_task;
 	}
 
 	list_work = kzalloc(sizeof(*list_work), GFP_ATOMIC);
 	if (!list_work) {
 		QEDI_ERR(&qedi->dbg_ctx, "Memory allocation failed\n");
-		goto abort_ret;
+		goto put_task;
 	}
 
 	qedi_cmd->type = TYPEIO;
 	list_work->qedi_cmd = qedi_cmd;
 	list_work->rtid = cmd->task_id;
 	list_work->state = QEDI_WORK_SCHEDULED;
+	list_work->ctask = ctask;
 	qedi_cmd->list_tmf_work = list_work;
 
 	QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
@@ -1434,7 +1440,9 @@ static void qedi_tmf_work(struct work_struct *work)
 	qedi_cmd->task_id = tid;
 	qedi_send_iscsi_tmf(qedi_conn, qedi_cmd->task);
 
-abort_ret:
+put_task:
+	iscsi_put_task(ctask);
+clear_cleanup:
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 	return;
 
@@ -1455,6 +1463,8 @@ static void qedi_tmf_work(struct work_struct *work)
 	}
 	spin_unlock(&qedi_conn->list_lock);
 
+	iscsi_put_task(ctask);
+
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 }
 
diff --git a/drivers/scsi/qedi/qedi_iscsi.h b/drivers/scsi/qedi/qedi_iscsi.h
index 39dc27c85e3c..68ef519f5480 100644
--- a/drivers/scsi/qedi/qedi_iscsi.h
+++ b/drivers/scsi/qedi/qedi_iscsi.h
@@ -212,6 +212,7 @@ struct qedi_cmd {
 struct qedi_work_map {
 	struct list_head list;
 	struct qedi_cmd *qedi_cmd;
+	struct iscsi_task *ctask;
 	int rtid;
 
 	int state;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 11/17] scsi: qedi: fix TMF tid allocation
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (9 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 10/17] scsi: qedi: fix use after free during abort cleanup Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 12/17] scsi: qedi: use GFP_NOIO for tmf allocation Mike Christie
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

qedi_iscsi_abort_work and qedi_tmf_work allocates a tid then calls
qedi_send_iscsi_tmf which also allcoates a tid. This removes the tid
allocation from the callers.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 76 ++++++++++------------------------
 drivers/scsi/qedi/qedi_gbl.h   |  3 +-
 drivers/scsi/qedi/qedi_iscsi.c |  2 +-
 3 files changed, 25 insertions(+), 56 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index c5699421ec37..542255c94d96 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -14,8 +14,8 @@
 #include "qedi_fw_iscsi.h"
 #include "qedi_fw_scsi.h"
 
-static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
-			       struct iscsi_task *mtask);
+static int send_iscsi_tmf(struct qedi_conn *qedi_conn,
+			  struct iscsi_task *mtask);
 
 void qedi_iscsi_unmap_sg_list(struct qedi_cmd *cmd)
 {
@@ -1350,7 +1350,7 @@ static int qedi_wait_for_cleanup_request(struct qedi_ctx *qedi,
 	return 0;
 }
 
-static void qedi_tmf_work(struct work_struct *work)
+static void qedi_abort_work(struct work_struct *work)
 {
 	struct qedi_cmd *qedi_cmd =
 		container_of(work, struct qedi_cmd, tmf_work);
@@ -1363,7 +1363,6 @@ static void qedi_tmf_work(struct work_struct *work)
 	struct iscsi_task *ctask;
 	struct iscsi_tm *tmf_hdr;
 	s16 rval = 0;
-	s16 tid = 0;
 
 	mtask = qedi_cmd->task;
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
@@ -1404,6 +1403,7 @@ static void qedi_tmf_work(struct work_struct *work)
 	}
 
 	qedi_cmd->type = TYPEIO;
+	qedi_cmd->state = CLEANUP_WAIT;
 	list_work->qedi_cmd = qedi_cmd;
 	list_work->rtid = cmd->task_id;
 	list_work->state = QEDI_WORK_SCHEDULED;
@@ -1430,15 +1430,7 @@ static void qedi_tmf_work(struct work_struct *work)
 		goto ldel_exit;
 	}
 
-	tid = qedi_get_task_idx(qedi);
-	if (tid == -1) {
-		QEDI_ERR(&qedi->dbg_ctx, "Invalid tid, cid=0x%x\n",
-			 qedi_conn->iscsi_conn_id);
-		goto ldel_exit;
-	}
-
-	qedi_cmd->task_id = tid;
-	qedi_send_iscsi_tmf(qedi_conn, qedi_cmd->task);
+	send_iscsi_tmf(qedi_conn, qedi_cmd->task);
 
 put_task:
 	iscsi_put_task(ctask);
@@ -1468,8 +1460,7 @@ static void qedi_tmf_work(struct work_struct *work)
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 }
 
-static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
-			       struct iscsi_task *mtask)
+static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 {
 	struct iscsi_tmf_request_hdr tmf_pdu_header;
 	struct iscsi_task_params task_params;
@@ -1484,7 +1475,6 @@ static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
 	u32 scsi_lun[2];
 	s16 tid = 0;
 	u16 sq_idx = 0;
-	int rval = 0;
 
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
 	qedi_cmd = (struct qedi_cmd *)mtask->dd_data;
@@ -1548,10 +1538,7 @@ static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
 	task_params.sqe = &ep->sq[sq_idx];
 
 	memset(task_params.sqe, 0, sizeof(struct iscsi_wqe));
-	rval = init_initiator_tmf_request_task(&task_params,
-					       &tmf_pdu_header);
-	if (rval)
-		return -1;
+	init_initiator_tmf_request_task(&task_params, &tmf_pdu_header);
 
 	spin_lock(&qedi_conn->list_lock);
 	list_add_tail(&qedi_cmd->io_cmd, &qedi_conn->active_cmd_list);
@@ -1563,47 +1550,30 @@ static int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn,
 	return 0;
 }
 
-int qedi_iscsi_abort_work(struct qedi_conn *qedi_conn,
-			  struct iscsi_task *mtask)
+int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 {
+	struct iscsi_tm *tmf_hdr = (struct iscsi_tm *)mtask->hdr;
+	struct qedi_cmd *qedi_cmd = mtask->dd_data;
 	struct qedi_ctx *qedi = qedi_conn->qedi;
-	struct iscsi_tm *tmf_hdr;
-	struct qedi_cmd *qedi_cmd = (struct qedi_cmd *)mtask->dd_data;
-	s16 tid = 0;
+	int rc = 0;
 
-	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
-	qedi_cmd->task = mtask;
-
-	/* If abort task then schedule the work and return */
-	if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	    ISCSI_TM_FUNC_ABORT_TASK) {
-		qedi_cmd->state = CLEANUP_WAIT;
-		INIT_WORK(&qedi_cmd->tmf_work, qedi_tmf_work);
+	switch (tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) {
+	case ISCSI_TM_FUNC_ABORT_TASK:
+		INIT_WORK(&qedi_cmd->tmf_work, qedi_abort_work);
 		queue_work(qedi->tmf_thread, &qedi_cmd->tmf_work);
-
-	} else if (((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_LOGICAL_UNIT_RESET) ||
-		   ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_TARGET_WARM_RESET) ||
-		   ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-		    ISCSI_TM_FUNC_TARGET_COLD_RESET)) {
-		tid = qedi_get_task_idx(qedi);
-		if (tid == -1) {
-			QEDI_ERR(&qedi->dbg_ctx, "Invalid tid, cid=0x%x\n",
-				 qedi_conn->iscsi_conn_id);
-			return -1;
-		}
-		qedi_cmd->task_id = tid;
-
-		qedi_send_iscsi_tmf(qedi_conn, qedi_cmd->task);
-
-	} else {
+		break;
+	case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
+	case ISCSI_TM_FUNC_TARGET_WARM_RESET:
+	case ISCSI_TM_FUNC_TARGET_COLD_RESET:
+		rc = send_iscsi_tmf(qedi_conn, mtask);
+		break;
+	default:
 		QEDI_ERR(&qedi->dbg_ctx, "Invalid tmf, cid=0x%x\n",
 			 qedi_conn->iscsi_conn_id);
-		return -1;
+		return -EINVAL;
 	}
 
-	return 0;
+	return rc;
 }
 
 int qedi_send_iscsi_text(struct qedi_conn *qedi_conn,
diff --git a/drivers/scsi/qedi/qedi_gbl.h b/drivers/scsi/qedi/qedi_gbl.h
index 116645c08c71..fb44a282613e 100644
--- a/drivers/scsi/qedi/qedi_gbl.h
+++ b/drivers/scsi/qedi/qedi_gbl.h
@@ -31,8 +31,7 @@ int qedi_send_iscsi_login(struct qedi_conn *qedi_conn,
 			  struct iscsi_task *task);
 int qedi_send_iscsi_logout(struct qedi_conn *qedi_conn,
 			   struct iscsi_task *task);
-int qedi_iscsi_abort_work(struct qedi_conn *qedi_conn,
-			  struct iscsi_task *mtask);
+int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask);
 int qedi_send_iscsi_text(struct qedi_conn *qedi_conn,
 			 struct iscsi_task *task);
 int qedi_send_iscsi_nopout(struct qedi_conn *qedi_conn,
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 416202bc4241..0061866614b4 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -742,7 +742,7 @@ static int qedi_iscsi_send_generic_request(struct iscsi_task *task)
 		rc = qedi_send_iscsi_logout(qedi_conn, task);
 		break;
 	case ISCSI_OP_SCSI_TMFUNC:
-		rc = qedi_iscsi_abort_work(qedi_conn, task);
+		rc = qedi_send_iscsi_tmf(qedi_conn, task);
 		break;
 	case ISCSI_OP_TEXT:
 		rc = qedi_send_iscsi_text(qedi_conn, task);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 12/17] scsi: qedi: use GFP_NOIO for tmf allocation
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (10 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 11/17] scsi: qedi: fix TMF tid allocation Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 13/17] scsi: qedi: fix TMF session block/unblock use Mike Christie
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

We run from a workqueue with no locks held so use GFP_NOIO.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 542255c94d96..84402e827d42 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -1396,7 +1396,7 @@ static void qedi_abort_work(struct work_struct *work)
 		goto put_task;
 	}
 
-	list_work = kzalloc(sizeof(*list_work), GFP_ATOMIC);
+	list_work = kzalloc(sizeof(*list_work), GFP_NOIO);
 	if (!list_work) {
 		QEDI_ERR(&qedi->dbg_ctx, "Memory allocation failed\n");
 		goto put_task;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 13/17] scsi: qedi: fix TMF session block/unblock use
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (11 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 12/17] scsi: qedi: use GFP_NOIO for tmf allocation Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 14/17] scsi: qedi: fix cleanup " Mike Christie
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

Drivers shouldn't be calling block/unblock session for tmf handling
because the functions can change the session state from under libiscsi. We
now block the session for the drivers during tmf processing, so we can
remove these calls.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 84402e827d42..f13f8af6d931 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -159,14 +159,9 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 	resp_hdr_ptr =  (struct iscsi_tm_rsp *)qedi_cmd->tmf_resp_buf;
 
-	iscsi_block_session(session->cls_session);
 	rval = qedi_cleanup_all_io(qedi, qedi_conn, qedi_cmd->task, true);
-	if (rval) {
-		iscsi_unblock_session(session->cls_session);
+	if (rval)
 		goto exit_tmf_resp;
-	}
-
-	iscsi_unblock_session(session->cls_session);
 
 	spin_lock(&session->back_lock);
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 14/17] scsi: qedi: fix cleanup session block/unblock use
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (12 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 13/17] scsi: qedi: fix TMF session block/unblock use Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 15/17] scsi: qedi: pass send_iscsi_tmf task to abort Mike Christie
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

Drivers shouldn't be calling block/unblock session for cmd cleanup
because the functions can change the session state from under libiscsi.
This adds a new a driver level bit so it can block all IO the host while
it drains the card.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi.h       |  1 +
 drivers/scsi/qedi/qedi_iscsi.c | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qedi/qedi.h b/drivers/scsi/qedi/qedi.h
index c342defc3f52..ce199a7a16b8 100644
--- a/drivers/scsi/qedi/qedi.h
+++ b/drivers/scsi/qedi/qedi.h
@@ -284,6 +284,7 @@ struct qedi_ctx {
 #define QEDI_IN_RECOVERY	5
 #define QEDI_IN_OFFLINE		6
 #define QEDI_IN_SHUTDOWN	7
+#define QEDI_BLOCK_IO		8
 
 	u8 mac[ETH_ALEN];
 	u32 src_ip[4];
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 0061866614b4..6e4f7c427af1 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -330,12 +330,22 @@ qedi_conn_create(struct iscsi_cls_session *cls_session, uint32_t cid)
 
 void qedi_mark_device_missing(struct iscsi_cls_session *cls_session)
 {
-	iscsi_block_session(cls_session);
+	struct iscsi_session *session = cls_session->dd_data;
+	struct qedi_conn *qedi_conn = session->leadconn->dd_data;
+
+	spin_lock_bh(&session->frwd_lock);
+	set_bit(QEDI_BLOCK_IO, &qedi_conn->qedi->flags);
+	spin_unlock_bh(&session->frwd_lock);
 }
 
 void qedi_mark_device_available(struct iscsi_cls_session *cls_session)
 {
-	iscsi_unblock_session(cls_session);
+	struct iscsi_session *session = cls_session->dd_data;
+	struct qedi_conn *qedi_conn = session->leadconn->dd_data;
+
+	spin_lock_bh(&session->frwd_lock);
+	clear_bit(QEDI_BLOCK_IO, &qedi_conn->qedi->flags);
+	spin_unlock_bh(&session->frwd_lock);
 }
 
 static int qedi_bind_conn_to_iscsi_cid(struct qedi_ctx *qedi,
@@ -789,6 +799,9 @@ static int qedi_task_xmit(struct iscsi_task *task)
 	if (test_bit(QEDI_IN_SHUTDOWN, &qedi_conn->qedi->flags))
 		return -ENODEV;
 
+	if (test_bit(QEDI_BLOCK_IO, &qedi_conn->qedi->flags))
+		return -EACCES;
+
 	cmd->state = 0;
 	cmd->task = NULL;
 	cmd->use_slowpath = false;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 15/17] scsi: qedi: pass send_iscsi_tmf task to abort
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (13 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 14/17] scsi: qedi: fix cleanup " Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 16/17] scsi: qedi: complete TMF works before disconnect Mike Christie
  2021-04-16  2:04 ` [PATCH v3 17/17] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

qedi_abort_work knows what task to abort so just pass it to send_iscsi_tmf.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index f13f8af6d931..475cb7823cf1 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -15,7 +15,7 @@
 #include "qedi_fw_scsi.h"
 
 static int send_iscsi_tmf(struct qedi_conn *qedi_conn,
-			  struct iscsi_task *mtask);
+			  struct iscsi_task *mtask, struct iscsi_task *ctask);
 
 void qedi_iscsi_unmap_sg_list(struct qedi_cmd *cmd)
 {
@@ -1425,7 +1425,7 @@ static void qedi_abort_work(struct work_struct *work)
 		goto ldel_exit;
 	}
 
-	send_iscsi_tmf(qedi_conn, qedi_cmd->task);
+	send_iscsi_tmf(qedi_conn, qedi_cmd->task, ctask);
 
 put_task:
 	iscsi_put_task(ctask);
@@ -1455,14 +1455,13 @@ static void qedi_abort_work(struct work_struct *work)
 	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 }
 
-static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
+static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask,
+			  struct iscsi_task *ctask)
 {
 	struct iscsi_tmf_request_hdr tmf_pdu_header;
 	struct iscsi_task_params task_params;
 	struct qedi_ctx *qedi = qedi_conn->qedi;
 	struct e4_iscsi_task_context *fw_task_ctx;
-	struct iscsi_conn *conn = qedi_conn->cls_conn->dd_data;
-	struct iscsi_task *ctask;
 	struct iscsi_tm *tmf_hdr;
 	struct qedi_cmd *qedi_cmd;
 	struct qedi_cmd *cmd;
@@ -1502,12 +1501,6 @@ static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 
 	if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
 	     ISCSI_TM_FUNC_ABORT_TASK) {
-		ctask = iscsi_itt_to_task(conn, tmf_hdr->rtt);
-		if (!ctask || !ctask->sc) {
-			QEDI_ERR(&qedi->dbg_ctx,
-				 "Could not get reference task\n");
-			return 0;
-		}
 		cmd = (struct qedi_cmd *)ctask->dd_data;
 		tmf_pdu_header.rtt =
 				qedi_set_itt(cmd->task_id,
@@ -1560,7 +1553,7 @@ int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 	case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
 	case ISCSI_TM_FUNC_TARGET_WARM_RESET:
 	case ISCSI_TM_FUNC_TARGET_COLD_RESET:
-		rc = send_iscsi_tmf(qedi_conn, mtask);
+		rc = send_iscsi_tmf(qedi_conn, mtask, NULL);
 		break;
 	default:
 		QEDI_ERR(&qedi->dbg_ctx, "Invalid tmf, cid=0x%x\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 16/17] scsi: qedi: complete TMF works before disconnect
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (14 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 15/17] scsi: qedi: pass send_iscsi_tmf task to abort Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  2021-04-16  2:04 ` [PATCH v3 17/17] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

We need to make sure that abort and reset completion work has completed
before ep_disconnect returns. After ep_disconnect we can't manipulate
cmds because libiscsi will call conn_stop and take onwership.

We are trying to make sure abort work and reset completion work has
completed before we do the cmd clean up in ep_disconnect. The problem is
that:

1. the work function sets the QEDI_CONN_FW_CLEANUP bit, so if the work was
still pending we would not see the bit set. We need to do this before the
work is queued.

2. If we had multiple works queued then we could break from the loop in
qedi_ep_disconnect early because when abort work 1 completes it could
clear QEDI_CONN_FW_CLEANUP. qedi_ep_disconnect could then see that before
work 2 has run.

3. A TMF reset completion work could run after ep_disconnect starts
cleaning up cmds via qedi_clearsq. ep_disconnect's call to qedi_clearsq
-> qedi_cleanup_all_io would might think it's done cleaning up cmds, but
the reset completion work could still be running. We then return from
ep_disconnect while still doing cleanup.

This replaces the bit with a counter to track the number of queued TMF
works, and adds a bool to prevent new works from starting from the
completion path once a ep_disconnect starts.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c    | 46 +++++++++++++++++++++-------------
 drivers/scsi/qedi/qedi_iscsi.c | 23 +++++++++++------
 drivers/scsi/qedi/qedi_iscsi.h |  4 +--
 3 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 475cb7823cf1..13dd06915d74 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -156,7 +156,6 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 	struct iscsi_tm_rsp *resp_hdr_ptr;
 	int rval = 0;
 
-	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 	resp_hdr_ptr =  (struct iscsi_tm_rsp *)qedi_cmd->tmf_resp_buf;
 
 	rval = qedi_cleanup_all_io(qedi, qedi_conn, qedi_cmd->task, true);
@@ -169,7 +168,10 @@ static void qedi_tmf_resp_work(struct work_struct *work)
 
 exit_tmf_resp:
 	kfree(resp_hdr_ptr);
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
+
+	spin_lock(&qedi_conn->tmf_work_lock);
+	qedi_conn->fw_cleanup_works--;
+	spin_unlock(&qedi_conn->tmf_work_lock);
 }
 
 static void qedi_process_tmf_resp(struct qedi_ctx *qedi,
@@ -225,16 +227,25 @@ static void qedi_process_tmf_resp(struct qedi_ctx *qedi,
 	}
 	spin_unlock(&qedi_conn->list_lock);
 
-	if (((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	      ISCSI_TM_FUNC_LOGICAL_UNIT_RESET) ||
-	    ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	      ISCSI_TM_FUNC_TARGET_WARM_RESET) ||
-	    ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
-	      ISCSI_TM_FUNC_TARGET_COLD_RESET)) {
+	spin_lock(&qedi_conn->tmf_work_lock);
+	switch (tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) {
+	case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
+	case ISCSI_TM_FUNC_TARGET_WARM_RESET:
+	case ISCSI_TM_FUNC_TARGET_COLD_RESET:
+		if (qedi_conn->ep_disconnect_starting) {
+			/* Session is down so ep_disconnect will clean up */
+			spin_unlock(&qedi_conn->tmf_work_lock);
+			goto unblock_sess;
+		}
+
+		qedi_conn->fw_cleanup_works++;
+		spin_unlock(&qedi_conn->tmf_work_lock);
+
 		INIT_WORK(&qedi_cmd->tmf_work, qedi_tmf_resp_work);
 		queue_work(qedi->tmf_thread, &qedi_cmd->tmf_work);
 		goto unblock_sess;
 	}
+	spin_unlock(&qedi_conn->tmf_work_lock);
 
 	__iscsi_complete_pdu(conn, (struct iscsi_hdr *)resp_hdr_ptr, NULL, 0);
 	kfree(resp_hdr_ptr);
@@ -1361,7 +1372,6 @@ static void qedi_abort_work(struct work_struct *work)
 
 	mtask = qedi_cmd->task;
 	tmf_hdr = (struct iscsi_tm *)mtask->hdr;
-	set_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
 
 	spin_lock_bh(&conn->session->back_lock);
 	ctask = iscsi_itt_to_ctask(conn, tmf_hdr->rtt);
@@ -1427,11 +1437,7 @@ static void qedi_abort_work(struct work_struct *work)
 
 	send_iscsi_tmf(qedi_conn, qedi_cmd->task, ctask);
 
-put_task:
-	iscsi_put_task(ctask);
-clear_cleanup:
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
-	return;
+	goto put_task;
 
 ldel_exit:
 	spin_lock_bh(&qedi_conn->tmf_work_lock);
@@ -1449,10 +1455,12 @@ static void qedi_abort_work(struct work_struct *work)
 		qedi_conn->active_cmd_count--;
 	}
 	spin_unlock(&qedi_conn->list_lock);
-
+put_task:
 	iscsi_put_task(ctask);
-
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
+clear_cleanup:
+	spin_lock(&qedi_conn->tmf_work_lock);
+	qedi_conn->fw_cleanup_works--;
+	spin_unlock(&qedi_conn->tmf_work_lock);
 }
 
 static int send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask,
@@ -1547,6 +1555,10 @@ int qedi_send_iscsi_tmf(struct qedi_conn *qedi_conn, struct iscsi_task *mtask)
 
 	switch (tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) {
 	case ISCSI_TM_FUNC_ABORT_TASK:
+		spin_lock(&qedi_conn->tmf_work_lock);
+		qedi_conn->fw_cleanup_works++;
+		spin_unlock(&qedi_conn->tmf_work_lock);
+
 		INIT_WORK(&qedi_cmd->tmf_work, qedi_abort_work);
 		queue_work(qedi->tmf_thread, &qedi_cmd->tmf_work);
 		break;
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
index 6e4f7c427af1..25ff2bda077b 100644
--- a/drivers/scsi/qedi/qedi_iscsi.c
+++ b/drivers/scsi/qedi/qedi_iscsi.c
@@ -592,7 +592,11 @@ static int qedi_conn_start(struct iscsi_cls_conn *cls_conn)
 		goto start_err;
 	}
 
-	clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags);
+	spin_lock(&qedi_conn->tmf_work_lock);
+	qedi_conn->fw_cleanup_works = 0;
+	qedi_conn->ep_disconnect_starting = false;
+	spin_unlock(&qedi_conn->tmf_work_lock);
+
 	qedi_conn->abrt_conn = 0;
 
 	rval = iscsi_conn_start(cls_conn);
@@ -1008,7 +1012,6 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 	int ret = 0;
 	int wait_delay;
 	int abrt_conn = 0;
-	int count = 10;
 
 	wait_delay = 60 * HZ + DEF_MAX_RT_TIME;
 	qedi_ep = ep->dd_data;
@@ -1024,13 +1027,19 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
 		qedi_conn = qedi_ep->conn;
 		abrt_conn = qedi_conn->abrt_conn;
 
-		while (count--)	{
-			if (!test_bit(QEDI_CONN_FW_CLEANUP,
-				      &qedi_conn->flags)) {
-				break;
-			}
+		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
+			  "cid=0x%x qedi_ep=%p waiting for %d tmfs\n",
+			  qedi_ep->iscsi_cid, qedi_ep,
+			  qedi_conn->fw_cleanup_works);
+
+		spin_lock(&qedi_conn->tmf_work_lock);
+		qedi_conn->ep_disconnect_starting = true;
+		while (qedi_conn->fw_cleanup_works > 0) {
+			spin_unlock(&qedi_conn->tmf_work_lock);
 			msleep(1000);
+			spin_lock(&qedi_conn->tmf_work_lock);
 		}
+		spin_unlock(&qedi_conn->tmf_work_lock);
 
 		if (test_bit(QEDI_IN_RECOVERY, &qedi->flags)) {
 			if (qedi_do_not_recover) {
diff --git a/drivers/scsi/qedi/qedi_iscsi.h b/drivers/scsi/qedi/qedi_iscsi.h
index 68ef519f5480..758735209e15 100644
--- a/drivers/scsi/qedi/qedi_iscsi.h
+++ b/drivers/scsi/qedi/qedi_iscsi.h
@@ -169,8 +169,8 @@ struct qedi_conn {
 	struct list_head tmf_work_list;
 	wait_queue_head_t wait_queue;
 	spinlock_t tmf_work_lock;	/* tmf work lock */
-	unsigned long flags;
-#define QEDI_CONN_FW_CLEANUP	1
+	bool ep_disconnect_starting;
+	int fw_cleanup_works;
 };
 
 struct qedi_cmd {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 17/17] scsi: qedi: always wake up if cmd_cleanup_req is set
  2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
                   ` (15 preceding siblings ...)
  2021-04-16  2:04 ` [PATCH v3 16/17] scsi: qedi: complete TMF works before disconnect Mike Christie
@ 2021-04-16  2:04 ` Mike Christie
  16 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-16  2:04 UTC (permalink / raw)
  To: lduncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb
  Cc: Mike Christie

If we got a response then we should always wake up the conn. For both
the cmd_cleanup_req == 0 or cmd_cleanup_req > 0, we shouldn't dig into
iscsi_itt_to_task because we don't know what the upper layers are doing.

We can also remove the qedi_clear_task_idx call here because once we
signal success libiscsi will loop over the affected commands and end
up calling the cleanup_task callout which will release it.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
---
 drivers/scsi/qedi/qedi_fw.c | 31 ++++---------------------------
 1 file changed, 4 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c
index 13dd06915d74..13d1250951a6 100644
--- a/drivers/scsi/qedi/qedi_fw.c
+++ b/drivers/scsi/qedi/qedi_fw.c
@@ -739,7 +739,6 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 {
 	struct qedi_work_map *work, *work_tmp;
 	u32 proto_itt = cqe->itid;
-	u32 ptmp_itt = 0;
 	itt_t protoitt = 0;
 	int found = 0;
 	struct qedi_cmd *qedi_cmd = NULL;
@@ -823,37 +822,15 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
 
 check_cleanup_reqs:
 	if (qedi_conn->cmd_cleanup_req > 0) {
-		spin_lock_bh(&conn->session->back_lock);
-		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
-		protoitt = build_itt(ptmp_itt, conn->session->age);
-		task = iscsi_itt_to_task(conn, protoitt);
-		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
-			  "cleanup io itid=0x%x, protoitt=0x%x, cmd_cleanup_cmpl=%d, cid=0x%x\n",
-			  cqe->itid, protoitt, qedi_conn->cmd_cleanup_cmpl,
-			  qedi_conn->iscsi_conn_id);
-
-		spin_unlock_bh(&conn->session->back_lock);
-		if (!task) {
-			QEDI_NOTICE(&qedi->dbg_ctx,
-				    "task is null, itid=0x%x, cid=0x%x\n",
-				    cqe->itid, qedi_conn->iscsi_conn_id);
-			return;
-		}
-		qedi_conn->cmd_cleanup_cmpl++;
-		wake_up(&qedi_conn->wait_queue);
-
 		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
 			  "Freeing tid=0x%x for cid=0x%x\n",
 			  cqe->itid, qedi_conn->iscsi_conn_id);
-		qedi_clear_task_idx(qedi_conn->qedi, cqe->itid);
-
+		qedi_conn->cmd_cleanup_cmpl++;
+		wake_up(&qedi_conn->wait_queue);
 	} else {
-		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
-		protoitt = build_itt(ptmp_itt, conn->session->age);
-		task = iscsi_itt_to_task(conn, protoitt);
 		QEDI_ERR(&qedi->dbg_ctx,
-			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x, task=%p\n",
-			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id, task);
+			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x\n",
+			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id);
 	}
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts
  2021-04-16  2:04 ` [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts Mike Christie
@ 2021-04-16 11:39     ` kernel test robot
  0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2021-04-16 11:39 UTC (permalink / raw)
  To: Mike Christie, lduncan, martin.petersen, mrangankar, svernekar,
	linux-scsi, jejb
  Cc: kbuild-all, Mike Christie

[-- Attachment #1: Type: text/plain, Size: 12085 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on scsi/for-next]
[also build test WARNING on next-20210415]
[cannot apply to mkp-scsi/for-next rdma/for-next v5.12-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: alpha-randconfig-r016-20210416 (attached as .config)
compiler: alpha-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/9d4a83c1316e3dad2bd5687563584509a3d6557c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636
        git checkout 9d4a83c1316e3dad2bd5687563584509a3d6557c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=alpha 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/scsi/qedi/qedi_fw.c: In function 'qedi_process_cmd_cleanup_resp':
>> drivers/scsi/qedi/qedi_fw.c:741:6: warning: variable 'rtid' set but not used [-Wunused-but-set-variable]
     741 |  u32 rtid = 0;
         |      ^~~~


vim +/rtid +741 drivers/scsi/qedi/qedi_fw.c

ace7f46ba5fde7 Manish Rangankar 2016-12-01  729  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  730  static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  731  					  struct iscsi_cqe_solicited *cqe,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  732  					  struct iscsi_task *task,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  733  					  struct iscsi_conn *conn)
ace7f46ba5fde7 Manish Rangankar 2016-12-01  734  {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  735  	struct qedi_work_map *work, *work_tmp;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  736  	u32 proto_itt = cqe->itid;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  737  	u32 ptmp_itt = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  738  	itt_t protoitt = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  739  	int found = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  740  	struct qedi_cmd *qedi_cmd = NULL;
ace7f46ba5fde7 Manish Rangankar 2016-12-01 @741  	u32 rtid = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  742  	u32 iscsi_cid;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  743  	struct qedi_conn *qedi_conn;
8712f467d4a560 Christos Gkekas  2017-10-14  744  	struct qedi_cmd *dbg_cmd;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  745  	struct iscsi_task *mtask;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  746  	struct iscsi_tm *tmf_hdr = NULL;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  747  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  748  	iscsi_cid = cqe->conn_id;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  749  	qedi_conn = qedi->cid_que.conn_cid_tbl[iscsi_cid];
967823d6c3980a Manish Rangankar 2018-02-26  750  	if (!qedi_conn) {
967823d6c3980a Manish Rangankar 2018-02-26  751  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
967823d6c3980a Manish Rangankar 2018-02-26  752  			  "icid not found 0x%x\n", cqe->conn_id);
967823d6c3980a Manish Rangankar 2018-02-26  753  		return;
967823d6c3980a Manish Rangankar 2018-02-26  754  	}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  755  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  756  	/* Based on this itt get the corresponding qedi_cmd */
ace7f46ba5fde7 Manish Rangankar 2016-12-01  757  	spin_lock_bh(&qedi_conn->tmf_work_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  758  	list_for_each_entry_safe(work, work_tmp, &qedi_conn->tmf_work_list,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  759  				 list) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  760  		if (work->rtid == proto_itt) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  761  			/* We found the command */
ace7f46ba5fde7 Manish Rangankar 2016-12-01  762  			qedi_cmd = work->qedi_cmd;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  763  			if (!qedi_cmd->list_tmf_work) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  764  				QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  765  					  "TMF work not found, cqe->tid=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  766  					  proto_itt, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  767  				WARN_ON(1);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  768  			}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  769  			found = 1;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  770  			mtask = qedi_cmd->task;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  771  			tmf_hdr = (struct iscsi_tm *)mtask->hdr;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  772  			rtid = work->rtid;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  773  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  774  			list_del_init(&work->list);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  775  			kfree(work);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  776  			qedi_cmd->list_tmf_work = NULL;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  777  		}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  778  	}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  779  	spin_unlock_bh(&qedi_conn->tmf_work_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  780  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  781  	if (found) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  782  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  783  			  "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  784  			  proto_itt, tmf_hdr->flags, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  785  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  786  		if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
ace7f46ba5fde7 Manish Rangankar 2016-12-01  787  		    ISCSI_TM_FUNC_ABORT_TASK) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  788  			spin_lock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  789  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  790  			protoitt = build_itt(get_itt(tmf_hdr->rtt),
ace7f46ba5fde7 Manish Rangankar 2016-12-01  791  					     conn->session->age);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  792  			task = iscsi_itt_to_task(conn, protoitt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  793  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  794  			spin_unlock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  795  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  796  			if (!task) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  797  				QEDI_NOTICE(&qedi->dbg_ctx,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  798  					    "IO task completed, tmf rtt=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  799  					    get_itt(tmf_hdr->rtt),
ace7f46ba5fde7 Manish Rangankar 2016-12-01  800  					    qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  801  				return;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  802  			}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  803  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  804  			dbg_cmd = task->dd_data;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  805  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  806  			QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  807  				  "Abort tmf rtt=0x%x, i/o itt=0x%x, i/o tid=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  808  				  get_itt(tmf_hdr->rtt), get_itt(task->itt),
ace7f46ba5fde7 Manish Rangankar 2016-12-01  809  				  dbg_cmd->task_id, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  810  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  811  			if (qedi_cmd->state == CLEANUP_WAIT_FAILED)
ace7f46ba5fde7 Manish Rangankar 2016-12-01  812  				qedi_cmd->state = CLEANUP_RECV;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  813  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  814  			spin_lock(&qedi_conn->list_lock);
28b35d17f9f857 Nilesh Javali    2020-09-08  815  			if (likely(dbg_cmd->io_cmd_in_list)) {
28b35d17f9f857 Nilesh Javali    2020-09-08  816  				dbg_cmd->io_cmd_in_list = false;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  817  				list_del_init(&dbg_cmd->io_cmd);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  818  				qedi_conn->active_cmd_count--;
28b35d17f9f857 Nilesh Javali    2020-09-08  819  			}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  820  			spin_unlock(&qedi_conn->list_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  821  			qedi_cmd->state = CLEANUP_RECV;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  822  			wake_up_interruptible(&qedi_conn->wait_queue);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  823  		}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  824  	} else if (qedi_conn->cmd_cleanup_req > 0) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  825  		spin_lock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  826  		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  827  		protoitt = build_itt(ptmp_itt, conn->session->age);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  828  		task = iscsi_itt_to_task(conn, protoitt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  829  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  830  			  "cleanup io itid=0x%x, protoitt=0x%x, cmd_cleanup_cmpl=%d, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  831  			  cqe->itid, protoitt, qedi_conn->cmd_cleanup_cmpl,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  832  			  qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  833  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  834  		spin_unlock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  835  		if (!task) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  836  			QEDI_NOTICE(&qedi->dbg_ctx,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  837  				    "task is null, itid=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  838  				    cqe->itid, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  839  			return;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  840  		}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  841  		qedi_conn->cmd_cleanup_cmpl++;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  842  		wake_up(&qedi_conn->wait_queue);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  843  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  844  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  845  			  "Freeing tid=0x%x for cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  846  			  cqe->itid, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  847  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  848  	} else {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  849  		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  850  		protoitt = build_itt(ptmp_itt, conn->session->age);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  851  		task = iscsi_itt_to_task(conn, protoitt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  852  		QEDI_ERR(&qedi->dbg_ctx,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  853  			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x, task=%p\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  854  			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id, task);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  855  	}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  856  }
ace7f46ba5fde7 Manish Rangankar 2016-12-01  857  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29656 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts
@ 2021-04-16 11:39     ` kernel test robot
  0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2021-04-16 11:39 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 12258 bytes --]

Hi Mike,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on scsi/for-next]
[also build test WARNING on next-20210415]
[cannot apply to mkp-scsi/for-next rdma/for-next v5.12-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: alpha-randconfig-r016-20210416 (attached as .config)
compiler: alpha-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/9d4a83c1316e3dad2bd5687563584509a3d6557c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636
        git checkout 9d4a83c1316e3dad2bd5687563584509a3d6557c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=alpha 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/scsi/qedi/qedi_fw.c: In function 'qedi_process_cmd_cleanup_resp':
>> drivers/scsi/qedi/qedi_fw.c:741:6: warning: variable 'rtid' set but not used [-Wunused-but-set-variable]
     741 |  u32 rtid = 0;
         |      ^~~~


vim +/rtid +741 drivers/scsi/qedi/qedi_fw.c

ace7f46ba5fde7 Manish Rangankar 2016-12-01  729  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  730  static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  731  					  struct iscsi_cqe_solicited *cqe,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  732  					  struct iscsi_task *task,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  733  					  struct iscsi_conn *conn)
ace7f46ba5fde7 Manish Rangankar 2016-12-01  734  {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  735  	struct qedi_work_map *work, *work_tmp;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  736  	u32 proto_itt = cqe->itid;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  737  	u32 ptmp_itt = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  738  	itt_t protoitt = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  739  	int found = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  740  	struct qedi_cmd *qedi_cmd = NULL;
ace7f46ba5fde7 Manish Rangankar 2016-12-01 @741  	u32 rtid = 0;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  742  	u32 iscsi_cid;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  743  	struct qedi_conn *qedi_conn;
8712f467d4a560 Christos Gkekas  2017-10-14  744  	struct qedi_cmd *dbg_cmd;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  745  	struct iscsi_task *mtask;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  746  	struct iscsi_tm *tmf_hdr = NULL;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  747  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  748  	iscsi_cid = cqe->conn_id;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  749  	qedi_conn = qedi->cid_que.conn_cid_tbl[iscsi_cid];
967823d6c3980a Manish Rangankar 2018-02-26  750  	if (!qedi_conn) {
967823d6c3980a Manish Rangankar 2018-02-26  751  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
967823d6c3980a Manish Rangankar 2018-02-26  752  			  "icid not found 0x%x\n", cqe->conn_id);
967823d6c3980a Manish Rangankar 2018-02-26  753  		return;
967823d6c3980a Manish Rangankar 2018-02-26  754  	}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  755  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  756  	/* Based on this itt get the corresponding qedi_cmd */
ace7f46ba5fde7 Manish Rangankar 2016-12-01  757  	spin_lock_bh(&qedi_conn->tmf_work_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  758  	list_for_each_entry_safe(work, work_tmp, &qedi_conn->tmf_work_list,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  759  				 list) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  760  		if (work->rtid == proto_itt) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  761  			/* We found the command */
ace7f46ba5fde7 Manish Rangankar 2016-12-01  762  			qedi_cmd = work->qedi_cmd;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  763  			if (!qedi_cmd->list_tmf_work) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  764  				QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  765  					  "TMF work not found, cqe->tid=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  766  					  proto_itt, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  767  				WARN_ON(1);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  768  			}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  769  			found = 1;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  770  			mtask = qedi_cmd->task;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  771  			tmf_hdr = (struct iscsi_tm *)mtask->hdr;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  772  			rtid = work->rtid;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  773  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  774  			list_del_init(&work->list);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  775  			kfree(work);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  776  			qedi_cmd->list_tmf_work = NULL;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  777  		}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  778  	}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  779  	spin_unlock_bh(&qedi_conn->tmf_work_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  780  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  781  	if (found) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  782  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  783  			  "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  784  			  proto_itt, tmf_hdr->flags, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  785  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  786  		if ((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) ==
ace7f46ba5fde7 Manish Rangankar 2016-12-01  787  		    ISCSI_TM_FUNC_ABORT_TASK) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  788  			spin_lock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  789  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  790  			protoitt = build_itt(get_itt(tmf_hdr->rtt),
ace7f46ba5fde7 Manish Rangankar 2016-12-01  791  					     conn->session->age);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  792  			task = iscsi_itt_to_task(conn, protoitt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  793  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  794  			spin_unlock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  795  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  796  			if (!task) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  797  				QEDI_NOTICE(&qedi->dbg_ctx,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  798  					    "IO task completed, tmf rtt=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  799  					    get_itt(tmf_hdr->rtt),
ace7f46ba5fde7 Manish Rangankar 2016-12-01  800  					    qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  801  				return;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  802  			}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  803  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  804  			dbg_cmd = task->dd_data;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  805  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  806  			QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  807  				  "Abort tmf rtt=0x%x, i/o itt=0x%x, i/o tid=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  808  				  get_itt(tmf_hdr->rtt), get_itt(task->itt),
ace7f46ba5fde7 Manish Rangankar 2016-12-01  809  				  dbg_cmd->task_id, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  810  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  811  			if (qedi_cmd->state == CLEANUP_WAIT_FAILED)
ace7f46ba5fde7 Manish Rangankar 2016-12-01  812  				qedi_cmd->state = CLEANUP_RECV;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  813  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  814  			spin_lock(&qedi_conn->list_lock);
28b35d17f9f857 Nilesh Javali    2020-09-08  815  			if (likely(dbg_cmd->io_cmd_in_list)) {
28b35d17f9f857 Nilesh Javali    2020-09-08  816  				dbg_cmd->io_cmd_in_list = false;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  817  				list_del_init(&dbg_cmd->io_cmd);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  818  				qedi_conn->active_cmd_count--;
28b35d17f9f857 Nilesh Javali    2020-09-08  819  			}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  820  			spin_unlock(&qedi_conn->list_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  821  			qedi_cmd->state = CLEANUP_RECV;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  822  			wake_up_interruptible(&qedi_conn->wait_queue);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  823  		}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  824  	} else if (qedi_conn->cmd_cleanup_req > 0) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  825  		spin_lock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  826  		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  827  		protoitt = build_itt(ptmp_itt, conn->session->age);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  828  		task = iscsi_itt_to_task(conn, protoitt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  829  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  830  			  "cleanup io itid=0x%x, protoitt=0x%x, cmd_cleanup_cmpl=%d, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  831  			  cqe->itid, protoitt, qedi_conn->cmd_cleanup_cmpl,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  832  			  qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  833  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  834  		spin_unlock_bh(&conn->session->back_lock);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  835  		if (!task) {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  836  			QEDI_NOTICE(&qedi->dbg_ctx,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  837  				    "task is null, itid=0x%x, cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  838  				    cqe->itid, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  839  			return;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  840  		}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  841  		qedi_conn->cmd_cleanup_cmpl++;
ace7f46ba5fde7 Manish Rangankar 2016-12-01  842  		wake_up(&qedi_conn->wait_queue);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  843  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  844  		QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_TID,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  845  			  "Freeing tid=0x%x for cid=0x%x\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  846  			  cqe->itid, qedi_conn->iscsi_conn_id);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  847  
ace7f46ba5fde7 Manish Rangankar 2016-12-01  848  	} else {
ace7f46ba5fde7 Manish Rangankar 2016-12-01  849  		qedi_get_proto_itt(qedi, cqe->itid, &ptmp_itt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  850  		protoitt = build_itt(ptmp_itt, conn->session->age);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  851  		task = iscsi_itt_to_task(conn, protoitt);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  852  		QEDI_ERR(&qedi->dbg_ctx,
ace7f46ba5fde7 Manish Rangankar 2016-12-01  853  			 "Delayed or untracked cleanup response, itt=0x%x, tid=0x%x, cid=0x%x, task=%p\n",
ace7f46ba5fde7 Manish Rangankar 2016-12-01  854  			 protoitt, cqe->itid, qedi_conn->iscsi_conn_id, task);
ace7f46ba5fde7 Manish Rangankar 2016-12-01  855  	}
ace7f46ba5fde7 Manish Rangankar 2016-12-01  856  }
ace7f46ba5fde7 Manish Rangankar 2016-12-01  857  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 29656 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts
  2021-04-16 11:39     ` kernel test robot
@ 2021-04-16 15:23       ` michael.christie
  -1 siblings, 0 replies; 31+ messages in thread
From: michael.christie @ 2021-04-16 15:23 UTC (permalink / raw)
  To: kernel test robot, lduncan, martin.petersen, mrangankar,
	svernekar, linux-scsi, jejb
  Cc: kbuild-all

On 4/16/21 6:39 AM, kernel test robot wrote:
> Hi Mike,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on scsi/for-next]
> [also build test WARNING on next-20210415]
> [cannot apply to mkp-scsi/for-next rdma/for-next v5.12-rc7]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://urldefense.com/v3/__https://git-scm.com/docs/git-format-patch__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2IFRM6gm$ ]
> 
> url:    https://urldefense.com/v3/__https://github.com/0day-ci/linux/commits/Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2ORa91HY$ 
> base:   https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2KfCj7gr$  for-next
> config: alpha-randconfig-r016-20210416 (attached as .config)
> compiler: alpha-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://urldefense.com/v3/__https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2CW-OqAH$  -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://urldefense.com/v3/__https://github.com/0day-ci/linux/commit/9d4a83c1316e3dad2bd5687563584509a3d6557c__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2PS1L82b$ 
>         git remote add linux-review https://urldefense.com/v3/__https://github.com/0day-ci/linux__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2PxZzIp1$ 
>         git fetch --no-tags linux-review Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636
>         git checkout 9d4a83c1316e3dad2bd5687563584509a3d6557c
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=alpha 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
>    drivers/scsi/qedi/qedi_fw.c: In function 'qedi_process_cmd_cleanup_resp':
>>> drivers/scsi/qedi/qedi_fw.c:741:6: warning: variable 'rtid' set but not used [-Wunused-but-set-variable]
>      741 |  u32 rtid = 0;
>          |      ^~~~

Darn. I used W=12 and just missed it between 2 other old warnings I normally
ignore.

Will fix and resend when Lee reviews the core iscsi patches.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts
@ 2021-04-16 15:23       ` michael.christie
  0 siblings, 0 replies; 31+ messages in thread
From: michael.christie @ 2021-04-16 15:23 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2697 bytes --]

On 4/16/21 6:39 AM, kernel test robot wrote:
> Hi Mike,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on scsi/for-next]
> [also build test WARNING on next-20210415]
> [cannot apply to mkp-scsi/for-next rdma/for-next v5.12-rc7]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://urldefense.com/v3/__https://git-scm.com/docs/git-format-patch__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2IFRM6gm$ ]
> 
> url:    https://urldefense.com/v3/__https://github.com/0day-ci/linux/commits/Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2ORa91HY$ 
> base:   https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2KfCj7gr$  for-next
> config: alpha-randconfig-r016-20210416 (attached as .config)
> compiler: alpha-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://urldefense.com/v3/__https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2CW-OqAH$  -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://urldefense.com/v3/__https://github.com/0day-ci/linux/commit/9d4a83c1316e3dad2bd5687563584509a3d6557c__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2PS1L82b$ 
>         git remote add linux-review https://urldefense.com/v3/__https://github.com/0day-ci/linux__;!!GqivPVa7Brio!K_yd3PgNpxVzRkCVNpUtU5QvHwf3ZeBn4YApf8XPaoGgBtPOUbOk2psKJy1r2PxZzIp1$ 
>         git fetch --no-tags linux-review Mike-Christie/libicsi-and-qedi-TMF-fixes/20210416-100636
>         git checkout 9d4a83c1316e3dad2bd5687563584509a3d6557c
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=alpha 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
>    drivers/scsi/qedi/qedi_fw.c: In function 'qedi_process_cmd_cleanup_resp':
>>> drivers/scsi/qedi/qedi_fw.c:741:6: warning: variable 'rtid' set but not used [-Wunused-but-set-variable]
>      741 |  u32 rtid = 0;
>          |      ^~~~

Darn. I used W=12 and just missed it between 2 other old warnings I normally
ignore.

Will fix and resend when Lee reviews the core iscsi patches.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 01/17] scsi: iscsi: add task completion helper
  2021-04-16  2:04 ` [PATCH v3 01/17] scsi: iscsi: add task completion helper Mike Christie
@ 2021-04-16 21:56   ` Lee Duncan
  0 siblings, 0 replies; 31+ messages in thread
From: Lee Duncan @ 2021-04-16 21:56 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> This adds a helper to detect if a cmd has completed but not yet freed.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  include/scsi/libiscsi.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
> index 02f966e9358f..8c6d358a8abc 100644
> --- a/include/scsi/libiscsi.h
> +++ b/include/scsi/libiscsi.h
> @@ -145,6 +145,13 @@ static inline void* iscsi_next_hdr(struct iscsi_task *task)
>  	return (void*)task->hdr + task->hdr_len;
>  }
>  
> +static inline bool iscsi_task_is_completed(struct iscsi_task *task)
> +{
> +	return task->state == ISCSI_TASK_COMPLETED ||
> +	       task->state == ISCSI_TASK_ABRT_TMF ||
> +	       task->state == ISCSI_TASK_ABRT_SESS_RECOV;
> +}
> +
>  /* Connection's states */
>  enum {
>  	ISCSI_CONN_INITIAL_STAGE,
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup
  2021-04-16  2:04 ` [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup Mike Christie
@ 2021-04-17 17:22   ` Lee Duncan
  2021-04-17 17:26     ` Mike Christie
  0 siblings, 1 reply; 31+ messages in thread
From: Lee Duncan @ 2021-04-17 17:22 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> If we are handling a sg io reset there is a small window where cmds can
> sneak into iscsi_queuecommand even though libiscsi has sent a TMF to the
> driver. This does seems to not be a problem for normal drivers because they
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"doesn't seem to be a problem"?

> know exactly what was sent to the FW. For libiscsi this is a problem
> because it knows what it has sent to the driver but not what the driver
> sent to the FW. When we go to cleanup the running tasks, libiscsi might
> cleanup the sneaky cmd but the driver/FQ may not have seen the sneaky cmd

FO?

> and it's left running in FW.
> 
> This has libiscsi just stop queueing cmds when it sends the TMF down to the
> driver. Both then know they see the same set of cmds.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 104 +++++++++++++++++++++++++++-------------
>  1 file changed, 72 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 4834219497ee..aa5ceaffc697 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1669,29 +1669,11 @@ enum {
>  	FAILURE_SESSION_NOT_READY,
>  };
>  
> -int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
> +static bool iscsi_eh_running(struct iscsi_conn *conn, struct scsi_cmnd *sc,
> +			     int *reason)
>  {
> -	struct iscsi_cls_session *cls_session;
> -	struct iscsi_host *ihost;
> -	int reason = 0;
> -	struct iscsi_session *session;
> -	struct iscsi_conn *conn;
> -	struct iscsi_task *task = NULL;
> -
> -	sc->result = 0;
> -	sc->SCp.ptr = NULL;
> -
> -	ihost = shost_priv(host);
> -
> -	cls_session = starget_to_session(scsi_target(sc->device));
> -	session = cls_session->dd_data;
> -	spin_lock_bh(&session->frwd_lock);
> -
> -	reason = iscsi_session_chkready(cls_session);
> -	if (reason) {
> -		sc->result = reason;
> -		goto fault;
> -	}
> +	struct iscsi_session *session = conn->session;
> +	struct iscsi_tm *tmf;
>  
>  	if (session->state != ISCSI_STATE_LOGGED_IN) {
>  		/*
> @@ -1707,31 +1689,92 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
>  			 * state is bad, allowing completion to happen
>  			 */
>  			if (unlikely(system_state != SYSTEM_RUNNING)) {
> -				reason = FAILURE_SESSION_FAILED;
> +				*reason = FAILURE_SESSION_FAILED;
>  				sc->result = DID_NO_CONNECT << 16;
>  				break;
>  			}
>  			fallthrough;
>  		case ISCSI_STATE_IN_RECOVERY:
> -			reason = FAILURE_SESSION_IN_RECOVERY;
> +			*reason = FAILURE_SESSION_IN_RECOVERY;
>  			sc->result = DID_IMM_RETRY << 16;
>  			break;
>  		case ISCSI_STATE_LOGGING_OUT:
> -			reason = FAILURE_SESSION_LOGGING_OUT;
> +			*reason = FAILURE_SESSION_LOGGING_OUT;
>  			sc->result = DID_IMM_RETRY << 16;
>  			break;
>  		case ISCSI_STATE_RECOVERY_FAILED:
> -			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
> +			*reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
>  			sc->result = DID_TRANSPORT_FAILFAST << 16;
>  			break;
>  		case ISCSI_STATE_TERMINATE:
> -			reason = FAILURE_SESSION_TERMINATE;
> +			*reason = FAILURE_SESSION_TERMINATE;
>  			sc->result = DID_NO_CONNECT << 16;
>  			break;
>  		default:
> -			reason = FAILURE_SESSION_FREED;
> +			*reason = FAILURE_SESSION_FREED;
>  			sc->result = DID_NO_CONNECT << 16;
>  		}
> +		goto eh_running;
> +	}
> +
> +	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
> +		*reason = FAILURE_SESSION_IN_RECOVERY;
> +		sc->result = DID_REQUEUE << 16;
> +		goto eh_running;
> +	}
> +
> +	/*
> +	 * For sg resets make sure libiscsi, the drivers and their fw see the
> +	 * same cmds. Once we get a TMF that can affect multiple cmds stop
> +	 * queueing.
> +	 */
> +	if (conn->tmf_state != TMF_INITIAL) {
> +		tmf = &conn->tmhdr;
> +
> +		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
> +		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
> +			if (sc->device->lun != scsilun_to_int(&tmf->lun))
> +				break;
> +
> +			ISCSI_DBG_EH(conn->session,
> +				     "Requeue cmd sent during LU RESET processing.\n");
> +			sc->result = DID_REQUEUE << 16;
> +			goto eh_running;
> +		case ISCSI_TM_FUNC_TARGET_WARM_RESET:
> +			ISCSI_DBG_EH(conn->session,
> +				     "Requeue cmd sent during TARGET RESET processing.\n");
> +			sc->result = DID_REQUEUE << 16;
> +			goto eh_running;
> +		}

Is it common to have no default case? I thought the compiler disliked
that. If the compiler and convention are fine with this, I'm fine with
it too.

> +	}
> +
> +	return false;
> +
> +eh_running:
> +	return true;
> +}
> +
> +int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
> +{
> +	struct iscsi_cls_session *cls_session;
> +	struct iscsi_host *ihost;
> +	int reason = 0;
> +	struct iscsi_session *session;
> +	struct iscsi_conn *conn;
> +	struct iscsi_task *task = NULL;
> +
> +	sc->result = 0;
> +	sc->SCp.ptr = NULL;
> +
> +	ihost = shost_priv(host);
> +
> +	cls_session = starget_to_session(scsi_target(sc->device));
> +	session = cls_session->dd_data;
> +	spin_lock_bh(&session->frwd_lock);
> +
> +	reason = iscsi_session_chkready(cls_session);
> +	if (reason) {
> +		sc->result = reason;
>  		goto fault;
>  	}
>  
> @@ -1742,11 +1785,8 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
>  		goto fault;
>  	}
>  
> -	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
> -		reason = FAILURE_SESSION_IN_RECOVERY;
> -		sc->result = DID_REQUEUE << 16;
> +	if (iscsi_eh_running(conn, sc, &reason))
>  		goto fault;
> -	}
>  
>  	if (iscsi_check_cmdsn_window_closed(conn)) {
>  		reason = FAILURE_WINDOW_CLOSED;
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup
  2021-04-17 17:22   ` Lee Duncan
@ 2021-04-17 17:26     ` Mike Christie
  0 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-17 17:26 UTC (permalink / raw)
  To: Lee Duncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/17/21 12:22 PM, Lee Duncan wrote:
> On 4/15/21 7:04 PM, Mike Christie wrote:
>> If we are handling a sg io reset there is a small window where cmds can
>> sneak into iscsi_queuecommand even though libiscsi has sent a TMF to the
>> driver. This does seems to not be a problem for normal drivers because they
>                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> "doesn't seem to be a problem"?

Yeah.

> 
>> know exactly what was sent to the FW. For libiscsi this is a problem
>> because it knows what it has sent to the driver but not what the driver
>> sent to the FW. When we go to cleanup the running tasks, libiscsi might
>> cleanup the sneaky cmd but the driver/FQ may not have seen the sneaky cmd
> 
> FO?
> 

FW.

>> and it's left running in FW.
>>
>> This has libiscsi just stop queueing cmds when it sends the TMF down to the
>> driver. Both then know they see the same set of cmds.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>  drivers/scsi/libiscsi.c | 104 +++++++++++++++++++++++++++-------------
>>  1 file changed, 72 insertions(+), 32 deletions(-)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index 4834219497ee..aa5ceaffc697 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -1669,29 +1669,11 @@ enum {
>>  	FAILURE_SESSION_NOT_READY,
>>  };
>>  
>> -int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
>> +static bool iscsi_eh_running(struct iscsi_conn *conn, struct scsi_cmnd *sc,
>> +			     int *reason)
>>  {
>> -	struct iscsi_cls_session *cls_session;
>> -	struct iscsi_host *ihost;
>> -	int reason = 0;
>> -	struct iscsi_session *session;
>> -	struct iscsi_conn *conn;
>> -	struct iscsi_task *task = NULL;
>> -
>> -	sc->result = 0;
>> -	sc->SCp.ptr = NULL;
>> -
>> -	ihost = shost_priv(host);
>> -
>> -	cls_session = starget_to_session(scsi_target(sc->device));
>> -	session = cls_session->dd_data;
>> -	spin_lock_bh(&session->frwd_lock);
>> -
>> -	reason = iscsi_session_chkready(cls_session);
>> -	if (reason) {
>> -		sc->result = reason;
>> -		goto fault;
>> -	}
>> +	struct iscsi_session *session = conn->session;
>> +	struct iscsi_tm *tmf;
>>  
>>  	if (session->state != ISCSI_STATE_LOGGED_IN) {
>>  		/*
>> @@ -1707,31 +1689,92 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc)
>>  			 * state is bad, allowing completion to happen
>>  			 */
>>  			if (unlikely(system_state != SYSTEM_RUNNING)) {
>> -				reason = FAILURE_SESSION_FAILED;
>> +				*reason = FAILURE_SESSION_FAILED;
>>  				sc->result = DID_NO_CONNECT << 16;
>>  				break;
>>  			}
>>  			fallthrough;
>>  		case ISCSI_STATE_IN_RECOVERY:
>> -			reason = FAILURE_SESSION_IN_RECOVERY;
>> +			*reason = FAILURE_SESSION_IN_RECOVERY;
>>  			sc->result = DID_IMM_RETRY << 16;
>>  			break;
>>  		case ISCSI_STATE_LOGGING_OUT:
>> -			reason = FAILURE_SESSION_LOGGING_OUT;
>> +			*reason = FAILURE_SESSION_LOGGING_OUT;
>>  			sc->result = DID_IMM_RETRY << 16;
>>  			break;
>>  		case ISCSI_STATE_RECOVERY_FAILED:
>> -			reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
>> +			*reason = FAILURE_SESSION_RECOVERY_TIMEOUT;
>>  			sc->result = DID_TRANSPORT_FAILFAST << 16;
>>  			break;
>>  		case ISCSI_STATE_TERMINATE:
>> -			reason = FAILURE_SESSION_TERMINATE;
>> +			*reason = FAILURE_SESSION_TERMINATE;
>>  			sc->result = DID_NO_CONNECT << 16;
>>  			break;
>>  		default:
>> -			reason = FAILURE_SESSION_FREED;
>> +			*reason = FAILURE_SESSION_FREED;
>>  			sc->result = DID_NO_CONNECT << 16;
>>  		}
>> +		goto eh_running;
>> +	}
>> +
>> +	if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
>> +		*reason = FAILURE_SESSION_IN_RECOVERY;
>> +		sc->result = DID_REQUEUE << 16;
>> +		goto eh_running;
>> +	}
>> +
>> +	/*
>> +	 * For sg resets make sure libiscsi, the drivers and their fw see the
>> +	 * same cmds. Once we get a TMF that can affect multiple cmds stop
>> +	 * queueing.
>> +	 */
>> +	if (conn->tmf_state != TMF_INITIAL) {
>> +		tmf = &conn->tmhdr;
>> +
>> +		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
>> +		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
>> +			if (sc->device->lun != scsilun_to_int(&tmf->lun))
>> +				break;
>> +
>> +			ISCSI_DBG_EH(conn->session,
>> +				     "Requeue cmd sent during LU RESET processing.\n");
>> +			sc->result = DID_REQUEUE << 16;
>> +			goto eh_running;
>> +		case ISCSI_TM_FUNC_TARGET_WARM_RESET:
>> +			ISCSI_DBG_EH(conn->session,
>> +				     "Requeue cmd sent during TARGET RESET processing.\n");
>> +			sc->result = DID_REQUEUE << 16;
>> +			goto eh_running;
>> +		}
> 
> Is it common to have no default case? I thought the compiler disliked
> that. If the compiler and convention are fine with this, I'm fine with
> it too.
> 

There is no compiler warnings or errors.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 03/17] scsi: iscsi: stop queueing during ep_disconnect
  2021-04-16  2:04 ` [PATCH v3 03/17] scsi: iscsi: stop queueing during ep_disconnect Mike Christie
@ 2021-04-20 14:28   ` Lee Duncan
  0 siblings, 0 replies; 31+ messages in thread
From: Lee Duncan @ 2021-04-20 14:28 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> During ep_disconnect we have been doing iscsi_suspend_tx/queue to block
> new IO but every driver except cxgbi and iscsi_tcp can still get IO from
> __iscsi_conn_send_pdu if we haven't called iscsi_conn_failure before
> ep_disconnect. This could happen if we were terminating the session, and
> the logout timedout before it was even sent to libiscsi.
> 
> This patch fixes the issue by adding a helper which reverses the bind_conn
> call that allows new IO to be queued. Drivers implementing ep_disconnect
> can use this to make sure new IO is not queued to them when handling the
> disconnect.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/infiniband/ulp/iser/iscsi_iser.c |  1 +
>  drivers/scsi/be2iscsi/be_main.c          |  1 +
>  drivers/scsi/bnx2i/bnx2i_iscsi.c         |  1 +
>  drivers/scsi/cxgbi/cxgb3i/cxgb3i.c       |  1 +
>  drivers/scsi/cxgbi/cxgb4i/cxgb4i.c       |  1 +
>  drivers/scsi/libiscsi.c                  | 61 +++++++++++++++++++++---
>  drivers/scsi/qedi/qedi_iscsi.c           |  1 +
>  drivers/scsi/qla4xxx/ql4_os.c            |  1 +
>  drivers/scsi/scsi_transport_iscsi.c      |  3 ++
>  include/scsi/libiscsi.h                  |  1 +
>  include/scsi/scsi_transport_iscsi.h      |  1 +
>  11 files changed, 67 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
> index 8fcaa1136f2c..6baebcb6d14d 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> @@ -1002,6 +1002,7 @@ static struct iscsi_transport iscsi_iser_transport = {
>  	/* connection management */
>  	.create_conn            = iscsi_iser_conn_create,
>  	.bind_conn              = iscsi_iser_conn_bind,
> +	.unbind_conn		= iscsi_conn_unbind,
>  	.destroy_conn           = iscsi_conn_teardown,
>  	.attr_is_visible	= iser_attr_is_visible,
>  	.set_param              = iscsi_iser_set_param,
> diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
> index 90fcddb76f46..e9658a67d9da 100644
> --- a/drivers/scsi/be2iscsi/be_main.c
> +++ b/drivers/scsi/be2iscsi/be_main.c
> @@ -5809,6 +5809,7 @@ struct iscsi_transport beiscsi_iscsi_transport = {
>  	.destroy_session = beiscsi_session_destroy,
>  	.create_conn = beiscsi_conn_create,
>  	.bind_conn = beiscsi_conn_bind,
> +	.unbind_conn = iscsi_conn_unbind,
>  	.destroy_conn = iscsi_conn_teardown,
>  	.attr_is_visible = beiscsi_attr_is_visible,
>  	.set_iface_param = beiscsi_iface_set_param,
> diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
> index 1e6d8f62ea3c..b6c1da46d582 100644
> --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
> +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
> @@ -2276,6 +2276,7 @@ struct iscsi_transport bnx2i_iscsi_transport = {
>  	.destroy_session	= bnx2i_session_destroy,
>  	.create_conn		= bnx2i_conn_create,
>  	.bind_conn		= bnx2i_conn_bind,
> +	.unbind_conn		= iscsi_conn_unbind,
>  	.destroy_conn		= bnx2i_conn_destroy,
>  	.attr_is_visible	= bnx2i_attr_is_visible,
>  	.set_param		= iscsi_set_param,
> diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
> index 37d99357120f..edcd3fab6973 100644
> --- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
> +++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
> @@ -117,6 +117,7 @@ static struct iscsi_transport cxgb3i_iscsi_transport = {
>  	/* connection management */
>  	.create_conn	= cxgbi_create_conn,
>  	.bind_conn	= cxgbi_bind_conn,
> +	.unbind_conn	= iscsi_conn_unbind,
>  	.destroy_conn	= iscsi_tcp_conn_teardown,
>  	.start_conn	= iscsi_conn_start,
>  	.stop_conn	= iscsi_conn_stop,
> diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
> index 2c3491528d42..efb3e2b3398e 100644
> --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
> +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
> @@ -134,6 +134,7 @@ static struct iscsi_transport cxgb4i_iscsi_transport = {
>  	/* connection management */
>  	.create_conn	= cxgbi_create_conn,
>  	.bind_conn		= cxgbi_bind_conn,
> +	.unbind_conn	= iscsi_conn_unbind,
>  	.destroy_conn	= iscsi_tcp_conn_teardown,
>  	.start_conn		= iscsi_conn_start,
>  	.stop_conn		= iscsi_conn_stop,
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index aa5ceaffc697..ce3898fdb10f 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1387,22 +1387,28 @@ void iscsi_session_failure(struct iscsi_session *session,
>  }
>  EXPORT_SYMBOL_GPL(iscsi_session_failure);
>  
> -void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
> +static void iscsi_set_conn_failed(struct iscsi_conn *conn)
>  {
>  	struct iscsi_session *session = conn->session;
>  
> -	spin_lock_bh(&session->frwd_lock);
> -	if (session->state == ISCSI_STATE_FAILED) {
> -		spin_unlock_bh(&session->frwd_lock);
> +	if (session->state == ISCSI_STATE_FAILED)
>  		return;
> -	}
>  
>  	if (conn->stop_stage == 0)
>  		session->state = ISCSI_STATE_FAILED;
> -	spin_unlock_bh(&session->frwd_lock);
>  
>  	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx);
>  	set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx);
> +}
> +
> +void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err)
> +{
> +	struct iscsi_session *session = conn->session;
> +
> +	spin_lock_bh(&session->frwd_lock);
> +	iscsi_set_conn_failed(conn);
> +	spin_unlock_bh(&session->frwd_lock);
> +
>  	iscsi_conn_error_event(conn->cls_conn, err);
>  }
>  EXPORT_SYMBOL_GPL(iscsi_conn_failure);
> @@ -2220,6 +2226,49 @@ static void iscsi_check_transport_timeouts(struct timer_list *t)
>  	spin_unlock(&session->frwd_lock);
>  }
>  
> +/*
> + * iscsi_conn_unbind - prevent queueing to conn.
> + * @conn: iscsi conn ep is bound to.
> + *
> + * This must be called by drivers implementing the ep_disconnect callout.
> + * It disables queueing to the connection from libiscsi in preparation for
> + * an ep_disconnect call.
> + */
> +void iscsi_conn_unbind(struct iscsi_cls_conn *cls_conn)
> +{
> +	struct iscsi_session *session;
> +	struct iscsi_conn *conn;
> +
> +	if (!cls_conn)
> +		return;
> +
> +	conn = cls_conn->dd_data;
> +	session = conn->session;
> +	/*
> +	 * Wait for iscsi_eh calls to exit. We don't wait for the tmf to
> +	 * complete or timeout. The caller just wants to know what's running
> +	 * is everything that needs to be cleaned up, and no cmds will be
> +	 * queued.
> +	 */
> +	mutex_lock(&session->eh_mutex);
> +
> +	iscsi_suspend_queue(conn);
> +	iscsi_suspend_tx(conn);
> +
> +	spin_lock_bh(&session->frwd_lock);
> +	/*
> +	 * if logout timed out before userspace could even send a PDU the
> +	 * state might still be in ISCSI_STATE_LOGGED_IN and allowing new cmds
> +	 * and TMFs.
> +	 */
> +	if (session->state == ISCSI_STATE_LOGGED_IN)
> +		iscsi_set_conn_failed(conn);
> +
> +	spin_unlock_bh(&session->frwd_lock);
> +	mutex_unlock(&session->eh_mutex);
> +}
> +EXPORT_SYMBOL_GPL(iscsi_conn_unbind);
> +
>  static void iscsi_prep_abort_task_pdu(struct iscsi_task *task,
>  				      struct iscsi_tm *hdr)
>  {
> diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
> index 08c05403cd72..ef16537c523c 100644
> --- a/drivers/scsi/qedi/qedi_iscsi.c
> +++ b/drivers/scsi/qedi/qedi_iscsi.c
> @@ -1401,6 +1401,7 @@ struct iscsi_transport qedi_iscsi_transport = {
>  	.destroy_session = qedi_session_destroy,
>  	.create_conn = qedi_conn_create,
>  	.bind_conn = qedi_conn_bind,
> +	.unbind_conn = iscsi_conn_unbind,
>  	.start_conn = qedi_conn_start,
>  	.stop_conn = iscsi_conn_stop,
>  	.destroy_conn = qedi_conn_destroy,
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 7bd9a4a04ad5..ff663cb330c2 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -259,6 +259,7 @@ static struct iscsi_transport qla4xxx_iscsi_transport = {
>  	.start_conn             = qla4xxx_conn_start,
>  	.create_conn            = qla4xxx_conn_create,
>  	.bind_conn              = qla4xxx_conn_bind,
> +	.unbind_conn		= iscsi_conn_unbind,
>  	.stop_conn              = iscsi_conn_stop,
>  	.destroy_conn           = qla4xxx_conn_destroy,
>  	.set_param              = iscsi_set_param,
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 441f0152193f..833114c8e197 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -2981,6 +2981,8 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport *transport,
>  		conn->ep = NULL;
>  		mutex_unlock(&conn->ep_mutex);
>  		conn->state = ISCSI_CONN_FAILED;
> +
> +		transport->unbind_conn(conn);
>  	}
>  
>  	transport->ep_disconnect(ep);
> @@ -4656,6 +4658,7 @@ iscsi_register_transport(struct iscsi_transport *tt)
>  	int err;
>  
>  	BUG_ON(!tt);
> +	WARN_ON(tt->ep_disconnect && !tt->unbind_conn);
>  
>  	priv = iscsi_if_transport_lookup(tt);
>  	if (priv)
> diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
> index 8c6d358a8abc..ec6d508e7a4a 100644
> --- a/include/scsi/libiscsi.h
> +++ b/include/scsi/libiscsi.h
> @@ -431,6 +431,7 @@ extern int iscsi_conn_start(struct iscsi_cls_conn *);
>  extern void iscsi_conn_stop(struct iscsi_cls_conn *, int);
>  extern int iscsi_conn_bind(struct iscsi_cls_session *, struct iscsi_cls_conn *,
>  			   int);
> +extern void iscsi_conn_unbind(struct iscsi_cls_conn *cls_conn);
>  extern void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err);
>  extern void iscsi_session_failure(struct iscsi_session *session,
>  				  enum iscsi_err err);
> diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
> index fc5a39839b4b..afc61a23628d 100644
> --- a/include/scsi/scsi_transport_iscsi.h
> +++ b/include/scsi/scsi_transport_iscsi.h
> @@ -82,6 +82,7 @@ struct iscsi_transport {
>  	void (*destroy_session) (struct iscsi_cls_session *session);
>  	struct iscsi_cls_conn *(*create_conn) (struct iscsi_cls_session *sess,
>  				uint32_t cid);
> +	void (*unbind_conn) (struct iscsi_cls_conn *conn);
>  	int (*bind_conn) (struct iscsi_cls_session *session,
>  			  struct iscsi_cls_conn *cls_conn,
>  			  uint64_t transport_eph, int is_leading);
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 04/17] scsi: iscsi: drop suspend calls from ep_disconnect
  2021-04-16  2:04 ` [PATCH v3 04/17] scsi: iscsi: drop suspend calls from ep_disconnect Mike Christie
@ 2021-04-20 14:29   ` Lee Duncan
  0 siblings, 0 replies; 31+ messages in thread
From: Lee Duncan @ 2021-04-20 14:29 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> libiscsi will now suspend the send/tx queue for the drivers so we can drop
> it from the drivers ep_disconnect.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/be2iscsi/be_iscsi.c | 6 ------
>  drivers/scsi/bnx2i/bnx2i_iscsi.c | 6 +-----
>  drivers/scsi/cxgbi/libcxgbi.c    | 1 -
>  drivers/scsi/qedi/qedi_iscsi.c   | 8 ++++----
>  4 files changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/be2iscsi/be_iscsi.c b/drivers/scsi/be2iscsi/be_iscsi.c
> index a13c203ef7a9..a03d0ebc2312 100644
> --- a/drivers/scsi/be2iscsi/be_iscsi.c
> +++ b/drivers/scsi/be2iscsi/be_iscsi.c
> @@ -1293,7 +1293,6 @@ static int beiscsi_conn_close(struct beiscsi_endpoint *beiscsi_ep)
>  void beiscsi_ep_disconnect(struct iscsi_endpoint *ep)
>  {
>  	struct beiscsi_endpoint *beiscsi_ep;
> -	struct beiscsi_conn *beiscsi_conn;
>  	struct beiscsi_hba *phba;
>  	uint16_t cri_index;
>  
> @@ -1312,11 +1311,6 @@ void beiscsi_ep_disconnect(struct iscsi_endpoint *ep)
>  		return;
>  	}
>  
> -	if (beiscsi_ep->conn) {
> -		beiscsi_conn = beiscsi_ep->conn;
> -		iscsi_suspend_queue(beiscsi_conn->conn);
> -	}
> -
>  	if (!beiscsi_hba_is_online(phba)) {
>  		beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_CONFIG,
>  			    "BS_%d : HBA in error 0x%lx\n", phba->state);
> diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
> index b6c1da46d582..9a4f4776a78a 100644
> --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
> +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
> @@ -2113,7 +2113,6 @@ static void bnx2i_ep_disconnect(struct iscsi_endpoint *ep)
>  {
>  	struct bnx2i_endpoint *bnx2i_ep;
>  	struct bnx2i_conn *bnx2i_conn = NULL;
> -	struct iscsi_conn *conn = NULL;
>  	struct bnx2i_hba *hba;
>  
>  	bnx2i_ep = ep->dd_data;
> @@ -2126,11 +2125,8 @@ static void bnx2i_ep_disconnect(struct iscsi_endpoint *ep)
>  		!time_after(jiffies, bnx2i_ep->timestamp + (12 * HZ)))
>  		msleep(250);
>  
> -	if (bnx2i_ep->conn) {
> +	if (bnx2i_ep->conn)
>  		bnx2i_conn = bnx2i_ep->conn;
> -		conn = bnx2i_conn->cls_conn->dd_data;
> -		iscsi_suspend_queue(conn);
> -	}
>  	hba = bnx2i_ep->hba;
>  
>  	mutex_lock(&hba->net_dev_lock);
> diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
> index f078b3c4e083..215dd0eb3f48 100644
> --- a/drivers/scsi/cxgbi/libcxgbi.c
> +++ b/drivers/scsi/cxgbi/libcxgbi.c
> @@ -2968,7 +2968,6 @@ void cxgbi_ep_disconnect(struct iscsi_endpoint *ep)
>  		ep, cep, cconn, csk, csk->state, csk->flags);
>  
>  	if (cconn && cconn->iconn) {
> -		iscsi_suspend_tx(cconn->iconn);
>  		write_lock_bh(&csk->callback_lock);
>  		cep->csk->user_data = NULL;
>  		cconn->cep = NULL;
> diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c
> index ef16537c523c..30dc345b011c 100644
> --- a/drivers/scsi/qedi/qedi_iscsi.c
> +++ b/drivers/scsi/qedi/qedi_iscsi.c
> @@ -988,7 +988,6 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
>  {
>  	struct qedi_endpoint *qedi_ep;
>  	struct qedi_conn *qedi_conn = NULL;
> -	struct iscsi_conn *conn = NULL;
>  	struct qedi_ctx *qedi;
>  	int ret = 0;
>  	int wait_delay;
> @@ -1007,8 +1006,6 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep)
>  
>  	if (qedi_ep->conn) {
>  		qedi_conn = qedi_ep->conn;
> -		conn = qedi_conn->cls_conn->dd_data;
> -		iscsi_suspend_queue(conn);
>  		abrt_conn = qedi_conn->abrt_conn;
>  
>  		while (count--)	{
> @@ -1621,8 +1618,11 @@ void qedi_clear_session_ctx(struct iscsi_cls_session *cls_sess)
>  	struct iscsi_conn *conn = session->leadconn;
>  	struct qedi_conn *qedi_conn = conn->dd_data;
>  
> -	if (iscsi_is_session_online(cls_sess))
> +	if (iscsi_is_session_online(cls_sess)) {
> +		if (conn)
> +			iscsi_suspend_queue(conn);
>  		qedi_ep_disconnect(qedi_conn->iscsi_ep);
> +	}
>  
>  	qedi_conn_destroy(qedi_conn->cls_conn);
>  
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn
  2021-04-16  2:04 ` [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn Mike Christie
@ 2021-04-22 15:02   ` Lee Duncan
  2021-04-22 20:09     ` Mike Christie
  0 siblings, 1 reply; 31+ messages in thread
From: Lee Duncan @ 2021-04-22 15:02 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> If we haven't done an unbind target call, we can race during conn
> destruction where iscsi_conn_teardown wakes up the eh/abort thread and its
> still accessing a task while iscsi_conn_teardown is freeing the conn. This
> patch has us wait for all threads to drop their refs to outstanding tasks
> during conn destruction.
> 
> There is also an issue where we could be accessing the conn directly via
> fields like conn->ehwait in the eh callbacks. The next patch will fix
> those.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index ce3898fdb10f..ce6d04035c64 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -3120,6 +3120,24 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
>  }
>  EXPORT_SYMBOL_GPL(iscsi_conn_setup);
>  
> +static bool iscsi_session_has_tasks(struct iscsi_session *session)
> +{
> +	struct iscsi_task *task;
> +	int i;
> +
> +	spin_lock_bh(&session->back_lock);
> +	for (i = 0; i < session->cmds_max; i++) {
> +		task = session->cmds[i];
> +
> +		if (task->sc) {
> +			spin_unlock_bh(&session->back_lock);
> +			return true;
> +		}
> +	}
> +	spin_unlock_bh(&session->back_lock);
> +	return false;
> +}
> +
>  /**
>   * iscsi_conn_teardown - teardown iscsi connection
>   * @cls_conn: iscsi class connection
> @@ -3144,7 +3162,17 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
>  		session->state = ISCSI_STATE_TERMINATE;
>  		wake_up(&conn->ehwait);
>  	}
> +
>  	spin_unlock_bh(&session->frwd_lock);
> +	mutex_unlock(&session->eh_mutex);
> +	/*
> +	 * If the caller didn't do a target unbind we could be exiting a
> +	 * scsi-ml entry point that had a task ref. Wait on them here.
> +	 */
> +	while (iscsi_session_has_tasks(session))
> +		msleep(50);

Is there a limit on the amount of time this might spin?

> +
> +	mutex_lock(&session->eh_mutex);
>  
>  	/* flush queued up work because we free the connection below */
>  	iscsi_suspend_tx(conn);
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn
  2021-04-22 15:02   ` Lee Duncan
@ 2021-04-22 20:09     ` Mike Christie
  0 siblings, 0 replies; 31+ messages in thread
From: Mike Christie @ 2021-04-22 20:09 UTC (permalink / raw)
  To: Lee Duncan, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/22/21 10:02 AM, Lee Duncan wrote:
> On 4/15/21 7:04 PM, Mike Christie wrote:
>> If we haven't done an unbind target call, we can race during conn
>> destruction where iscsi_conn_teardown wakes up the eh/abort thread and its
>> still accessing a task while iscsi_conn_teardown is freeing the conn. This
>> patch has us wait for all threads to drop their refs to outstanding tasks
>> during conn destruction.
>>
>> There is also an issue where we could be accessing the conn directly via
>> fields like conn->ehwait in the eh callbacks. The next patch will fix
>> those.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>> ---
>>  drivers/scsi/libiscsi.c | 28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index ce3898fdb10f..ce6d04035c64 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -3120,6 +3120,24 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
>>  }
>>  EXPORT_SYMBOL_GPL(iscsi_conn_setup);
>>  
>> +static bool iscsi_session_has_tasks(struct iscsi_session *session)
>> +{
>> +	struct iscsi_task *task;
>> +	int i;
>> +
>> +	spin_lock_bh(&session->back_lock);
>> +	for (i = 0; i < session->cmds_max; i++) {
>> +		task = session->cmds[i];
>> +
>> +		if (task->sc) {
>> +			spin_unlock_bh(&session->back_lock);
>> +			return true;
>> +		}
>> +	}
>> +	spin_unlock_bh(&session->back_lock);
>> +	return false;
>> +}
>> +
>>  /**
>>   * iscsi_conn_teardown - teardown iscsi connection
>>   * @cls_conn: iscsi class connection
>> @@ -3144,7 +3162,17 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
>>  		session->state = ISCSI_STATE_TERMINATE;
>>  		wake_up(&conn->ehwait);
>>  	}
>> +
>>  	spin_unlock_bh(&session->frwd_lock);
>> +	mutex_unlock(&session->eh_mutex);
>> +	/*
>> +	 * If the caller didn't do a target unbind we could be exiting a
>> +	 * scsi-ml entry point that had a task ref. Wait on them here.
>> +	 */
>> +	while (iscsi_session_has_tasks(session))
>> +		msleep(50);
> 
> Is there a limit on the amount of time this might spin?

No.

Are you asking because you think there should be one or because you just wanted
to check?

It's really like the patch description says and is a quick wait.

If you want to add a limit, we can return and leak mem and maybe crash because
we've left some objects running and setup. Maybe just get a warning when we remove
the session. Or we can let the crash happen.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 06/17] scsi: iscsi: fix use conn use after free
  2021-04-16  2:04 ` [PATCH v3 06/17] scsi: iscsi: fix use conn use after free Mike Christie
@ 2021-04-24 21:11   ` Lee Duncan
  0 siblings, 0 replies; 31+ messages in thread
From: Lee Duncan @ 2021-04-24 21:11 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> If we haven't done a unbind target call we can race where
> iscsi_conn_teardown wakes up the eh/abort thread and then frees the conn
> while those threads are still accessing the conn ehwait.
> 
> We can only do one TMF per session so this just moves the TMF fields from
> the conn to the session. We can then rely on the
> iscsi_session_teardown->iscsi_remove_session->__iscsi_unbind_session call
> to remove the target and it's devices, and know after that point there is
> no device or scsi-ml callout trying to access the session.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 123 +++++++++++++++++++---------------------
>  include/scsi/libiscsi.h |  11 ++--
>  2 files changed, 64 insertions(+), 70 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index ce6d04035c64..56b41d8fff02 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -230,11 +230,11 @@ static int iscsi_prep_ecdb_ahs(struct iscsi_task *task)
>   */
>  static int iscsi_check_tmf_restrictions(struct iscsi_task *task, int opcode)
>  {
> -	struct iscsi_conn *conn = task->conn;
> -	struct iscsi_tm *tmf = &conn->tmhdr;
> +	struct iscsi_session *session = task->conn->session;
> +	struct iscsi_tm *tmf = &session->tmhdr;
>  	u64 hdr_lun;
>  
> -	if (conn->tmf_state == TMF_INITIAL)
> +	if (session->tmf_state == TMF_INITIAL)
>  		return 0;
>  
>  	if ((tmf->opcode & ISCSI_OPCODE_MASK) != ISCSI_OP_SCSI_TMFUNC)
> @@ -254,24 +254,19 @@ static int iscsi_check_tmf_restrictions(struct iscsi_task *task, int opcode)
>  		 * Fail all SCSI cmd PDUs
>  		 */
>  		if (opcode != ISCSI_OP_SCSI_DATA_OUT) {
> -			iscsi_conn_printk(KERN_INFO, conn,
> -					  "task [op %x itt "
> -					  "0x%x/0x%x] "
> -					  "rejected.\n",
> -					  opcode, task->itt,
> -					  task->hdr_itt);
> +			iscsi_session_printk(KERN_INFO, session,
> +					"task [op %x itt 0x%x/0x%x] rejected.\n",
> +					opcode, task->itt, task->hdr_itt);
>  			return -EACCES;
>  		}
>  		/*
>  		 * And also all data-out PDUs in response to R2T
>  		 * if fast_abort is set.
>  		 */
> -		if (conn->session->fast_abort) {
> -			iscsi_conn_printk(KERN_INFO, conn,
> -					  "task [op %x itt "
> -					  "0x%x/0x%x] fast abort.\n",
> -					  opcode, task->itt,
> -					  task->hdr_itt);
> +		if (session->fast_abort) {
> +			iscsi_session_printk(KERN_INFO, session,
> +					"task [op %x itt 0x%x/0x%x] fast abort.\n",
> +					opcode, task->itt, task->hdr_itt);
>  			return -EACCES;
>  		}
>  		break;
> @@ -284,7 +279,7 @@ static int iscsi_check_tmf_restrictions(struct iscsi_task *task, int opcode)
>  		 */
>  		if (opcode == ISCSI_OP_SCSI_DATA_OUT &&
>  		    task->hdr_itt == tmf->rtt) {
> -			ISCSI_DBG_SESSION(conn->session,
> +			ISCSI_DBG_SESSION(session,
>  					  "Preventing task %x/%x from sending "
>  					  "data-out due to abort task in "
>  					  "progress\n", task->itt,
> @@ -936,20 +931,21 @@ iscsi_data_in_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>  static void iscsi_tmf_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr)
>  {
>  	struct iscsi_tm_rsp *tmf = (struct iscsi_tm_rsp *)hdr;
> +	struct iscsi_session *session = conn->session;
>  
>  	conn->exp_statsn = be32_to_cpu(hdr->statsn) + 1;
>  	conn->tmfrsp_pdus_cnt++;
>  
> -	if (conn->tmf_state != TMF_QUEUED)
> +	if (session->tmf_state != TMF_QUEUED)
>  		return;
>  
>  	if (tmf->response == ISCSI_TMF_RSP_COMPLETE)
> -		conn->tmf_state = TMF_SUCCESS;
> +		session->tmf_state = TMF_SUCCESS;
>  	else if (tmf->response == ISCSI_TMF_RSP_NO_TASK)
> -		conn->tmf_state = TMF_NOT_FOUND;
> +		session->tmf_state = TMF_NOT_FOUND;
>  	else
> -		conn->tmf_state = TMF_FAILED;
> -	wake_up(&conn->ehwait);
> +		session->tmf_state = TMF_FAILED;
> +	wake_up(&session->ehwait);
>  }
>  
>  static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
> @@ -1734,20 +1730,20 @@ static bool iscsi_eh_running(struct iscsi_conn *conn, struct scsi_cmnd *sc,
>  	 * same cmds. Once we get a TMF that can affect multiple cmds stop
>  	 * queueing.
>  	 */
> -	if (conn->tmf_state != TMF_INITIAL) {
> -		tmf = &conn->tmhdr;
> +	if (session->tmf_state != TMF_INITIAL) {
> +		tmf = &session->tmhdr;
>  
>  		switch (ISCSI_TM_FUNC_VALUE(tmf)) {
>  		case ISCSI_TM_FUNC_LOGICAL_UNIT_RESET:
>  			if (sc->device->lun != scsilun_to_int(&tmf->lun))
>  				break;
>  
> -			ISCSI_DBG_EH(conn->session,
> +			ISCSI_DBG_EH(session,
>  				     "Requeue cmd sent during LU RESET processing.\n");
>  			sc->result = DID_REQUEUE << 16;
>  			goto eh_running;
>  		case ISCSI_TM_FUNC_TARGET_WARM_RESET:
> -			ISCSI_DBG_EH(conn->session,
> +			ISCSI_DBG_EH(session,
>  				     "Requeue cmd sent during TARGET RESET processing.\n");
>  			sc->result = DID_REQUEUE << 16;
>  			goto eh_running;
> @@ -1866,15 +1862,14 @@ EXPORT_SYMBOL_GPL(iscsi_target_alloc);
>  
>  static void iscsi_tmf_timedout(struct timer_list *t)
>  {
> -	struct iscsi_conn *conn = from_timer(conn, t, tmf_timer);
> -	struct iscsi_session *session = conn->session;
> +	struct iscsi_session *session = from_timer(session, t, tmf_timer);
>  
>  	spin_lock(&session->frwd_lock);
> -	if (conn->tmf_state == TMF_QUEUED) {
> -		conn->tmf_state = TMF_TIMEDOUT;
> +	if (session->tmf_state == TMF_QUEUED) {
> +		session->tmf_state = TMF_TIMEDOUT;
>  		ISCSI_DBG_EH(session, "tmf timedout\n");
>  		/* unblock eh_abort() */
> -		wake_up(&conn->ehwait);
> +		wake_up(&session->ehwait);
>  	}
>  	spin_unlock(&session->frwd_lock);
>  }
> @@ -1897,8 +1892,8 @@ static int iscsi_exec_task_mgmt_fn(struct iscsi_conn *conn,
>  		return -EPERM;
>  	}
>  	conn->tmfcmd_pdus_cnt++;
> -	conn->tmf_timer.expires = timeout * HZ + jiffies;
> -	add_timer(&conn->tmf_timer);
> +	session->tmf_timer.expires = timeout * HZ + jiffies;
> +	add_timer(&session->tmf_timer);
>  	ISCSI_DBG_EH(session, "tmf set timeout\n");
>  
>  	spin_unlock_bh(&session->frwd_lock);
> @@ -1912,12 +1907,12 @@ static int iscsi_exec_task_mgmt_fn(struct iscsi_conn *conn,
>  	 * 3) session is terminated or restarted or userspace has
>  	 * given up on recovery
>  	 */
> -	wait_event_interruptible(conn->ehwait, age != session->age ||
> +	wait_event_interruptible(session->ehwait, age != session->age ||
>  				 session->state != ISCSI_STATE_LOGGED_IN ||
> -				 conn->tmf_state != TMF_QUEUED);
> +				 session->tmf_state != TMF_QUEUED);
>  	if (signal_pending(current))
>  		flush_signals(current);
> -	del_timer_sync(&conn->tmf_timer);
> +	del_timer_sync(&session->tmf_timer);
>  
>  	mutex_lock(&session->eh_mutex);
>  	spin_lock_bh(&session->frwd_lock);
> @@ -2347,17 +2342,17 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
>  	}
>  
>  	/* only have one tmf outstanding at a time */
> -	if (conn->tmf_state != TMF_INITIAL)
> +	if (session->tmf_state != TMF_INITIAL)
>  		goto failed;
> -	conn->tmf_state = TMF_QUEUED;
> +	session->tmf_state = TMF_QUEUED;
>  
> -	hdr = &conn->tmhdr;
> +	hdr = &session->tmhdr;
>  	iscsi_prep_abort_task_pdu(task, hdr);
>  
>  	if (iscsi_exec_task_mgmt_fn(conn, hdr, age, session->abort_timeout))
>  		goto failed;
>  
> -	switch (conn->tmf_state) {
> +	switch (session->tmf_state) {
>  	case TMF_SUCCESS:
>  		spin_unlock_bh(&session->frwd_lock);
>  		/*
> @@ -2372,7 +2367,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
>  		 */
>  		spin_lock_bh(&session->frwd_lock);
>  		fail_scsi_task(task, DID_ABORT);
> -		conn->tmf_state = TMF_INITIAL;
> +		session->tmf_state = TMF_INITIAL;
>  		memset(hdr, 0, sizeof(*hdr));
>  		spin_unlock_bh(&session->frwd_lock);
>  		iscsi_start_tx(conn);
> @@ -2383,7 +2378,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
>  		goto failed_unlocked;
>  	case TMF_NOT_FOUND:
>  		if (!sc->SCp.ptr) {
> -			conn->tmf_state = TMF_INITIAL;
> +			session->tmf_state = TMF_INITIAL;
>  			memset(hdr, 0, sizeof(*hdr));
>  			/* task completed before tmf abort response */
>  			ISCSI_DBG_EH(session, "sc completed while abort	in "
> @@ -2392,7 +2387,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc)
>  		}
>  		fallthrough;
>  	default:
> -		conn->tmf_state = TMF_INITIAL;
> +		session->tmf_state = TMF_INITIAL;
>  		goto failed;
>  	}
>  
> @@ -2451,11 +2446,11 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
>  	conn = session->leadconn;
>  
>  	/* only have one tmf outstanding at a time */
> -	if (conn->tmf_state != TMF_INITIAL)
> +	if (session->tmf_state != TMF_INITIAL)
>  		goto unlock;
> -	conn->tmf_state = TMF_QUEUED;
> +	session->tmf_state = TMF_QUEUED;
>  
> -	hdr = &conn->tmhdr;
> +	hdr = &session->tmhdr;
>  	iscsi_prep_lun_reset_pdu(sc, hdr);
>  
>  	if (iscsi_exec_task_mgmt_fn(conn, hdr, session->age,
> @@ -2464,7 +2459,7 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
>  		goto unlock;
>  	}
>  
> -	switch (conn->tmf_state) {
> +	switch (session->tmf_state) {
>  	case TMF_SUCCESS:
>  		break;
>  	case TMF_TIMEDOUT:
> @@ -2472,7 +2467,7 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
>  		iscsi_conn_failure(conn, ISCSI_ERR_SCSI_EH_SESSION_RST);
>  		goto done;
>  	default:
> -		conn->tmf_state = TMF_INITIAL;
> +		session->tmf_state = TMF_INITIAL;
>  		goto unlock;
>  	}
>  
> @@ -2484,7 +2479,7 @@ int iscsi_eh_device_reset(struct scsi_cmnd *sc)
>  	spin_lock_bh(&session->frwd_lock);
>  	memset(hdr, 0, sizeof(*hdr));
>  	fail_scsi_tasks(conn, sc->device->lun, DID_ERROR);
> -	conn->tmf_state = TMF_INITIAL;
> +	session->tmf_state = TMF_INITIAL;
>  	spin_unlock_bh(&session->frwd_lock);
>  
>  	iscsi_start_tx(conn);
> @@ -2507,8 +2502,7 @@ void iscsi_session_recovery_timedout(struct iscsi_cls_session *cls_session)
>  	spin_lock_bh(&session->frwd_lock);
>  	if (session->state != ISCSI_STATE_LOGGED_IN) {
>  		session->state = ISCSI_STATE_RECOVERY_FAILED;
> -		if (session->leadconn)
> -			wake_up(&session->leadconn->ehwait);
> +		wake_up(&session->ehwait);
>  	}
>  	spin_unlock_bh(&session->frwd_lock);
>  }
> @@ -2553,7 +2547,7 @@ int iscsi_eh_session_reset(struct scsi_cmnd *sc)
>  	iscsi_conn_failure(conn, ISCSI_ERR_SCSI_EH_SESSION_RST);
>  
>  	ISCSI_DBG_EH(session, "wait for relogin\n");
> -	wait_event_interruptible(conn->ehwait,
> +	wait_event_interruptible(session->ehwait,
>  				 session->state == ISCSI_STATE_TERMINATE ||
>  				 session->state == ISCSI_STATE_LOGGED_IN ||
>  				 session->state == ISCSI_STATE_RECOVERY_FAILED);
> @@ -2614,11 +2608,11 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
>  	conn = session->leadconn;
>  
>  	/* only have one tmf outstanding at a time */
> -	if (conn->tmf_state != TMF_INITIAL)
> +	if (session->tmf_state != TMF_INITIAL)
>  		goto unlock;
> -	conn->tmf_state = TMF_QUEUED;
> +	session->tmf_state = TMF_QUEUED;
>  
> -	hdr = &conn->tmhdr;
> +	hdr = &session->tmhdr;
>  	iscsi_prep_tgt_reset_pdu(sc, hdr);
>  
>  	if (iscsi_exec_task_mgmt_fn(conn, hdr, session->age,
> @@ -2627,7 +2621,7 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
>  		goto unlock;
>  	}
>  
> -	switch (conn->tmf_state) {
> +	switch (session->tmf_state) {
>  	case TMF_SUCCESS:
>  		break;
>  	case TMF_TIMEDOUT:
> @@ -2635,7 +2629,7 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
>  		iscsi_conn_failure(conn, ISCSI_ERR_SCSI_EH_SESSION_RST);
>  		goto done;
>  	default:
> -		conn->tmf_state = TMF_INITIAL;
> +		session->tmf_state = TMF_INITIAL;
>  		goto unlock;
>  	}
>  
> @@ -2647,7 +2641,7 @@ static int iscsi_eh_target_reset(struct scsi_cmnd *sc)
>  	spin_lock_bh(&session->frwd_lock);
>  	memset(hdr, 0, sizeof(*hdr));
>  	fail_scsi_tasks(conn, -1, DID_ERROR);
> -	conn->tmf_state = TMF_INITIAL;
> +	session->tmf_state = TMF_INITIAL;
>  	spin_unlock_bh(&session->frwd_lock);
>  
>  	iscsi_start_tx(conn);
> @@ -2977,7 +2971,10 @@ iscsi_session_setup(struct iscsi_transport *iscsit, struct Scsi_Host *shost,
>  	session->tt = iscsit;
>  	session->dd_data = cls_session->dd_data + sizeof(*session);
>  
> +	session->tmf_state = TMF_INITIAL;
> +	timer_setup(&session->tmf_timer, iscsi_tmf_timedout, 0);
>  	mutex_init(&session->eh_mutex);
> +
>  	spin_lock_init(&session->frwd_lock);
>  	spin_lock_init(&session->back_lock);
>  
> @@ -3081,7 +3078,6 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
>  	conn->c_stage = ISCSI_CONN_INITIAL_STAGE;
>  	conn->id = conn_idx;
>  	conn->exp_statsn = 0;
> -	conn->tmf_state = TMF_INITIAL;
>  
>  	timer_setup(&conn->transport_timer, iscsi_check_transport_timeouts, 0);
>  
> @@ -3106,8 +3102,7 @@ iscsi_conn_setup(struct iscsi_cls_session *cls_session, int dd_size,
>  		goto login_task_data_alloc_fail;
>  	conn->login_task->data = conn->data = data;
>  
> -	timer_setup(&conn->tmf_timer, iscsi_tmf_timedout, 0);
> -	init_waitqueue_head(&conn->ehwait);
> +	init_waitqueue_head(&session->ehwait);
>  
>  	return cls_conn;
>  
> @@ -3160,7 +3155,7 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
>  		 * leading connection? then give up on recovery.
>  		 */
>  		session->state = ISCSI_STATE_TERMINATE;
> -		wake_up(&conn->ehwait);
> +		wake_up(&session->ehwait);
>  	}
>  
>  	spin_unlock_bh(&session->frwd_lock);
> @@ -3245,7 +3240,7 @@ int iscsi_conn_start(struct iscsi_cls_conn *cls_conn)
>  		 * commands after successful recovery
>  		 */
>  		conn->stop_stage = 0;
> -		conn->tmf_state = TMF_INITIAL;
> +		session->tmf_state = TMF_INITIAL;
>  		session->age++;
>  		if (session->age == 16)
>  			session->age = 0;
> @@ -3259,7 +3254,7 @@ int iscsi_conn_start(struct iscsi_cls_conn *cls_conn)
>  	spin_unlock_bh(&session->frwd_lock);
>  
>  	iscsi_unblock_session(session->cls_session);
> -	wake_up(&conn->ehwait);
> +	wake_up(&session->ehwait);
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(iscsi_conn_start);
> @@ -3353,7 +3348,7 @@ void iscsi_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
>  	spin_lock_bh(&session->frwd_lock);
>  	fail_scsi_tasks(conn, -1, DID_TRANSPORT_DISRUPTED);
>  	fail_mgmt_tasks(session, conn);
> -	memset(&conn->tmhdr, 0, sizeof(conn->tmhdr));
> +	memset(&session->tmhdr, 0, sizeof(session->tmhdr));
>  	spin_unlock_bh(&session->frwd_lock);
>  	mutex_unlock(&session->eh_mutex);
>  }
> diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
> index ec6d508e7a4a..545dfefffe9b 100644
> --- a/include/scsi/libiscsi.h
> +++ b/include/scsi/libiscsi.h
> @@ -202,12 +202,6 @@ struct iscsi_conn {
>  	unsigned long		suspend_tx;	/* suspend Tx */
>  	unsigned long		suspend_rx;	/* suspend Rx */
>  
> -	/* abort */
> -	wait_queue_head_t	ehwait;		/* used in eh_abort() */
> -	struct iscsi_tm		tmhdr;
> -	struct timer_list	tmf_timer;
> -	int			tmf_state;	/* see TMF_INITIAL, etc.*/
> -
>  	/* negotiated params */
>  	unsigned		max_recv_dlength; /* initiator_max_recv_dsl*/
>  	unsigned		max_xmit_dlength; /* target_max_recv_dsl */
> @@ -277,6 +271,11 @@ struct iscsi_session {
>  	 * and recv lock.
>  	 */
>  	struct mutex		eh_mutex;
> +	/* abort */
> +	wait_queue_head_t	ehwait;		/* used in eh_abort() */
> +	struct iscsi_tm		tmhdr;
> +	struct timer_list	tmf_timer;
> +	int			tmf_state;	/* see TMF_INITIAL, etc.*/
>  
>  	/* iSCSI session-wide sequencing */
>  	uint32_t		cmdsn;
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 07/17] scsi: iscsi: move pool freeing
  2021-04-16  2:04 ` [PATCH v3 07/17] scsi: iscsi: move pool freeing Mike Christie
@ 2021-04-24 21:12   ` Lee Duncan
  0 siblings, 0 replies; 31+ messages in thread
From: Lee Duncan @ 2021-04-24 21:12 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, mrangankar, svernekar, linux-scsi, jejb

On 4/15/21 7:04 PM, Mike Christie wrote:
> This doesn't fix any bugs, but it makes more sense to free the pool after
> we have removed the session. At that time we know nothing is touching any
> of the session fields, because all devices have been removed and scans are
> stopped.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 56b41d8fff02..b2970054558a 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -3025,10 +3025,9 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
>  	struct module *owner = cls_session->transport->owner;
>  	struct Scsi_Host *shost = session->host;
>  
> -	iscsi_pool_free(&session->cmdpool);
> -
>  	iscsi_remove_session(cls_session);
>  
> +	iscsi_pool_free(&session->cmdpool);
>  	kfree(session->password);
>  	kfree(session->password_in);
>  	kfree(session->username);
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-04-24 21:12 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-16  2:04 [PATCH v3 00/17] libicsi and qedi TMF fixes Mike Christie
2021-04-16  2:04 ` [PATCH v3 01/17] scsi: iscsi: add task completion helper Mike Christie
2021-04-16 21:56   ` Lee Duncan
2021-04-16  2:04 ` [PATCH v3 02/17] scsi: iscsi: sync libiscsi and driver reset cleanup Mike Christie
2021-04-17 17:22   ` Lee Duncan
2021-04-17 17:26     ` Mike Christie
2021-04-16  2:04 ` [PATCH v3 03/17] scsi: iscsi: stop queueing during ep_disconnect Mike Christie
2021-04-20 14:28   ` Lee Duncan
2021-04-16  2:04 ` [PATCH v3 04/17] scsi: iscsi: drop suspend calls from ep_disconnect Mike Christie
2021-04-20 14:29   ` Lee Duncan
2021-04-16  2:04 ` [PATCH v3 05/17] scsi: iscsi: wait on cmds before freeing conn Mike Christie
2021-04-22 15:02   ` Lee Duncan
2021-04-22 20:09     ` Mike Christie
2021-04-16  2:04 ` [PATCH v3 06/17] scsi: iscsi: fix use conn use after free Mike Christie
2021-04-24 21:11   ` Lee Duncan
2021-04-16  2:04 ` [PATCH v3 07/17] scsi: iscsi: move pool freeing Mike Christie
2021-04-24 21:12   ` Lee Duncan
2021-04-16  2:04 ` [PATCH v3 08/17] scsi: qedi: fix null ref during abort handling Mike Christie
2021-04-16  2:04 ` [PATCH v3 09/17] scsi: qedi: fix race during abort timeouts Mike Christie
2021-04-16 11:39   ` kernel test robot
2021-04-16 11:39     ` kernel test robot
2021-04-16 15:23     ` michael.christie
2021-04-16 15:23       ` michael.christie
2021-04-16  2:04 ` [PATCH v3 10/17] scsi: qedi: fix use after free during abort cleanup Mike Christie
2021-04-16  2:04 ` [PATCH v3 11/17] scsi: qedi: fix TMF tid allocation Mike Christie
2021-04-16  2:04 ` [PATCH v3 12/17] scsi: qedi: use GFP_NOIO for tmf allocation Mike Christie
2021-04-16  2:04 ` [PATCH v3 13/17] scsi: qedi: fix TMF session block/unblock use Mike Christie
2021-04-16  2:04 ` [PATCH v3 14/17] scsi: qedi: fix cleanup " Mike Christie
2021-04-16  2:04 ` [PATCH v3 15/17] scsi: qedi: pass send_iscsi_tmf task to abort Mike Christie
2021-04-16  2:04 ` [PATCH v3 16/17] scsi: qedi: complete TMF works before disconnect Mike Christie
2021-04-16  2:04 ` [PATCH v3 17/17] scsi: qedi: always wake up if cmd_cleanup_req is set Mike Christie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.