All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] iscsi: Speed up failover with lots of devices.
@ 2022-02-26 23:04 Mike Christie
  2022-02-26 23:04 ` [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race Mike Christie
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521

In:

https://lore.kernel.org/all/CAK3e-EZbJMDHkozGiz8LnMNAZ+SoCA+QeK0kpkqM4vQ4pz86SQ@mail.gmail.com/t/ 

Zhengyuan Liu found an issue where failovers are taking a long time
with lots of devices (/dev/sdXYZ nodes). The problem is that iscsid
expects most nl operations to be fast (ignoring mem issues) and when
the session block code was written blocking a queue/scsi_device was
just setting some flag bits and state values more or less. Now a block
call will actually handle IO that has been sent to the driver, so it
can be expensive. When you add in more and more devices, then a
session block call will take longer and longer.

This patchset moves the recovery and unbind operations to a per
session work queue instead of the mix or per session, host and module.




^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
@ 2022-02-26 23:04 ` Mike Christie
  2022-02-27 19:49   ` Lee Duncan
  2022-02-28 20:06   ` Chris Leech
  2022-02-26 23:04 ` [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal Mike Christie
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521
  Cc: Mike Christie

If the user sets the iscsi_eh_timer_workq/iscsi_eh workqueue's max_active
to greater than 1, the recovery_work could be running when
__iscsi_unblock_session runs. The cancel_delayed_work will then not wait
for the running work and we can race where we end up with the wrong
session state and scsi_device state set.

This replaces the cancel_delayed_work with the sync version.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_transport_iscsi.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 554b6f784223..c58126e8cd88 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1917,11 +1917,8 @@ static void __iscsi_unblock_session(struct work_struct *work)
 	unsigned long flags;
 
 	ISCSI_DBG_TRANS_SESSION(session, "Unblocking session\n");
-	/*
-	 * The recovery and unblock work get run from the same workqueue,
-	 * so try to cancel it if it was going to run after this unblock.
-	 */
-	cancel_delayed_work(&session->recovery_work);
+
+	cancel_delayed_work_sync(&session->recovery_work);
 	spin_lock_irqsave(&session->lock, flags);
 	session->state = ISCSI_SESSION_LOGGED_IN;
 	spin_unlock_irqrestore(&session->lock, flags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
  2022-02-26 23:04 ` [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race Mike Christie
@ 2022-02-26 23:04 ` Mike Christie
  2022-02-28 16:05   ` Lee Duncan
  2022-02-28 20:06   ` Chris Leech
  2022-02-26 23:04 ` [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished Mike Christie
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521
  Cc: Mike Christie

When the iscsi class was added upstream blocking a queue was fast because
it just set some flag bits and didn't handle IO that was in the process
of being sent to the driver. That's no longer the case so blocking a queue
is expensive and we can end up with a backlog of blocks by the time we
have relogged in and are trying to start the queues.

For the session unblock case, this has try to cancel the block and
recovery work in case they are still queued so we can avoid unneeded queue
manipulations. For removal we also now try to cancel all the recovery
related works since a couple lines down we will set the session and device
state so running those functions are not necessary.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_transport_iscsi.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index c58126e8cd88..732938f5436b 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1944,7 +1944,8 @@ static void __iscsi_unblock_session(struct work_struct *work)
  */
 void iscsi_unblock_session(struct iscsi_cls_session *session)
 {
-	flush_work(&session->block_work);
+	if (!cancel_work_sync(&session->block_work))
+		cancel_delayed_work_sync(&session->recovery_work);
 
 	queue_work(iscsi_eh_timer_workq, &session->unblock_work);
 	/*
@@ -2177,9 +2178,9 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
 		list_del(&session->sess_list);
 	spin_unlock_irqrestore(&sesslock, flags);
 
-	flush_work(&session->block_work);
-	flush_work(&session->unblock_work);
-	cancel_delayed_work_sync(&session->recovery_work);
+	if (!cancel_work_sync(&session->block_work))
+		cancel_delayed_work_sync(&session->recovery_work);
+	cancel_work_sync(&session->unblock_work);
 	/*
 	 * If we are blocked let commands flow again. The lld or iscsi
 	 * layer should set up the queuecommand to fail commands.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
  2022-02-26 23:04 ` [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race Mike Christie
  2022-02-26 23:04 ` [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal Mike Christie
@ 2022-02-26 23:04 ` Mike Christie
  2022-02-28 18:05   ` Lee Duncan
  2022-02-28 20:07   ` Chris Leech
  2022-02-26 23:04 ` [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding Mike Christie
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521
  Cc: Mike Christie

qla4xxx does not use iscsi_scan_finished anymore so remove it.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_transport_iscsi.c | 39 +++--------------------------
 include/scsi/scsi_transport_iscsi.h |  2 --
 2 files changed, 4 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 732938f5436b..05cd4bca979e 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1557,7 +1557,6 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
 	struct iscsi_cls_host *ihost = shost->shost_data;
 
 	memset(ihost, 0, sizeof(*ihost));
-	atomic_set(&ihost->nr_scans, 0);
 	mutex_init(&ihost->mutex);
 
 	iscsi_bsg_host_add(shost, ihost);
@@ -1744,25 +1743,6 @@ void iscsi_host_for_each_session(struct Scsi_Host *shost,
 }
 EXPORT_SYMBOL_GPL(iscsi_host_for_each_session);
 
-/**
- * iscsi_scan_finished - helper to report when running scans are done
- * @shost: scsi host
- * @time: scan run time
- *
- * This function can be used by drives like qla4xxx to report to the scsi
- * layer when the scans it kicked off at module load time are done.
- */
-int iscsi_scan_finished(struct Scsi_Host *shost, unsigned long time)
-{
-	struct iscsi_cls_host *ihost = shost->shost_data;
-	/*
-	 * qla4xxx will have kicked off some session unblocks before calling
-	 * scsi_scan_host, so just wait for them to complete.
-	 */
-	return !atomic_read(&ihost->nr_scans);
-}
-EXPORT_SYMBOL_GPL(iscsi_scan_finished);
-
 struct iscsi_scan_data {
 	unsigned int channel;
 	unsigned int id;
@@ -1831,8 +1811,6 @@ static void iscsi_scan_session(struct work_struct *work)
 {
 	struct iscsi_cls_session *session =
 			container_of(work, struct iscsi_cls_session, scan_work);
-	struct Scsi_Host *shost = iscsi_session_to_shost(session);
-	struct iscsi_cls_host *ihost = shost->shost_data;
 	struct iscsi_scan_data scan_data;
 
 	scan_data.channel = 0;
@@ -1841,7 +1819,6 @@ static void iscsi_scan_session(struct work_struct *work)
 	scan_data.rescan = SCSI_SCAN_RESCAN;
 
 	iscsi_user_scan_session(&session->dev, &scan_data);
-	atomic_dec(&ihost->nr_scans);
 }
 
 /**
@@ -1912,8 +1889,6 @@ static void __iscsi_unblock_session(struct work_struct *work)
 	struct iscsi_cls_session *session =
 			container_of(work, struct iscsi_cls_session,
 				     unblock_work);
-	struct Scsi_Host *shost = iscsi_session_to_shost(session);
-	struct iscsi_cls_host *ihost = shost->shost_data;
 	unsigned long flags;
 
 	ISCSI_DBG_TRANS_SESSION(session, "Unblocking session\n");
@@ -1924,15 +1899,6 @@ static void __iscsi_unblock_session(struct work_struct *work)
 	spin_unlock_irqrestore(&session->lock, flags);
 	/* start IO */
 	scsi_target_unblock(&session->dev, SDEV_RUNNING);
-	/*
-	 * Only do kernel scanning if the driver is properly hooked into
-	 * the async scanning code (drivers like iscsi_tcp do login and
-	 * scanning from userspace).
-	 */
-	if (shost->hostt->scan_finished) {
-		if (scsi_queue_work(shost, &session->scan_work))
-			atomic_inc(&ihost->nr_scans);
-	}
 	ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking session\n");
 }
 
@@ -2192,7 +2158,10 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
 	spin_unlock_irqrestore(&session->lock, flags);
 
 	scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE);
-	/* flush running scans then delete devices */
+	/*
+	 * qla4xxx can perform it's own scans when it runs in kernel only
+	 * mode. Make sure to flush those scans.
+	 */
 	flush_work(&session->scan_work);
 	/* flush running unbind operations */
 	flush_work(&session->unbind_work);
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index c5d7810fd792..90b55db46d7c 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -278,7 +278,6 @@ struct iscsi_cls_session {
 	iscsi_dev_to_session(_stgt->dev.parent)
 
 struct iscsi_cls_host {
-	atomic_t nr_scans;
 	struct mutex mutex;
 	struct request_queue *bsg_q;
 	uint32_t port_speed;
@@ -448,7 +447,6 @@ extern void iscsi_get_conn(struct iscsi_cls_conn *conn);
 extern int iscsi_destroy_conn(struct iscsi_cls_conn *conn);
 extern void iscsi_unblock_session(struct iscsi_cls_session *session);
 extern void iscsi_block_session(struct iscsi_cls_session *session);
-extern int iscsi_scan_finished(struct Scsi_Host *shost, unsigned long time);
 extern struct iscsi_endpoint *iscsi_create_endpoint(int dd_size);
 extern void iscsi_destroy_endpoint(struct iscsi_endpoint *ep);
 extern struct iscsi_endpoint *iscsi_lookup_endpoint(u64 handle);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (2 preceding siblings ...)
  2022-02-26 23:04 ` [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished Mike Christie
@ 2022-02-26 23:04 ` Mike Christie
  2022-02-28 18:19   ` Lee Duncan
  2022-02-28 20:07   ` Chris Leech
  2022-02-26 23:04 ` [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery Mike Christie
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521
  Cc: Mike Christie

We currently allocate a workqueue per host and only use it for removing
the target. For the session per host case we could be using this workqueue
to be able to do recoveries (block, unblock, timeout handling) in
parallel. To also allow offload drivers to do their session recoveries in
parallel, this drops the per host workqueue and replaces it with a per
session one.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/qla4xxx/ql4_os.c       |  2 +-
 drivers/scsi/scsi_transport_iscsi.c | 19 ++++++++++++++-----
 include/scsi/scsi_transport_iscsi.h |  2 ++
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 0ae936d839f1..955d8cb675f1 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -5096,7 +5096,7 @@ int qla4xxx_unblock_flash_ddb(struct iscsi_cls_session *cls_session)
 		ql4_printk(KERN_INFO, ha, "scsi%ld: %s: ddb[%d]"
 			   " start scan\n", ha->host_no, __func__,
 			   ddb_entry->fw_ddb_index);
-		scsi_queue_work(ha->host, &ddb_entry->sess->scan_work);
+		queue_work(ddb_entry->sess->workq, &ddb_entry->sess->scan_work);
 	}
 	return QLA_SUCCESS;
 }
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 05cd4bca979e..ecb592a70e03 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2032,19 +2032,27 @@ EXPORT_SYMBOL_GPL(iscsi_alloc_session);
 
 int iscsi_add_session(struct iscsi_cls_session *session, unsigned int target_id)
 {
+	struct Scsi_Host *shost = iscsi_session_to_shost(session);
 	unsigned long flags;
 	int id = 0;
 	int err;
 
 	session->sid = atomic_add_return(1, &iscsi_session_nr);
 
+	session->workq = alloc_workqueue("iscsi_ctrl_%d:%d",
+			WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
+			shost->host_no, session->sid);
+	if (!session->workq)
+		return -ENOMEM;
+
 	if (target_id == ISCSI_MAX_TARGET) {
 		id = ida_simple_get(&iscsi_sess_ida, 0, 0, GFP_KERNEL);
 
 		if (id < 0) {
 			iscsi_cls_session_printk(KERN_ERR, session,
 					"Failure in Target ID Allocation\n");
-			return id;
+			err = id;
+			goto destroy_wq;
 		}
 		session->target_id = (unsigned int)id;
 		session->ida_used = true;
@@ -2078,7 +2086,8 @@ int iscsi_add_session(struct iscsi_cls_session *session, unsigned int target_id)
 release_ida:
 	if (session->ida_used)
 		ida_simple_remove(&iscsi_sess_ida, session->target_id);
-
+destroy_wq:
+	destroy_workqueue(session->workq);
 	return err;
 }
 EXPORT_SYMBOL_GPL(iscsi_add_session);
@@ -2177,6 +2186,8 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
 
 	transport_unregister_device(&session->dev);
 
+	destroy_workqueue(session->workq);
+
 	ISCSI_DBG_TRANS_SESSION(session, "Completing session removal\n");
 	device_del(&session->dev);
 }
@@ -3833,8 +3844,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
 	case ISCSI_UEVENT_UNBIND_SESSION:
 		session = iscsi_session_lookup(ev->u.d_session.sid);
 		if (session)
-			scsi_queue_work(iscsi_session_to_shost(session),
-					&session->unbind_work);
+			queue_work(session->workq, &session->unbind_work);
 		else
 			err = -EINVAL;
 		break;
@@ -4707,7 +4717,6 @@ iscsi_register_transport(struct iscsi_transport *tt)
 	INIT_LIST_HEAD(&priv->list);
 	priv->iscsi_transport = tt;
 	priv->t.user_scan = iscsi_user_scan;
-	priv->t.create_work_queue = 1;
 
 	priv->dev.class = &iscsi_transport_class;
 	dev_set_name(&priv->dev, "%s", tt->name);
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 90b55db46d7c..7a0d24d3b916 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -251,6 +251,8 @@ struct iscsi_cls_session {
 	bool recovery_tmo_sysfs_override;
 	struct delayed_work recovery_work;
 
+	struct workqueue_struct *workq;
+
 	unsigned int target_id;
 	bool ida_used;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (3 preceding siblings ...)
  2022-02-26 23:04 ` [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding Mike Christie
@ 2022-02-26 23:04 ` Mike Christie
  2022-02-28 20:08   ` Chris Leech
  2022-02-28 20:09   ` Lee Duncan
  2022-02-26 23:04 ` [PATCH 6/6] scsi: iscsi: Drop temp workq_name Mike Christie
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521
  Cc: Mike Christie

Use the session workqueue for recovery and unbinding. If there are delays
during device blocking/cleanup then it will no longer affect other
sessions.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_transport_iscsi.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index ecb592a70e03..754277bec63a 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -87,7 +87,6 @@ struct iscsi_internal {
 };
 
 static atomic_t iscsi_session_nr; /* sysfs session id for next new session */
-static struct workqueue_struct *iscsi_eh_timer_workq;
 
 static struct workqueue_struct *iscsi_conn_cleanup_workq;
 
@@ -1913,7 +1912,7 @@ void iscsi_unblock_session(struct iscsi_cls_session *session)
 	if (!cancel_work_sync(&session->block_work))
 		cancel_delayed_work_sync(&session->recovery_work);
 
-	queue_work(iscsi_eh_timer_workq, &session->unblock_work);
+	queue_work(session->workq, &session->unblock_work);
 	/*
 	 * Blocking the session can be done from any context so we only
 	 * queue the block work. Make sure the unblock work has completed
@@ -1937,14 +1936,14 @@ static void __iscsi_block_session(struct work_struct *work)
 	scsi_target_block(&session->dev);
 	ISCSI_DBG_TRANS_SESSION(session, "Completed SCSI target blocking\n");
 	if (session->recovery_tmo >= 0)
-		queue_delayed_work(iscsi_eh_timer_workq,
+		queue_delayed_work(session->workq,
 				   &session->recovery_work,
 				   session->recovery_tmo * HZ);
 }
 
 void iscsi_block_session(struct iscsi_cls_session *session)
 {
-	queue_work(iscsi_eh_timer_workq, &session->block_work);
+	queue_work(session->workq, &session->block_work);
 }
 EXPORT_SYMBOL_GPL(iscsi_block_session);
 
@@ -4851,26 +4850,16 @@ static __init int iscsi_transport_init(void)
 		goto unregister_flashnode_bus;
 	}
 
-	iscsi_eh_timer_workq = alloc_workqueue("%s",
-			WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
-			1, "iscsi_eh");
-	if (!iscsi_eh_timer_workq) {
-		err = -ENOMEM;
-		goto release_nls;
-	}
-
 	iscsi_conn_cleanup_workq = alloc_workqueue("%s",
 			WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
 			"iscsi_conn_cleanup");
 	if (!iscsi_conn_cleanup_workq) {
 		err = -ENOMEM;
-		goto destroy_wq;
+		goto release_nls;
 	}
 
 	return 0;
 
-destroy_wq:
-	destroy_workqueue(iscsi_eh_timer_workq);
 release_nls:
 	netlink_kernel_release(nls);
 unregister_flashnode_bus:
@@ -4893,7 +4882,6 @@ static __init int iscsi_transport_init(void)
 static void __exit iscsi_transport_exit(void)
 {
 	destroy_workqueue(iscsi_conn_cleanup_workq);
-	destroy_workqueue(iscsi_eh_timer_workq);
 	netlink_kernel_release(nls);
 	bus_unregister(&iscsi_flashnode_bus);
 	transport_class_unregister(&iscsi_connection_class);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/6] scsi: iscsi: Drop temp workq_name.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (4 preceding siblings ...)
  2022-02-26 23:04 ` [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery Mike Christie
@ 2022-02-26 23:04 ` Mike Christie
  2022-02-28 20:08   ` Chris Leech
  2022-02-28 22:49   ` Lee Duncan
  2022-02-28 15:53 ` [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-26 23:04 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521
  Cc: Mike Christie

When the workqueue code was created it didn't allow variable args so we
have been using a temp buffer. Drop that.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/libiscsi.c | 6 ++----
 include/scsi/libiscsi.h | 1 -
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 059dae8909ee..a75b85f0a189 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2798,11 +2798,9 @@ struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
 	ihost = shost_priv(shost);
 
 	if (xmit_can_sleep) {
-		snprintf(ihost->workq_name, sizeof(ihost->workq_name),
-			"iscsi_q_%d", shost->host_no);
-		ihost->workq = alloc_workqueue("%s",
+		ihost->workq = alloc_workqueue("iscsi_q_%d",
 			WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
-			1, ihost->workq_name);
+			1, shost->host_no);
 		if (!ihost->workq)
 			goto free_host;
 	}
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 4ee233e5a6ff..2d85810d1929 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -371,7 +371,6 @@ struct iscsi_host {
 	int			state;
 
 	struct workqueue_struct	*workq;
-	char			workq_name[20];
 };
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race.
  2022-02-26 23:04 ` [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race Mike Christie
@ 2022-02-27 19:49   ` Lee Duncan
  2022-02-28 20:06   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Lee Duncan @ 2022-02-27 19:49 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/26/22 15:04, Mike Christie wrote:
> If the user sets the iscsi_eh_timer_workq/iscsi_eh workqueue's max_active
> to greater than 1, the recovery_work could be running when
> __iscsi_unblock_session runs. The cancel_delayed_work will then not wait
> for the running work and we can race where we end up with the wrong
> session state and scsi_device state set.
> 
> This replaces the cancel_delayed_work with the sync version.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/scsi_transport_iscsi.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 554b6f784223..c58126e8cd88 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1917,11 +1917,8 @@ static void __iscsi_unblock_session(struct work_struct *work)
>   	unsigned long flags;
>   
>   	ISCSI_DBG_TRANS_SESSION(session, "Unblocking session\n");
> -	/*
> -	 * The recovery and unblock work get run from the same workqueue,
> -	 * so try to cancel it if it was going to run after this unblock.
> -	 */
> -	cancel_delayed_work(&session->recovery_work);
> +
> +	cancel_delayed_work_sync(&session->recovery_work);
>   	spin_lock_irqsave(&session->lock, flags);
>   	session->state = ISCSI_SESSION_LOGGED_IN;
>   	spin_unlock_irqrestore(&session->lock, flags);

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/6] iscsi: Speed up failover with lots of devices.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (5 preceding siblings ...)
  2022-02-26 23:04 ` [PATCH 6/6] scsi: iscsi: Drop temp workq_name Mike Christie
@ 2022-02-28 15:53 ` Mike Christie
  2022-03-02  4:20 ` Martin K. Petersen
  2022-03-09  4:14 ` Martin K. Petersen
  8 siblings, 0 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-28 15:53 UTC (permalink / raw)
  To: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521

Hey Lee,
On 2/26/22 5:04 PM, Mike Christie wrote:
> In:
> 
> https://lore.kernel.org/all/CAK3e-EZbJMDHkozGiz8LnMNAZ+SoCA+QeK0kpkqM4vQ4pz86SQ@mail.gmail.com/t/ 
> 
> Zhengyuan Liu found an issue where failovers are taking a long time
> with lots of devices (/dev/sdXYZ nodes). The problem is that iscsid
> expects most nl operations to be fast (ignoring mem issues) and when
> the session block code was written blocking a queue/scsi_device was
> just setting some flag bits and state values more or less. Now a block
> call will actually handle IO that has been sent to the driver, so it
> can be expensive. When you add in more and more devices, then a
> session block call will take longer and longer.
> 
> This patchset moves the recovery and unbind operations to a per
> session work queue instead of the mix or per session, host and module.


If you get to the end of the patchset and wonder if there is a patch
missing, there is :)

I have one more patchset that is related to this, but not required for
to handle Zhengyuan Liu's issue. I think I can also kill
iscsi_conn_cleanup_workq and use the iscsi wq instead, but I want to
think about it some more and test it out. And since it's not needed
to handle the issue in the thread below, it should be ok to do
separately.

It might just be a simple kill iscsi_conn_cleanup_workq and use the
iscsi wq, or I might be able to go more invasive and drop some
code. I'm not sure yet.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal.
  2022-02-26 23:04 ` [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal Mike Christie
@ 2022-02-28 16:05   ` Lee Duncan
  2022-02-28 20:06   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Lee Duncan @ 2022-02-28 16:05 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/26/22 15:04, Mike Christie wrote:
> When the iscsi class was added upstream blocking a queue was fast because
> it just set some flag bits and didn't handle IO that was in the process
> of being sent to the driver. That's no longer the case so blocking a queue
> is expensive and we can end up with a backlog of blocks by the time we
> have relogged in and are trying to start the queues.
> 
> For the session unblock case, this has try to cancel the block and
> recovery work in case they are still queued so we can avoid unneeded queue
> manipulations. For removal we also now try to cancel all the recovery
> related works since a couple lines down we will set the session and device
> state so running those functions are not necessary.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/scsi_transport_iscsi.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index c58126e8cd88..732938f5436b 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1944,7 +1944,8 @@ static void __iscsi_unblock_session(struct work_struct *work)
>    */
>   void iscsi_unblock_session(struct iscsi_cls_session *session)
>   {
> -	flush_work(&session->block_work);
> +	if (!cancel_work_sync(&session->block_work))
> +		cancel_delayed_work_sync(&session->recovery_work);
>   
>   	queue_work(iscsi_eh_timer_workq, &session->unblock_work);
>   	/*
> @@ -2177,9 +2178,9 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
>   		list_del(&session->sess_list);
>   	spin_unlock_irqrestore(&sesslock, flags);
>   
> -	flush_work(&session->block_work);
> -	flush_work(&session->unblock_work);
> -	cancel_delayed_work_sync(&session->recovery_work);
> +	if (!cancel_work_sync(&session->block_work))
> +		cancel_delayed_work_sync(&session->recovery_work);
> +	cancel_work_sync(&session->unblock_work);
>   	/*
>   	 * If we are blocked let commands flow again. The lld or iscsi
>   	 * layer should set up the queuecommand to fail commands.

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished.
  2022-02-26 23:04 ` [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished Mike Christie
@ 2022-02-28 18:05   ` Lee Duncan
  2022-02-28 20:39     ` Mike Christie
  2022-02-28 20:07   ` Chris Leech
  1 sibling, 1 reply; 23+ messages in thread
From: Lee Duncan @ 2022-02-28 18:05 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/26/22 15:04, Mike Christie wrote:
> qla4xxx does not use iscsi_scan_finished anymore so remove it.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/scsi_transport_iscsi.c | 39 +++--------------------------
>   include/scsi/scsi_transport_iscsi.h |  2 --
>   2 files changed, 4 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 732938f5436b..05cd4bca979e 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1557,7 +1557,6 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
>   	struct iscsi_cls_host *ihost = shost->shost_data;
>   
>   	memset(ihost, 0, sizeof(*ihost));
> -	atomic_set(&ihost->nr_scans, 0);
>   	mutex_init(&ihost->mutex);
>   
>   	iscsi_bsg_host_add(shost, ihost);
> @@ -1744,25 +1743,6 @@ void iscsi_host_for_each_session(struct Scsi_Host *shost,
>   }
>   EXPORT_SYMBOL_GPL(iscsi_host_for_each_session);
>   
> -/**
> - * iscsi_scan_finished - helper to report when running scans are done
> - * @shost: scsi host
> - * @time: scan run time
> - *
> - * This function can be used by drives like qla4xxx to report to the scsi
> - * layer when the scans it kicked off at module load time are done.
> - */
> -int iscsi_scan_finished(struct Scsi_Host *shost, unsigned long time)
> -{
> -	struct iscsi_cls_host *ihost = shost->shost_data;
> -	/*
> -	 * qla4xxx will have kicked off some session unblocks before calling
> -	 * scsi_scan_host, so just wait for them to complete.
> -	 */
> -	return !atomic_read(&ihost->nr_scans);
> -}
> -EXPORT_SYMBOL_GPL(iscsi_scan_finished);
> -
>   struct iscsi_scan_data {
>   	unsigned int channel;
>   	unsigned int id;
> @@ -1831,8 +1811,6 @@ static void iscsi_scan_session(struct work_struct *work)
>   {
>   	struct iscsi_cls_session *session =
>   			container_of(work, struct iscsi_cls_session, scan_work);
> -	struct Scsi_Host *shost = iscsi_session_to_shost(session);
> -	struct iscsi_cls_host *ihost = shost->shost_data;
>   	struct iscsi_scan_data scan_data;
>   
>   	scan_data.channel = 0;
> @@ -1841,7 +1819,6 @@ static void iscsi_scan_session(struct work_struct *work)
>   	scan_data.rescan = SCSI_SCAN_RESCAN;
>   
>   	iscsi_user_scan_session(&session->dev, &scan_data);
> -	atomic_dec(&ihost->nr_scans);
>   }
>   
>   /**
> @@ -1912,8 +1889,6 @@ static void __iscsi_unblock_session(struct work_struct *work)
>   	struct iscsi_cls_session *session =
>   			container_of(work, struct iscsi_cls_session,
>   				     unblock_work);
> -	struct Scsi_Host *shost = iscsi_session_to_shost(session);
> -	struct iscsi_cls_host *ihost = shost->shost_data;
>   	unsigned long flags;
>   
>   	ISCSI_DBG_TRANS_SESSION(session, "Unblocking session\n");
> @@ -1924,15 +1899,6 @@ static void __iscsi_unblock_session(struct work_struct *work)
>   	spin_unlock_irqrestore(&session->lock, flags);
>   	/* start IO */
>   	scsi_target_unblock(&session->dev, SDEV_RUNNING);
> -	/*
> -	 * Only do kernel scanning if the driver is properly hooked into
> -	 * the async scanning code (drivers like iscsi_tcp do login and
> -	 * scanning from userspace).
> -	 */
> -	if (shost->hostt->scan_finished) {
> -		if (scsi_queue_work(shost, &session->scan_work))
> -			atomic_inc(&ihost->nr_scans);
> -	}
>   	ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking session\n");
>   }
>   
> @@ -2192,7 +2158,10 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
>   	spin_unlock_irqrestore(&session->lock, flags);
>   
>   	scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE);
> -	/* flush running scans then delete devices */
> +	/*
> +	 * qla4xxx can perform it's own scans when it runs in kernel only
> +	 * mode. Make sure to flush those scans.
> +	 */
>   	flush_work(&session->scan_work);
>   	/* flush running unbind operations */
>   	flush_work(&session->unbind_work);
> diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
> index c5d7810fd792..90b55db46d7c 100644
> --- a/include/scsi/scsi_transport_iscsi.h
> +++ b/include/scsi/scsi_transport_iscsi.h
> @@ -278,7 +278,6 @@ struct iscsi_cls_session {
>   	iscsi_dev_to_session(_stgt->dev.parent)
>   
>   struct iscsi_cls_host {
> -	atomic_t nr_scans;
>   	struct mutex mutex;
>   	struct request_queue *bsg_q;
>   	uint32_t port_speed;
> @@ -448,7 +447,6 @@ extern void iscsi_get_conn(struct iscsi_cls_conn *conn);
>   extern int iscsi_destroy_conn(struct iscsi_cls_conn *conn);
>   extern void iscsi_unblock_session(struct iscsi_cls_session *session);
>   extern void iscsi_block_session(struct iscsi_cls_session *session);
> -extern int iscsi_scan_finished(struct Scsi_Host *shost, unsigned long time);
>   extern struct iscsi_endpoint *iscsi_create_endpoint(int dd_size);
>   extern void iscsi_destroy_endpoint(struct iscsi_endpoint *ep);
>   extern struct iscsi_endpoint *iscsi_lookup_endpoint(u64 handle);

I have no issue with this, but it seems kind of unrelated to speeding up 
session unblocking ... Never the less:

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding.
  2022-02-26 23:04 ` [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding Mike Christie
@ 2022-02-28 18:19   ` Lee Duncan
  2022-02-28 20:07   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Lee Duncan @ 2022-02-28 18:19 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/26/22 15:04, Mike Christie wrote:
> We currently allocate a workqueue per host and only use it for removing
> the target. For the session per host case we could be using this workqueue
> to be able to do recoveries (block, unblock, timeout handling) in
> parallel. To also allow offload drivers to do their session recoveries in
> parallel, this drops the per host workqueue and replaces it with a per
> session one.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/qla4xxx/ql4_os.c       |  2 +-
>   drivers/scsi/scsi_transport_iscsi.c | 19 ++++++++++++++-----
>   include/scsi/scsi_transport_iscsi.h |  2 ++
>   3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 0ae936d839f1..955d8cb675f1 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -5096,7 +5096,7 @@ int qla4xxx_unblock_flash_ddb(struct iscsi_cls_session *cls_session)
>   		ql4_printk(KERN_INFO, ha, "scsi%ld: %s: ddb[%d]"
>   			   " start scan\n", ha->host_no, __func__,
>   			   ddb_entry->fw_ddb_index);
> -		scsi_queue_work(ha->host, &ddb_entry->sess->scan_work);
> +		queue_work(ddb_entry->sess->workq, &ddb_entry->sess->scan_work);
>   	}
>   	return QLA_SUCCESS;
>   }
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 05cd4bca979e..ecb592a70e03 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -2032,19 +2032,27 @@ EXPORT_SYMBOL_GPL(iscsi_alloc_session);
>   
>   int iscsi_add_session(struct iscsi_cls_session *session, unsigned int target_id)
>   {
> +	struct Scsi_Host *shost = iscsi_session_to_shost(session);
>   	unsigned long flags;
>   	int id = 0;
>   	int err;
>   
>   	session->sid = atomic_add_return(1, &iscsi_session_nr);
>   
> +	session->workq = alloc_workqueue("iscsi_ctrl_%d:%d",
> +			WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
> +			shost->host_no, session->sid);
> +	if (!session->workq)
> +		return -ENOMEM;
> +
>   	if (target_id == ISCSI_MAX_TARGET) {
>   		id = ida_simple_get(&iscsi_sess_ida, 0, 0, GFP_KERNEL);
>   
>   		if (id < 0) {
>   			iscsi_cls_session_printk(KERN_ERR, session,
>   					"Failure in Target ID Allocation\n");
> -			return id;
> +			err = id;
> +			goto destroy_wq;
>   		}
>   		session->target_id = (unsigned int)id;
>   		session->ida_used = true;
> @@ -2078,7 +2086,8 @@ int iscsi_add_session(struct iscsi_cls_session *session, unsigned int target_id)
>   release_ida:
>   	if (session->ida_used)
>   		ida_simple_remove(&iscsi_sess_ida, session->target_id);
> -
> +destroy_wq:
> +	destroy_workqueue(session->workq);
>   	return err;
>   }
>   EXPORT_SYMBOL_GPL(iscsi_add_session);
> @@ -2177,6 +2186,8 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
>   
>   	transport_unregister_device(&session->dev);
>   
> +	destroy_workqueue(session->workq);
> +
>   	ISCSI_DBG_TRANS_SESSION(session, "Completing session removal\n");
>   	device_del(&session->dev);
>   }
> @@ -3833,8 +3844,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
>   	case ISCSI_UEVENT_UNBIND_SESSION:
>   		session = iscsi_session_lookup(ev->u.d_session.sid);
>   		if (session)
> -			scsi_queue_work(iscsi_session_to_shost(session),
> -					&session->unbind_work);
> +			queue_work(session->workq, &session->unbind_work);
>   		else
>   			err = -EINVAL;
>   		break;
> @@ -4707,7 +4717,6 @@ iscsi_register_transport(struct iscsi_transport *tt)
>   	INIT_LIST_HEAD(&priv->list);
>   	priv->iscsi_transport = tt;
>   	priv->t.user_scan = iscsi_user_scan;
> -	priv->t.create_work_queue = 1;
>   
>   	priv->dev.class = &iscsi_transport_class;
>   	dev_set_name(&priv->dev, "%s", tt->name);
> diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
> index 90b55db46d7c..7a0d24d3b916 100644
> --- a/include/scsi/scsi_transport_iscsi.h
> +++ b/include/scsi/scsi_transport_iscsi.h
> @@ -251,6 +251,8 @@ struct iscsi_cls_session {
>   	bool recovery_tmo_sysfs_override;
>   	struct delayed_work recovery_work;
>   
> +	struct workqueue_struct *workq;
> +
>   	unsigned int target_id;
>   	bool ida_used;
>   

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race.
  2022-02-26 23:04 ` [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race Mike Christie
  2022-02-27 19:49   ` Lee Duncan
@ 2022-02-28 20:06   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Leech @ 2022-02-28 20:06 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, liuzhengyuang521

On 2/26/22 3:04 PM, Mike Christie wrote:
> If the user sets the iscsi_eh_timer_workq/iscsi_eh workqueue's max_active
> to greater than 1, the recovery_work could be running when
> __iscsi_unblock_session runs. The cancel_delayed_work will then not wait
> for the running work and we can race where we end up with the wrong
> session state and scsi_device state set.
> 
> This replaces the cancel_delayed_work with the sync version.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_transport_iscsi.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 554b6f784223..c58126e8cd88 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1917,11 +1917,8 @@ static void __iscsi_unblock_session(struct work_struct *work)
>  	unsigned long flags;
>  
>  	ISCSI_DBG_TRANS_SESSION(session, "Unblocking session\n");
> -	/*
> -	 * The recovery and unblock work get run from the same workqueue,
> -	 * so try to cancel it if it was going to run after this unblock.
> -	 */
> -	cancel_delayed_work(&session->recovery_work);
> +
> +	cancel_delayed_work_sync(&session->recovery_work);
>  	spin_lock_irqsave(&session->lock, flags);
>  	session->state = ISCSI_SESSION_LOGGED_IN;
>  	spin_unlock_irqrestore(&session->lock, flags);

Looks good to me,

Reviewed-by: Chris Leech <cleech@redhat.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal.
  2022-02-26 23:04 ` [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal Mike Christie
  2022-02-28 16:05   ` Lee Duncan
@ 2022-02-28 20:06   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Leech @ 2022-02-28 20:06 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, liuzhengyuang521

On 2/26/22 3:04 PM, Mike Christie wrote:
> When the iscsi class was added upstream blocking a queue was fast because
> it just set some flag bits and didn't handle IO that was in the process
> of being sent to the driver. That's no longer the case so blocking a queue
> is expensive and we can end up with a backlog of blocks by the time we
> have relogged in and are trying to start the queues.
> 
> For the session unblock case, this has try to cancel the block and
> recovery work in case they are still queued so we can avoid unneeded queue
> manipulations. For removal we also now try to cancel all the recovery
> related works since a couple lines down we will set the session and device
> state so running those functions are not necessary.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_transport_iscsi.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index c58126e8cd88..732938f5436b 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1944,7 +1944,8 @@ static void __iscsi_unblock_session(struct work_struct *work)
>   */
>  void iscsi_unblock_session(struct iscsi_cls_session *session)
>  {
> -	flush_work(&session->block_work);
> +	if (!cancel_work_sync(&session->block_work))
> +		cancel_delayed_work_sync(&session->recovery_work);
>  
>  	queue_work(iscsi_eh_timer_workq, &session->unblock_work);
>  	/*
> @@ -2177,9 +2178,9 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
>  		list_del(&session->sess_list);
>  	spin_unlock_irqrestore(&sesslock, flags);
>  
> -	flush_work(&session->block_work);
> -	flush_work(&session->unblock_work);
> -	cancel_delayed_work_sync(&session->recovery_work);
> +	if (!cancel_work_sync(&session->block_work))
> +		cancel_delayed_work_sync(&session->recovery_work);
> +	cancel_work_sync(&session->unblock_work);
>  	/*
>  	 * If we are blocked let commands flow again. The lld or iscsi
>  	 * layer should set up the queuecommand to fail commands.


Reviewed-by: Chris Leech <cleech@redhat.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished.
  2022-02-26 23:04 ` [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished Mike Christie
  2022-02-28 18:05   ` Lee Duncan
@ 2022-02-28 20:07   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Leech @ 2022-02-28 20:07 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, liuzhengyuang521

On 2/26/22 3:04 PM, Mike Christie wrote:
> qla4xxx does not use iscsi_scan_finished anymore so remove it.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_transport_iscsi.c | 39 +++--------------------------
>  include/scsi/scsi_transport_iscsi.h |  2 --
>  2 files changed, 4 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 732938f5436b..05cd4bca979e 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -1557,7 +1557,6 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
>  	struct iscsi_cls_host *ihost = shost->shost_data;
>  
>  	memset(ihost, 0, sizeof(*ihost));
> -	atomic_set(&ihost->nr_scans, 0);
>  	mutex_init(&ihost->mutex);
>  
>  	iscsi_bsg_host_add(shost, ihost);
> @@ -1744,25 +1743,6 @@ void iscsi_host_for_each_session(struct Scsi_Host *shost,
>  }
>  EXPORT_SYMBOL_GPL(iscsi_host_for_each_session);
>  
> -/**
> - * iscsi_scan_finished - helper to report when running scans are done
> - * @shost: scsi host
> - * @time: scan run time
> - *
> - * This function can be used by drives like qla4xxx to report to the scsi
> - * layer when the scans it kicked off at module load time are done.
> - */
> -int iscsi_scan_finished(struct Scsi_Host *shost, unsigned long time)
> -{
> -	struct iscsi_cls_host *ihost = shost->shost_data;
> -	/*
> -	 * qla4xxx will have kicked off some session unblocks before calling
> -	 * scsi_scan_host, so just wait for them to complete.
> -	 */
> -	return !atomic_read(&ihost->nr_scans);
> -}
> -EXPORT_SYMBOL_GPL(iscsi_scan_finished);
> -
>  struct iscsi_scan_data {
>  	unsigned int channel;
>  	unsigned int id;
> @@ -1831,8 +1811,6 @@ static void iscsi_scan_session(struct work_struct *work)
>  {
>  	struct iscsi_cls_session *session =
>  			container_of(work, struct iscsi_cls_session, scan_work);
> -	struct Scsi_Host *shost = iscsi_session_to_shost(session);
> -	struct iscsi_cls_host *ihost = shost->shost_data;
>  	struct iscsi_scan_data scan_data;
>  
>  	scan_data.channel = 0;
> @@ -1841,7 +1819,6 @@ static void iscsi_scan_session(struct work_struct *work)
>  	scan_data.rescan = SCSI_SCAN_RESCAN;
>  
>  	iscsi_user_scan_session(&session->dev, &scan_data);
> -	atomic_dec(&ihost->nr_scans);
>  }
>  
>  /**
> @@ -1912,8 +1889,6 @@ static void __iscsi_unblock_session(struct work_struct *work)
>  	struct iscsi_cls_session *session =
>  			container_of(work, struct iscsi_cls_session,
>  				     unblock_work);
> -	struct Scsi_Host *shost = iscsi_session_to_shost(session);
> -	struct iscsi_cls_host *ihost = shost->shost_data;
>  	unsigned long flags;
>  
>  	ISCSI_DBG_TRANS_SESSION(session, "Unblocking session\n");
> @@ -1924,15 +1899,6 @@ static void __iscsi_unblock_session(struct work_struct *work)
>  	spin_unlock_irqrestore(&session->lock, flags);
>  	/* start IO */
>  	scsi_target_unblock(&session->dev, SDEV_RUNNING);
> -	/*
> -	 * Only do kernel scanning if the driver is properly hooked into
> -	 * the async scanning code (drivers like iscsi_tcp do login and
> -	 * scanning from userspace).
> -	 */
> -	if (shost->hostt->scan_finished) {
> -		if (scsi_queue_work(shost, &session->scan_work))
> -			atomic_inc(&ihost->nr_scans);
> -	}
>  	ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking session\n");
>  }
>  
> @@ -2192,7 +2158,10 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
>  	spin_unlock_irqrestore(&session->lock, flags);
>  
>  	scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE);
> -	/* flush running scans then delete devices */
> +	/*
> +	 * qla4xxx can perform it's own scans when it runs in kernel only
> +	 * mode. Make sure to flush those scans.
> +	 */
>  	flush_work(&session->scan_work);
>  	/* flush running unbind operations */
>  	flush_work(&session->unbind_work);
> diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
> index c5d7810fd792..90b55db46d7c 100644
> --- a/include/scsi/scsi_transport_iscsi.h
> +++ b/include/scsi/scsi_transport_iscsi.h
> @@ -278,7 +278,6 @@ struct iscsi_cls_session {
>  	iscsi_dev_to_session(_stgt->dev.parent)
>  
>  struct iscsi_cls_host {
> -	atomic_t nr_scans;
>  	struct mutex mutex;
>  	struct request_queue *bsg_q;
>  	uint32_t port_speed;
> @@ -448,7 +447,6 @@ extern void iscsi_get_conn(struct iscsi_cls_conn *conn);
>  extern int iscsi_destroy_conn(struct iscsi_cls_conn *conn);
>  extern void iscsi_unblock_session(struct iscsi_cls_session *session);
>  extern void iscsi_block_session(struct iscsi_cls_session *session);
> -extern int iscsi_scan_finished(struct Scsi_Host *shost, unsigned long time);
>  extern struct iscsi_endpoint *iscsi_create_endpoint(int dd_size);
>  extern void iscsi_destroy_endpoint(struct iscsi_endpoint *ep);
>  extern struct iscsi_endpoint *iscsi_lookup_endpoint(u64 handle);

Reviewed-by: Chris Leech <cleech@redhat.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding.
  2022-02-26 23:04 ` [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding Mike Christie
  2022-02-28 18:19   ` Lee Duncan
@ 2022-02-28 20:07   ` Chris Leech
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Leech @ 2022-02-28 20:07 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, liuzhengyuang521

On 2/26/22 3:04 PM, Mike Christie wrote:
> We currently allocate a workqueue per host and only use it for removing
> the target. For the session per host case we could be using this workqueue
> to be able to do recoveries (block, unblock, timeout handling) in
> parallel. To also allow offload drivers to do their session recoveries in
> parallel, this drops the per host workqueue and replaces it with a per
> session one.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/qla4xxx/ql4_os.c       |  2 +-
>  drivers/scsi/scsi_transport_iscsi.c | 19 ++++++++++++++-----
>  include/scsi/scsi_transport_iscsi.h |  2 ++
>  3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
> index 0ae936d839f1..955d8cb675f1 100644
> --- a/drivers/scsi/qla4xxx/ql4_os.c
> +++ b/drivers/scsi/qla4xxx/ql4_os.c
> @@ -5096,7 +5096,7 @@ int qla4xxx_unblock_flash_ddb(struct iscsi_cls_session *cls_session)
>  		ql4_printk(KERN_INFO, ha, "scsi%ld: %s: ddb[%d]"
>  			   " start scan\n", ha->host_no, __func__,
>  			   ddb_entry->fw_ddb_index);
> -		scsi_queue_work(ha->host, &ddb_entry->sess->scan_work);
> +		queue_work(ddb_entry->sess->workq, &ddb_entry->sess->scan_work);
>  	}
>  	return QLA_SUCCESS;
>  }
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index 05cd4bca979e..ecb592a70e03 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -2032,19 +2032,27 @@ EXPORT_SYMBOL_GPL(iscsi_alloc_session);
>  
>  int iscsi_add_session(struct iscsi_cls_session *session, unsigned int target_id)
>  {
> +	struct Scsi_Host *shost = iscsi_session_to_shost(session);
>  	unsigned long flags;
>  	int id = 0;
>  	int err;
>  
>  	session->sid = atomic_add_return(1, &iscsi_session_nr);
>  
> +	session->workq = alloc_workqueue("iscsi_ctrl_%d:%d",
> +			WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
> +			shost->host_no, session->sid);
> +	if (!session->workq)
> +		return -ENOMEM;
> +
>  	if (target_id == ISCSI_MAX_TARGET) {
>  		id = ida_simple_get(&iscsi_sess_ida, 0, 0, GFP_KERNEL);
>  
>  		if (id < 0) {
>  			iscsi_cls_session_printk(KERN_ERR, session,
>  					"Failure in Target ID Allocation\n");
> -			return id;
> +			err = id;
> +			goto destroy_wq;
>  		}
>  		session->target_id = (unsigned int)id;
>  		session->ida_used = true;
> @@ -2078,7 +2086,8 @@ int iscsi_add_session(struct iscsi_cls_session *session, unsigned int target_id)
>  release_ida:
>  	if (session->ida_used)
>  		ida_simple_remove(&iscsi_sess_ida, session->target_id);
> -
> +destroy_wq:
> +	destroy_workqueue(session->workq);
>  	return err;
>  }
>  EXPORT_SYMBOL_GPL(iscsi_add_session);
> @@ -2177,6 +2186,8 @@ void iscsi_remove_session(struct iscsi_cls_session *session)
>  
>  	transport_unregister_device(&session->dev);
>  
> +	destroy_workqueue(session->workq);
> +
>  	ISCSI_DBG_TRANS_SESSION(session, "Completing session removal\n");
>  	device_del(&session->dev);
>  }
> @@ -3833,8 +3844,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
>  	case ISCSI_UEVENT_UNBIND_SESSION:
>  		session = iscsi_session_lookup(ev->u.d_session.sid);
>  		if (session)
> -			scsi_queue_work(iscsi_session_to_shost(session),
> -					&session->unbind_work);
> +			queue_work(session->workq, &session->unbind_work);
>  		else
>  			err = -EINVAL;
>  		break;
> @@ -4707,7 +4717,6 @@ iscsi_register_transport(struct iscsi_transport *tt)
>  	INIT_LIST_HEAD(&priv->list);
>  	priv->iscsi_transport = tt;
>  	priv->t.user_scan = iscsi_user_scan;
> -	priv->t.create_work_queue = 1;
>  
>  	priv->dev.class = &iscsi_transport_class;
>  	dev_set_name(&priv->dev, "%s", tt->name);
> diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
> index 90b55db46d7c..7a0d24d3b916 100644
> --- a/include/scsi/scsi_transport_iscsi.h
> +++ b/include/scsi/scsi_transport_iscsi.h
> @@ -251,6 +251,8 @@ struct iscsi_cls_session {
>  	bool recovery_tmo_sysfs_override;
>  	struct delayed_work recovery_work;
>  
> +	struct workqueue_struct *workq;
> +
>  	unsigned int target_id;
>  	bool ida_used;
>  

Reviewed-by: Chris Leech <cleech@redhat.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery.
  2022-02-26 23:04 ` [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery Mike Christie
@ 2022-02-28 20:08   ` Chris Leech
  2022-02-28 20:09   ` Lee Duncan
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Leech @ 2022-02-28 20:08 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, liuzhengyuang521

On 2/26/22 3:04 PM, Mike Christie wrote:
> Use the session workqueue for recovery and unbinding. If there are delays
> during device blocking/cleanup then it will no longer affect other
> sessions.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/scsi_transport_iscsi.c | 20 ++++----------------
>  1 file changed, 4 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index ecb592a70e03..754277bec63a 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -87,7 +87,6 @@ struct iscsi_internal {
>  };
>  
>  static atomic_t iscsi_session_nr; /* sysfs session id for next new session */
> -static struct workqueue_struct *iscsi_eh_timer_workq;
>  
>  static struct workqueue_struct *iscsi_conn_cleanup_workq;
>  
> @@ -1913,7 +1912,7 @@ void iscsi_unblock_session(struct iscsi_cls_session *session)
>  	if (!cancel_work_sync(&session->block_work))
>  		cancel_delayed_work_sync(&session->recovery_work);
>  
> -	queue_work(iscsi_eh_timer_workq, &session->unblock_work);
> +	queue_work(session->workq, &session->unblock_work);
>  	/*
>  	 * Blocking the session can be done from any context so we only
>  	 * queue the block work. Make sure the unblock work has completed
> @@ -1937,14 +1936,14 @@ static void __iscsi_block_session(struct work_struct *work)
>  	scsi_target_block(&session->dev);
>  	ISCSI_DBG_TRANS_SESSION(session, "Completed SCSI target blocking\n");
>  	if (session->recovery_tmo >= 0)
> -		queue_delayed_work(iscsi_eh_timer_workq,
> +		queue_delayed_work(session->workq,
>  				   &session->recovery_work,
>  				   session->recovery_tmo * HZ);
>  }
>  
>  void iscsi_block_session(struct iscsi_cls_session *session)
>  {
> -	queue_work(iscsi_eh_timer_workq, &session->block_work);
> +	queue_work(session->workq, &session->block_work);
>  }
>  EXPORT_SYMBOL_GPL(iscsi_block_session);
>  
> @@ -4851,26 +4850,16 @@ static __init int iscsi_transport_init(void)
>  		goto unregister_flashnode_bus;
>  	}
>  
> -	iscsi_eh_timer_workq = alloc_workqueue("%s",
> -			WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
> -			1, "iscsi_eh");
> -	if (!iscsi_eh_timer_workq) {
> -		err = -ENOMEM;
> -		goto release_nls;
> -	}
> -
>  	iscsi_conn_cleanup_workq = alloc_workqueue("%s",
>  			WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
>  			"iscsi_conn_cleanup");
>  	if (!iscsi_conn_cleanup_workq) {
>  		err = -ENOMEM;
> -		goto destroy_wq;
> +		goto release_nls;
>  	}
>  
>  	return 0;
>  
> -destroy_wq:
> -	destroy_workqueue(iscsi_eh_timer_workq);
>  release_nls:
>  	netlink_kernel_release(nls);
>  unregister_flashnode_bus:
> @@ -4893,7 +4882,6 @@ static __init int iscsi_transport_init(void)
>  static void __exit iscsi_transport_exit(void)
>  {
>  	destroy_workqueue(iscsi_conn_cleanup_workq);
> -	destroy_workqueue(iscsi_eh_timer_workq);
>  	netlink_kernel_release(nls);
>  	bus_unregister(&iscsi_flashnode_bus);
>  	transport_class_unregister(&iscsi_connection_class);


Reviewed-by: Chris Leech <cleech@redhat.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/6] scsi: iscsi: Drop temp workq_name.
  2022-02-26 23:04 ` [PATCH 6/6] scsi: iscsi: Drop temp workq_name Mike Christie
@ 2022-02-28 20:08   ` Chris Leech
  2022-02-28 22:49   ` Lee Duncan
  1 sibling, 0 replies; 23+ messages in thread
From: Chris Leech @ 2022-02-28 20:08 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, liuzhengyuang521

On 2/26/22 3:04 PM, Mike Christie wrote:
> When the workqueue code was created it didn't allow variable args so we
> have been using a temp buffer. Drop that.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 6 ++----
>  include/scsi/libiscsi.h | 1 -
>  2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 059dae8909ee..a75b85f0a189 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -2798,11 +2798,9 @@ struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
>  	ihost = shost_priv(shost);
>  
>  	if (xmit_can_sleep) {
> -		snprintf(ihost->workq_name, sizeof(ihost->workq_name),
> -			"iscsi_q_%d", shost->host_no);
> -		ihost->workq = alloc_workqueue("%s",
> +		ihost->workq = alloc_workqueue("iscsi_q_%d",
>  			WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
> -			1, ihost->workq_name);
> +			1, shost->host_no);
>  		if (!ihost->workq)
>  			goto free_host;
>  	}
> diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
> index 4ee233e5a6ff..2d85810d1929 100644
> --- a/include/scsi/libiscsi.h
> +++ b/include/scsi/libiscsi.h
> @@ -371,7 +371,6 @@ struct iscsi_host {
>  	int			state;
>  
>  	struct workqueue_struct	*workq;
> -	char			workq_name[20];
>  };
>  
>  /*

Reviewed-by: Chris Leech <cleech@redhat.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery.
  2022-02-26 23:04 ` [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery Mike Christie
  2022-02-28 20:08   ` Chris Leech
@ 2022-02-28 20:09   ` Lee Duncan
  1 sibling, 0 replies; 23+ messages in thread
From: Lee Duncan @ 2022-02-28 20:09 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/26/22 15:04, Mike Christie wrote:
> Use the session workqueue for recovery and unbinding. If there are delays
> during device blocking/cleanup then it will no longer affect other
> sessions.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/scsi_transport_iscsi.c | 20 ++++----------------
>   1 file changed, 4 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
> index ecb592a70e03..754277bec63a 100644
> --- a/drivers/scsi/scsi_transport_iscsi.c
> +++ b/drivers/scsi/scsi_transport_iscsi.c
> @@ -87,7 +87,6 @@ struct iscsi_internal {
>   };
>   
>   static atomic_t iscsi_session_nr; /* sysfs session id for next new session */
> -static struct workqueue_struct *iscsi_eh_timer_workq;
>   
>   static struct workqueue_struct *iscsi_conn_cleanup_workq;
>   
> @@ -1913,7 +1912,7 @@ void iscsi_unblock_session(struct iscsi_cls_session *session)
>   	if (!cancel_work_sync(&session->block_work))
>   		cancel_delayed_work_sync(&session->recovery_work);
>   
> -	queue_work(iscsi_eh_timer_workq, &session->unblock_work);
> +	queue_work(session->workq, &session->unblock_work);
>   	/*
>   	 * Blocking the session can be done from any context so we only
>   	 * queue the block work. Make sure the unblock work has completed
> @@ -1937,14 +1936,14 @@ static void __iscsi_block_session(struct work_struct *work)
>   	scsi_target_block(&session->dev);
>   	ISCSI_DBG_TRANS_SESSION(session, "Completed SCSI target blocking\n");
>   	if (session->recovery_tmo >= 0)
> -		queue_delayed_work(iscsi_eh_timer_workq,
> +		queue_delayed_work(session->workq,
>   				   &session->recovery_work,
>   				   session->recovery_tmo * HZ);
>   }
>   
>   void iscsi_block_session(struct iscsi_cls_session *session)
>   {
> -	queue_work(iscsi_eh_timer_workq, &session->block_work);
> +	queue_work(session->workq, &session->block_work);
>   }
>   EXPORT_SYMBOL_GPL(iscsi_block_session);
>   
> @@ -4851,26 +4850,16 @@ static __init int iscsi_transport_init(void)
>   		goto unregister_flashnode_bus;
>   	}
>   
> -	iscsi_eh_timer_workq = alloc_workqueue("%s",
> -			WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
> -			1, "iscsi_eh");
> -	if (!iscsi_eh_timer_workq) {
> -		err = -ENOMEM;
> -		goto release_nls;
> -	}
> -
>   	iscsi_conn_cleanup_workq = alloc_workqueue("%s",
>   			WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND, 0,
>   			"iscsi_conn_cleanup");
>   	if (!iscsi_conn_cleanup_workq) {
>   		err = -ENOMEM;
> -		goto destroy_wq;
> +		goto release_nls;
>   	}
>   
>   	return 0;
>   
> -destroy_wq:
> -	destroy_workqueue(iscsi_eh_timer_workq);
>   release_nls:
>   	netlink_kernel_release(nls);
>   unregister_flashnode_bus:
> @@ -4893,7 +4882,6 @@ static __init int iscsi_transport_init(void)
>   static void __exit iscsi_transport_exit(void)
>   {
>   	destroy_workqueue(iscsi_conn_cleanup_workq);
> -	destroy_workqueue(iscsi_eh_timer_workq);
>   	netlink_kernel_release(nls);
>   	bus_unregister(&iscsi_flashnode_bus);
>   	transport_class_unregister(&iscsi_connection_class);

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished.
  2022-02-28 18:05   ` Lee Duncan
@ 2022-02-28 20:39     ` Mike Christie
  0 siblings, 0 replies; 23+ messages in thread
From: Mike Christie @ 2022-02-28 20:39 UTC (permalink / raw)
  To: Lee Duncan, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/28/22 12:05 PM, Lee Duncan wrote:
>> -    if (shost->hostt->scan_finished) {
>> -        if (scsi_queue_work(shost, &session->scan_work))
>> -            atomic_inc(&ihost->nr_scans);
>> -    }
>>       ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking session\n");
>>   }

....

> 
> I have no issue with this, but it seems kind of unrelated to speeding up session unblocking 

It's related because in the next patch we stop creating the host's transport
wq which is used above in the chunk that was removed in __iscsi_unblock_session.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 6/6] scsi: iscsi: Drop temp workq_name.
  2022-02-26 23:04 ` [PATCH 6/6] scsi: iscsi: Drop temp workq_name Mike Christie
  2022-02-28 20:08   ` Chris Leech
@ 2022-02-28 22:49   ` Lee Duncan
  1 sibling, 0 replies; 23+ messages in thread
From: Lee Duncan @ 2022-02-28 22:49 UTC (permalink / raw)
  To: Mike Christie, martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, cleech, liuzhengyuang521

On 2/26/22 15:04, Mike Christie wrote:
> When the workqueue code was created it didn't allow variable args so we
> have been using a temp buffer. Drop that.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>   drivers/scsi/libiscsi.c | 6 ++----
>   include/scsi/libiscsi.h | 1 -
>   2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 059dae8909ee..a75b85f0a189 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -2798,11 +2798,9 @@ struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
>   	ihost = shost_priv(shost);
>   
>   	if (xmit_can_sleep) {
> -		snprintf(ihost->workq_name, sizeof(ihost->workq_name),
> -			"iscsi_q_%d", shost->host_no);
> -		ihost->workq = alloc_workqueue("%s",
> +		ihost->workq = alloc_workqueue("iscsi_q_%d",
>   			WQ_SYSFS | __WQ_LEGACY | WQ_MEM_RECLAIM | WQ_UNBOUND,
> -			1, ihost->workq_name);
> +			1, shost->host_no);
>   		if (!ihost->workq)
>   			goto free_host;
>   	}
> diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
> index 4ee233e5a6ff..2d85810d1929 100644
> --- a/include/scsi/libiscsi.h
> +++ b/include/scsi/libiscsi.h
> @@ -371,7 +371,6 @@ struct iscsi_host {
>   	int			state;
>   
>   	struct workqueue_struct	*workq;
> -	char			workq_name[20];
>   };
>   
>   /*

Reviewed-by: Lee Duncan <lduncan@suse.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/6] iscsi: Speed up failover with lots of devices.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (6 preceding siblings ...)
  2022-02-28 15:53 ` [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
@ 2022-03-02  4:20 ` Martin K. Petersen
  2022-03-09  4:14 ` Martin K. Petersen
  8 siblings, 0 replies; 23+ messages in thread
From: Martin K. Petersen @ 2022-03-02  4:20 UTC (permalink / raw)
  To: Mike Christie
  Cc: martin.petersen, linux-scsi, mrangankar, njavali,
	GR-QLogic-Storage-Upstream, lduncan, cleech, liuzhengyuang521


Mike,

> Zhengyuan Liu found an issue where failovers are taking a long time
> with lots of devices (/dev/sdXYZ nodes). The problem is that iscsid
> expects most nl operations to be fast (ignoring mem issues) and when
> the session block code was written blocking a queue/scsi_device was
> just setting some flag bits and state values more or less. Now a block
> call will actually handle IO that has been sent to the driver, so it
> can be expensive. When you add in more and more devices, then a
> session block call will take longer and longer.
>
> This patchset moves the recovery and unbind operations to a per
> session work queue instead of the mix or per session, host and module.

Applied to 5.18/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/6] iscsi: Speed up failover with lots of devices.
  2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
                   ` (7 preceding siblings ...)
  2022-03-02  4:20 ` Martin K. Petersen
@ 2022-03-09  4:14 ` Martin K. Petersen
  8 siblings, 0 replies; 23+ messages in thread
From: Martin K. Petersen @ 2022-03-09  4:14 UTC (permalink / raw)
  To: linux-scsi, njavali, liuzhengyuang521, mrangankar, Mike Christie,
	cleech, lduncan, GR-QLogic-Storage-Upstream
  Cc: Martin K . Petersen

On Sat, 26 Feb 2022 17:04:29 -0600, Mike Christie wrote:

> In:
> 
> https://lore.kernel.org/all/CAK3e-EZbJMDHkozGiz8LnMNAZ+SoCA+QeK0kpkqM4vQ4pz86SQ@mail.gmail.com/t/
> 
> Zhengyuan Liu found an issue where failovers are taking a long time
> with lots of devices (/dev/sdXYZ nodes). The problem is that iscsid
> expects most nl operations to be fast (ignoring mem issues) and when
> the session block code was written blocking a queue/scsi_device was
> just setting some flag bits and state values more or less. Now a block
> call will actually handle IO that has been sent to the driver, so it
> can be expensive. When you add in more and more devices, then a
> session block call will take longer and longer.
> 
> [...]

Applied to 5.18/scsi-queue, thanks!

[1/6] scsi: iscsi: Fix recovery and ublocking race.
      https://git.kernel.org/mkp/scsi/c/8dd3dff3bf3e
[2/6] scsi: iscsi: Speed up session unblocking and removal.
      https://git.kernel.org/mkp/scsi/c/b07c348f8ffb
[3/6] scsi: iscsi: Remove iscsi_scan_finished.
      https://git.kernel.org/mkp/scsi/c/d8ec5d67b8bb
[4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding.
      https://git.kernel.org/mkp/scsi/c/5842ea366831
[5/6] scsi: iscsi: Use the session workqueue for recovery.
      https://git.kernel.org/mkp/scsi/c/7cb6683ce761
[6/6] scsi: iscsi: Drop temp workq_name.
      https://git.kernel.org/mkp/scsi/c/69af1c9577aa

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-03-09  4:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-26 23:04 [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
2022-02-26 23:04 ` [PATCH 1/6] scsi: iscsi: Fix recovery and ublocking race Mike Christie
2022-02-27 19:49   ` Lee Duncan
2022-02-28 20:06   ` Chris Leech
2022-02-26 23:04 ` [PATCH 2/6] scsi: iscsi: Speed up session unblocking and removal Mike Christie
2022-02-28 16:05   ` Lee Duncan
2022-02-28 20:06   ` Chris Leech
2022-02-26 23:04 ` [PATCH 3/6] scsi: iscsi: Remove iscsi_scan_finished Mike Christie
2022-02-28 18:05   ` Lee Duncan
2022-02-28 20:39     ` Mike Christie
2022-02-28 20:07   ` Chris Leech
2022-02-26 23:04 ` [PATCH 4/6] scsi: iscsi, ql4: Use per session workqueue for unbinding Mike Christie
2022-02-28 18:19   ` Lee Duncan
2022-02-28 20:07   ` Chris Leech
2022-02-26 23:04 ` [PATCH 5/6] scsi: iscsi: Use the session workqueue for recovery Mike Christie
2022-02-28 20:08   ` Chris Leech
2022-02-28 20:09   ` Lee Duncan
2022-02-26 23:04 ` [PATCH 6/6] scsi: iscsi: Drop temp workq_name Mike Christie
2022-02-28 20:08   ` Chris Leech
2022-02-28 22:49   ` Lee Duncan
2022-02-28 15:53 ` [PATCH 0/6] iscsi: Speed up failover with lots of devices Mike Christie
2022-03-02  4:20 ` Martin K. Petersen
2022-03-09  4:14 ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.