* [PATCH 0/5] target: fixes and perf improvements
From: Mike Christie @ 2021-09-30  2:04 UTC
  To: martin.petersen, james.bottomley, linux-scsi, target-devel

The following patches apply to Martin's staging tree or Linus's tree.
The patches' main goal is to take the locks out of the main IO path,
but for the case of ordered cmds they also fix a handful of bugs.

For the locks we currently have:

1. lun_tg_pt_gp_lock
2. delayed_cmd_lock
3. dev_reservation_lock

and this set takes out 1 and 2. With them removed, a simple fio run such as:

fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=64  --numjobs=$NUM_QUEUES

can increase IOPS by up to 30% (from a max of 1.4M to 2M) when using
multiple queues and vhost-scsi with the multiple vhost thread patches, or
tcm loop with nr_hw_queues set.

Note: I normally hit a ceiling of 1.4M IOPS at around 8 queues, but with
the patches the ceiling moves to around 16 queues and 2M IOPS.

If I cheat and set emulate_pr=0, so the reservation lock is also removed,
then it scales nicely and you can continue to add a job and queue per CPU
(at least up to 20 CPUs, which is when I run out of CPUs).
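
To give a rough idea of the hot-path change, the completion side goes
from taking delayed_cmd_lock on every cmd completion to a lock-free
check that only kicks a work when ordered cmds are actually queued. A
simplified sketch based on patch 2 (the function names here are made up
for illustration):

/* Before: lock taken on every cmd completion. */
static void restart_delayed_cmds_old(struct se_device *dev)
{
	spin_lock(&dev->delayed_cmd_lock);
	/* walk dev->delayed_cmd_list and kick off delayed cmds */
	spin_unlock(&dev->delayed_cmd_lock);
}

/* After: lock-free check; the work takes the lock only when needed. */
static void restart_delayed_cmds_new(struct se_device *dev)
{
	if (atomic_read(&dev->delayed_cmd_count) > 0)
		schedule_work(&dev->delayed_cmd_work);
}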




* [PATCH 1/5] target: fix ordered CMD_T_SENT handling
From: Mike Christie @ 2021-09-30  2:04 UTC
  To: martin.petersen, james.bottomley, linux-scsi, target-devel; +Cc: Mike Christie

There is a race where target_handle_task_attr has put the cmd on the
delayed_cmd_list, target_restart_delayed_cmds has then removed it and
set CMD_T_SENT, but target_execute_cmd then clears that bit.

This patch moves the clearing of CMD_T_SENT to before the cmd is put
on the list.
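
The interleaving looks roughly like this (a timeline sketch, not
literal code):

/*
 * CPU0: target_execute_cmd()        CPU1: completion path
 * --------------------------        ---------------------
 * target_handle_task_attr()
 *   adds cmd to delayed_cmd_list
 *                                   target_restart_delayed_cmds()
 *                                     removes cmd, sets CMD_T_SENT,
 *                                     starts executing the cmd
 * clears CMD_T_SENT                 <-- flag lost on a running cmd
 */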

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_transport.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 14c6f2bb1b01..e02173a4b7bc 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2200,6 +2200,10 @@ static bool target_handle_task_attr(struct se_cmd *cmd)
 	if (atomic_read(&dev->dev_ordered_sync) == 0)
 		return false;
 
+	spin_lock_irq(&cmd->t_state_lock);
+	cmd->transport_state &= ~CMD_T_SENT;
+	spin_unlock_irq(&cmd->t_state_lock);
+
 	spin_lock(&dev->delayed_cmd_lock);
 	list_add_tail(&cmd->se_delayed_node, &dev->delayed_cmd_list);
 	spin_unlock(&dev->delayed_cmd_lock);
@@ -2228,12 +2232,8 @@ void target_execute_cmd(struct se_cmd *cmd)
 	if (target_write_prot_action(cmd))
 		return;
 
-	if (target_handle_task_attr(cmd)) {
-		spin_lock_irq(&cmd->t_state_lock);
-		cmd->transport_state &= ~CMD_T_SENT;
-		spin_unlock_irq(&cmd->t_state_lock);
+	if (target_handle_task_attr(cmd))
 		return;
-	}
 
 	__target_execute_cmd(cmd, true);
 }
-- 
2.25.1



* [PATCH 2/5] target: fix ordered tag handling
From: Mike Christie @ 2021-09-30  2:04 UTC
  To: martin.petersen, james.bottomley, linux-scsi, target-devel; +Cc: Mike Christie

This patch fixes the following bugs:

1. If there are multiple ordered cmds queued and multiple simple cmds
completing, target_restart_delayed_cmds could be called on different
CPUs and each instance could start an ordered cmd. They could then
run in a different order than they were queued.

2. target_restart_delayed_cmds and target_handle_task_attr can race
where:

        a. target_handle_task_attr has passed the simple_cmds == 0 check.
        b. transport_complete_task_attr then decrements simple_cmds to 0.
        c. transport_complete_task_attr runs target_restart_delayed_cmds,
        which does not see any cmds on the delayed_cmd_list.
        d. target_handle_task_attr adds the cmd to the delayed_cmd_list.

The cmd will then end up timing out.

3. If we are sent more than 1 ordered cmd and simple_cmds == 0, we can
execute them out of order, because target_handle_task_attr will hit that
simple_cmds check first and return false for every ordered cmd sent.

4. We run target_restart_delayed_cmds after every cmd completion, so
if there is more than 1 simple cmd running, we start executing ordered
cmds after the first one completes instead of waiting for all of them.

5. Ordered cmds are not supposed to start until HEAD OF QUEUE and all
older cmds have completed, not just the simple ones.

6. It's not a bug, but it doesn't make sense to take the delayed_cmd_lock
for every cmd completion when ordered cmds are almost never used. Just
replacing that lock with an atomic increases IOPS by up to 10% when
completions are spread over multiple CPUs and there are multiple
sessions/mqs/threads accessing the same device.

This patch moves the queued delayed handling to a per-device work to
serialize the cmd executions for each device and adds a new counter to
track HEAD_OF_QUEUE and SIMPLE cmds. We can then check the new counter on
the completion path to determine when to run the work, as sketched below.
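
In rough terms, the queueing decision now works like the sketch below
(the helper name is made up for illustration; the real logic is in
target_handle_task_attr in the diff):

static bool cmd_needs_delay(struct se_device *dev, struct se_cmd *cmd)
{
	switch (cmd->sam_task_attr) {
	case TCM_ORDERED_TAG:
		/* always serialized via the per-device work */
		return true;
	case TCM_HEAD_TAG:
		/* HEAD OF QUEUE cmds always execute now */
		atomic_inc_mb(&dev->non_ordered);
		return false;
	default:
		/* SIMPLE: wait only if ordered cmds are queued ahead */
		atomic_inc_mb(&dev->non_ordered);
		return atomic_read(&dev->delayed_cmd_count) != 0;
	}
}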

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_device.c    |  2 +
 drivers/target/target_core_internal.h  |  1 +
 drivers/target/target_core_transport.c | 76 ++++++++++++++++++--------
 include/target/target_core_base.h      |  6 +-
 4 files changed, 61 insertions(+), 24 deletions(-)

diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index 8cb1fa0c0585..44bb380e7390 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -772,6 +772,8 @@ struct se_device *target_alloc_device(struct se_hba *hba, const char *name)
 	INIT_LIST_HEAD(&dev->t10_alua.lba_map_list);
 	spin_lock_init(&dev->t10_alua.lba_map_lock);
 
+	INIT_WORK(&dev->delayed_cmd_work, target_do_delayed_work);
+
 	dev->t10_wwn.t10_dev = dev;
 	/*
 	 * Use OpenFabrics IEEE Company ID: 00 14 05
diff --git a/drivers/target/target_core_internal.h b/drivers/target/target_core_internal.h
index a343bcfa2180..a889a6237d9c 100644
--- a/drivers/target/target_core_internal.h
+++ b/drivers/target/target_core_internal.h
@@ -151,6 +151,7 @@ int	transport_dump_vpd_ident(struct t10_vpd *, unsigned char *, int);
 void	transport_clear_lun_ref(struct se_lun *);
 sense_reason_t	target_cmd_size_check(struct se_cmd *cmd, unsigned int size);
 void	target_qf_do_work(struct work_struct *work);
+void	target_do_delayed_work(struct work_struct *work);
 bool	target_check_wce(struct se_device *dev);
 bool	target_check_fua(struct se_device *dev);
 void	__target_execute_cmd(struct se_cmd *, bool);
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index e02173a4b7bc..913f31561531 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2173,32 +2173,35 @@ static bool target_handle_task_attr(struct se_cmd *cmd)
 	 */
 	switch (cmd->sam_task_attr) {
 	case TCM_HEAD_TAG:
+		atomic_inc_mb(&dev->non_ordered);
 		pr_debug("Added HEAD_OF_QUEUE for CDB: 0x%02x\n",
 			 cmd->t_task_cdb[0]);
 		return false;
 	case TCM_ORDERED_TAG:
-		atomic_inc_mb(&dev->dev_ordered_sync);
+		atomic_inc_mb(&dev->delayed_cmd_count);
 
 		pr_debug("Added ORDERED for CDB: 0x%02x to ordered list\n",
 			 cmd->t_task_cdb[0]);
-
-		/*
-		 * Execute an ORDERED command if no other older commands
-		 * exist that need to be completed first.
-		 */
-		if (!atomic_read(&dev->simple_cmds))
-			return false;
 		break;
 	default:
 		/*
 		 * For SIMPLE and UNTAGGED Task Attribute commands
 		 */
-		atomic_inc_mb(&dev->simple_cmds);
+		atomic_inc_mb(&dev->non_ordered);
+
+		if (atomic_read(&dev->delayed_cmd_count) == 0)
+			return false;
 		break;
 	}
 
-	if (atomic_read(&dev->dev_ordered_sync) == 0)
-		return false;
+	if (cmd->sam_task_attr != TCM_ORDERED_TAG) {
+		atomic_inc_mb(&dev->delayed_cmd_count);
+		/*
+		 * We will account for this when we dequeue from the delayed
+		 * list.
+		 */
+		atomic_dec_mb(&dev->non_ordered);
+	}
 
 	spin_lock_irq(&cmd->t_state_lock);
 	cmd->transport_state &= ~CMD_T_SENT;
@@ -2210,6 +2213,12 @@ static bool target_handle_task_attr(struct se_cmd *cmd)
 
 	pr_debug("Added CDB: 0x%02x Task Attr: 0x%02x to delayed CMD listn",
 		cmd->t_task_cdb[0], cmd->sam_task_attr);
+	/*
+	 * We may have no non ordered cmds when this function started or we
+	 * could have raced with the last simple/head cmd completing, so kick
+	 * the delayed handler here.
+	 */
+	schedule_work(&dev->delayed_cmd_work);
 	return true;
 }
 
@@ -2243,29 +2252,48 @@ EXPORT_SYMBOL(target_execute_cmd);
  * Process all commands up to the last received ORDERED task attribute which
  * requires another blocking boundary
  */
-static void target_restart_delayed_cmds(struct se_device *dev)
+void target_do_delayed_work(struct work_struct *work)
 {
-	for (;;) {
+	struct se_device *dev = container_of(work, struct se_device,
+					     delayed_cmd_work);
+
+	spin_lock(&dev->delayed_cmd_lock);
+	while (!dev->ordered_sync_in_progress) {
 		struct se_cmd *cmd;
 
-		spin_lock(&dev->delayed_cmd_lock);
-		if (list_empty(&dev->delayed_cmd_list)) {
-			spin_unlock(&dev->delayed_cmd_lock);
+		if (list_empty(&dev->delayed_cmd_list))
 			break;
-		}
 
 		cmd = list_entry(dev->delayed_cmd_list.next,
 				 struct se_cmd, se_delayed_node);
+
+		if (cmd->sam_task_attr == TCM_ORDERED_TAG) {
+			/*
+			 * Check if we started with:
+			 * [ordered] [simple] [ordered]
+			 * and we are now at the last ordered so we have to wait
+			 * for the simple cmd.
+			 */
+			if (atomic_read(&dev->non_ordered) > 0)
+				break;
+
+			dev->ordered_sync_in_progress = true;
+		}
+
 		list_del(&cmd->se_delayed_node);
+		atomic_dec_mb(&dev->delayed_cmd_count);
 		spin_unlock(&dev->delayed_cmd_lock);
 
+		if (cmd->sam_task_attr != TCM_ORDERED_TAG)
+			atomic_inc_mb(&dev->non_ordered);
+
 		cmd->transport_state |= CMD_T_SENT;
 
 		__target_execute_cmd(cmd, true);
 
-		if (cmd->sam_task_attr == TCM_ORDERED_TAG)
-			break;
+		spin_lock(&dev->delayed_cmd_lock);
 	}
+	spin_unlock(&dev->delayed_cmd_lock);
 }
 
 /*
@@ -2283,14 +2311,17 @@ static void transport_complete_task_attr(struct se_cmd *cmd)
 		goto restart;
 
 	if (cmd->sam_task_attr == TCM_SIMPLE_TAG) {
-		atomic_dec_mb(&dev->simple_cmds);
+		atomic_dec_mb(&dev->non_ordered);
 		dev->dev_cur_ordered_id++;
 	} else if (cmd->sam_task_attr == TCM_HEAD_TAG) {
+		atomic_dec_mb(&dev->non_ordered);
 		dev->dev_cur_ordered_id++;
 		pr_debug("Incremented dev_cur_ordered_id: %u for HEAD_OF_QUEUE\n",
 			 dev->dev_cur_ordered_id);
 	} else if (cmd->sam_task_attr == TCM_ORDERED_TAG) {
-		atomic_dec_mb(&dev->dev_ordered_sync);
+		spin_lock(&dev->delayed_cmd_lock);
+		dev->ordered_sync_in_progress = false;
+		spin_unlock(&dev->delayed_cmd_lock);
 
 		dev->dev_cur_ordered_id++;
 		pr_debug("Incremented dev_cur_ordered_id: %u for ORDERED\n",
@@ -2299,7 +2330,8 @@ static void transport_complete_task_attr(struct se_cmd *cmd)
 	cmd->se_cmd_flags &= ~SCF_TASK_ATTR_SET;
 
 restart:
-	target_restart_delayed_cmds(dev);
+	if (atomic_read(&dev->delayed_cmd_count) > 0)
+		schedule_work(&dev->delayed_cmd_work);
 }
 
 static void transport_complete_qf(struct se_cmd *cmd)
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index fb11c7693b25..2121a323fd6c 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -812,8 +812,9 @@ struct se_device {
 	atomic_long_t		read_bytes;
 	atomic_long_t		write_bytes;
 	/* Active commands on this virtual SE device */
-	atomic_t		simple_cmds;
-	atomic_t		dev_ordered_sync;
+	atomic_t		non_ordered;
+	bool			ordered_sync_in_progress;
+	atomic_t		delayed_cmd_count;
 	atomic_t		dev_qf_count;
 	u32			export_count;
 	spinlock_t		delayed_cmd_lock;
@@ -834,6 +835,7 @@ struct se_device {
 	struct list_head	dev_sep_list;
 	struct list_head	dev_tmr_list;
 	struct work_struct	qf_work_queue;
+	struct work_struct	delayed_cmd_work;
 	struct list_head	delayed_cmd_list;
 	struct list_head	qf_cmd_list;
 	/* Pointer to associated SE HBA */
-- 
2.25.1



* [PATCH 3/5] target: fix alua_tg_pt_gps_count tracking
From: Mike Christie @ 2021-09-30  2:04 UTC
  To: martin.petersen, james.bottomley, linux-scsi, target-devel; +Cc: Mike Christie

We can't free the tg_pt_gp in core_alua_set_tg_pt_gp_id because it's still
accessed via configfs. Its release must go through the normal
configfs/refcount process.

I think the max alua_tg_pt_gps_count check should have been done in
core_alua_allocate_tg_pt_gp, but with the current code userspace could
have created 0x0000ffff + 1 groups while only setting the id for 0x0000ffff
of them. It could then have deleted a group with an id set, set the id for
that extra group, and it would work ok.

It's unlikely, but just in case, this patch continues to allow that type of
behavior and only fixes the kfree-while-in-use bug.
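
For reference, the lifetime rule the fix follows, roughly (the release
hook lives in target_core_configfs.c; the name is shortened here):

/* configfs owns the group: mkdir allocates it, and only the
 * config_item release callback may free it. */
static void tg_pt_gp_release(struct config_item *item)
{
	struct t10_alua_tg_pt_gp *tg_pt_gp = container_of(to_config_group(item),
			struct t10_alua_tg_pt_gp, tg_pt_gp_group);

	core_alua_free_tg_pt_gp(tg_pt_gp);
}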

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_alua.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c
index cb1de1ecaaa6..bd0f2ce011dd 100644
--- a/drivers/target/target_core_alua.c
+++ b/drivers/target/target_core_alua.c
@@ -1674,7 +1674,6 @@ int core_alua_set_tg_pt_gp_id(
 		pr_err("Maximum ALUA alua_tg_pt_gps_count:"
 			" 0x0000ffff reached\n");
 		spin_unlock(&dev->t10_alua.tg_pt_gps_lock);
-		kmem_cache_free(t10_alua_tg_pt_gp_cache, tg_pt_gp);
 		return -ENOSPC;
 	}
 again:
-- 
2.25.1



* [PATCH 4/5] target: replace lun_tg_pt_gp_lock with rcu in IO path
From: Mike Christie @ 2021-09-30  2:04 UTC
  To: martin.petersen, james.bottomley, linux-scsi, target-devel; +Cc: Mike Christie

We are only holding the lun_tg_pt_gp_lock in target_alua_state_check to
make sure the tg_pt_gp is not freed from under us while we copy the state,
delay, and id values. We can instead use RCU here to access the tg_pt_gp.
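
The conversion is the standard RCU reader/updater pattern; a minimal
sketch of what the patch does (simplified from the diff below):

/* Reader (IO path): copy the values out under rcu_read_lock. */
rcu_read_lock();
tg_pt_gp = rcu_dereference(lun->lun_tg_pt_gp);
if (tg_pt_gp)
	out_alua_state = tg_pt_gp->tg_pt_gp_alua_access_state;
rcu_read_unlock();

/* Updater (configfs path): publish under the existing lock, then wait
 * for readers to drain before the old group can be freed. */
spin_lock(&lun->lun_tg_pt_gp_lock);
rcu_assign_pointer(lun->lun_tg_pt_gp, new_tg_pt_gp);
spin_unlock(&lun->lun_tg_pt_gp_lock);
synchronize_rcu();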

With this patch, IOPS can increase by up to 10% for jobs like:

fio  --filename=/dev/sdX  --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=64  --numjobs=N

when there are multiple sessions (you are running that fio command
against each /dev/sdX, or using multipath and there are over 8 paths), or
more than 8 queues for the tcm loop or multi-threaded vhost case and
numjobs > 8.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_alua.c | 61 +++++++++++++++++--------------
 include/target/target_core_base.h |  2 +-
 2 files changed, 35 insertions(+), 28 deletions(-)

diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c
index bd0f2ce011dd..74944b914b4e 100644
--- a/drivers/target/target_core_alua.c
+++ b/drivers/target/target_core_alua.c
@@ -247,11 +247,11 @@ target_emulate_report_target_port_groups(struct se_cmd *cmd)
 		 * this CDB was received upon to determine this value individually
 		 * for ALUA target port group.
 		 */
-		spin_lock(&cmd->se_lun->lun_tg_pt_gp_lock);
-		tg_pt_gp = cmd->se_lun->lun_tg_pt_gp;
+		rcu_read_lock();
+		tg_pt_gp = rcu_dereference(cmd->se_lun->lun_tg_pt_gp);
 		if (tg_pt_gp)
 			buf[5] = tg_pt_gp->tg_pt_gp_implicit_trans_secs;
-		spin_unlock(&cmd->se_lun->lun_tg_pt_gp_lock);
+		rcu_read_unlock();
 	}
 	transport_kunmap_data_sg(cmd);
 
@@ -292,24 +292,24 @@ target_emulate_set_target_port_groups(struct se_cmd *cmd)
 	 * Determine if explicit ALUA via SET_TARGET_PORT_GROUPS is allowed
 	 * for the local tg_pt_gp.
 	 */
-	spin_lock(&l_lun->lun_tg_pt_gp_lock);
-	l_tg_pt_gp = l_lun->lun_tg_pt_gp;
+	rcu_read_lock();
+	l_tg_pt_gp = rcu_dereference(l_lun->lun_tg_pt_gp);
 	if (!l_tg_pt_gp) {
-		spin_unlock(&l_lun->lun_tg_pt_gp_lock);
+		rcu_read_unlock();
 		pr_err("Unable to access l_lun->tg_pt_gp\n");
 		rc = TCM_UNSUPPORTED_SCSI_OPCODE;
 		goto out;
 	}
 
 	if (!(l_tg_pt_gp->tg_pt_gp_alua_access_type & TPGS_EXPLICIT_ALUA)) {
-		spin_unlock(&l_lun->lun_tg_pt_gp_lock);
+		rcu_read_unlock();
 		pr_debug("Unable to process SET_TARGET_PORT_GROUPS"
 				" while TPGS_EXPLICIT_ALUA is disabled\n");
 		rc = TCM_UNSUPPORTED_SCSI_OPCODE;
 		goto out;
 	}
 	valid_states = l_tg_pt_gp->tg_pt_gp_alua_supported_states;
-	spin_unlock(&l_lun->lun_tg_pt_gp_lock);
+	rcu_read_unlock();
 
 	ptr = &buf[4]; /* Skip over RESERVED area in header */
 
@@ -662,17 +662,17 @@ target_alua_state_check(struct se_cmd *cmd)
 				" target port\n");
 		return TCM_ALUA_OFFLINE;
 	}
-
-	if (!lun->lun_tg_pt_gp)
+	rcu_read_lock();
+	tg_pt_gp = rcu_dereference(lun->lun_tg_pt_gp);
+	if (!tg_pt_gp) {
+		rcu_read_unlock();
 		return 0;
+	}
 
-	spin_lock(&lun->lun_tg_pt_gp_lock);
-	tg_pt_gp = lun->lun_tg_pt_gp;
 	out_alua_state = tg_pt_gp->tg_pt_gp_alua_access_state;
 	nonop_delay_msecs = tg_pt_gp->tg_pt_gp_nonop_delay_msecs;
 	tg_pt_gp_id = tg_pt_gp->tg_pt_gp_id;
-
-	spin_unlock(&lun->lun_tg_pt_gp_lock);
+	rcu_read_unlock();
 	/*
 	 * Process ALUA_ACCESS_STATE_ACTIVE_OPTIMIZED in a separate conditional
 	 * statement so the compiler knows explicitly to check this case first.
@@ -1219,10 +1219,10 @@ static int core_alua_set_tg_pt_secondary_state(
 	struct t10_alua_tg_pt_gp *tg_pt_gp;
 	int trans_delay_msecs;
 
-	spin_lock(&lun->lun_tg_pt_gp_lock);
-	tg_pt_gp = lun->lun_tg_pt_gp;
+	rcu_read_lock();
+	tg_pt_gp = rcu_dereference(lun->lun_tg_pt_gp);
 	if (!tg_pt_gp) {
-		spin_unlock(&lun->lun_tg_pt_gp_lock);
+		rcu_read_unlock();
 		pr_err("Unable to complete secondary state"
 				" transition\n");
 		return -EINVAL;
@@ -1246,7 +1246,7 @@ static int core_alua_set_tg_pt_secondary_state(
 		"implicit", config_item_name(&tg_pt_gp->tg_pt_gp_group.cg_item),
 		tg_pt_gp->tg_pt_gp_id, (offline) ? "OFFLINE" : "ONLINE");
 
-	spin_unlock(&lun->lun_tg_pt_gp_lock);
+	rcu_read_unlock();
 	/*
 	 * Do the optional transition delay after we set the secondary
 	 * ALUA access state.
@@ -1754,13 +1754,14 @@ void core_alua_free_tg_pt_gp(
 			__target_attach_tg_pt_gp(lun,
 					dev->t10_alua.default_tg_pt_gp);
 		} else
-			lun->lun_tg_pt_gp = NULL;
+			rcu_assign_pointer(lun->lun_tg_pt_gp, NULL);
 		spin_unlock(&lun->lun_tg_pt_gp_lock);
 
 		spin_lock(&tg_pt_gp->tg_pt_gp_lock);
 	}
 	spin_unlock(&tg_pt_gp->tg_pt_gp_lock);
 
+	synchronize_rcu();
 	kmem_cache_free(t10_alua_tg_pt_gp_cache, tg_pt_gp);
 }
 
@@ -1805,7 +1806,7 @@ static void __target_attach_tg_pt_gp(struct se_lun *lun,
 	assert_spin_locked(&lun->lun_tg_pt_gp_lock);
 
 	spin_lock(&tg_pt_gp->tg_pt_gp_lock);
-	lun->lun_tg_pt_gp = tg_pt_gp;
+	rcu_assign_pointer(lun->lun_tg_pt_gp, tg_pt_gp);
 	list_add_tail(&lun->lun_tg_pt_gp_link, &tg_pt_gp->tg_pt_gp_lun_list);
 	tg_pt_gp->tg_pt_gp_members++;
 	spin_lock(&lun->lun_deve_lock);
@@ -1822,6 +1823,7 @@ void target_attach_tg_pt_gp(struct se_lun *lun,
 	spin_lock(&lun->lun_tg_pt_gp_lock);
 	__target_attach_tg_pt_gp(lun, tg_pt_gp);
 	spin_unlock(&lun->lun_tg_pt_gp_lock);
+	synchronize_rcu();
 }
 
 static void __target_detach_tg_pt_gp(struct se_lun *lun,
@@ -1834,7 +1836,7 @@ static void __target_detach_tg_pt_gp(struct se_lun *lun,
 	tg_pt_gp->tg_pt_gp_members--;
 	spin_unlock(&tg_pt_gp->tg_pt_gp_lock);
 
-	lun->lun_tg_pt_gp = NULL;
+	rcu_assign_pointer(lun->lun_tg_pt_gp, NULL);
 }
 
 void target_detach_tg_pt_gp(struct se_lun *lun)
@@ -1842,10 +1844,12 @@ void target_detach_tg_pt_gp(struct se_lun *lun)
 	struct t10_alua_tg_pt_gp *tg_pt_gp;
 
 	spin_lock(&lun->lun_tg_pt_gp_lock);
-	tg_pt_gp = lun->lun_tg_pt_gp;
+	tg_pt_gp = rcu_dereference_check(lun->lun_tg_pt_gp,
+				lockdep_is_held(&lun->lun_tg_pt_gp_lock));
 	if (tg_pt_gp)
 		__target_detach_tg_pt_gp(lun, tg_pt_gp);
 	spin_unlock(&lun->lun_tg_pt_gp_lock);
+	synchronize_rcu();
 }
 
 ssize_t core_alua_show_tg_pt_gp_info(struct se_lun *lun, char *page)
@@ -1854,8 +1858,8 @@ ssize_t core_alua_show_tg_pt_gp_info(struct se_lun *lun, char *page)
 	struct t10_alua_tg_pt_gp *tg_pt_gp;
 	ssize_t len = 0;
 
-	spin_lock(&lun->lun_tg_pt_gp_lock);
-	tg_pt_gp = lun->lun_tg_pt_gp;
+	rcu_read_lock();
+	tg_pt_gp = rcu_dereference(lun->lun_tg_pt_gp);
 	if (tg_pt_gp) {
 		tg_pt_ci = &tg_pt_gp->tg_pt_gp_group.cg_item;
 		len += sprintf(page, "TG Port Alias: %s\nTG Port Group ID:"
@@ -1871,7 +1875,7 @@ ssize_t core_alua_show_tg_pt_gp_info(struct se_lun *lun, char *page)
 			"Offline" : "None",
 			core_alua_dump_status(lun->lun_tg_pt_secondary_stat));
 	}
-	spin_unlock(&lun->lun_tg_pt_gp_lock);
+	rcu_read_unlock();
 
 	return len;
 }
@@ -1918,7 +1922,8 @@ ssize_t core_alua_store_tg_pt_gp_info(
 	}
 
 	spin_lock(&lun->lun_tg_pt_gp_lock);
-	tg_pt_gp = lun->lun_tg_pt_gp;
+	tg_pt_gp = rcu_dereference_check(lun->lun_tg_pt_gp,
+				lockdep_is_held(&lun->lun_tg_pt_gp_lock));
 	if (tg_pt_gp) {
 		/*
 		 * Clearing an existing tg_pt_gp association, and replacing
@@ -1941,7 +1946,7 @@ ssize_t core_alua_store_tg_pt_gp_info(
 					dev->t10_alua.default_tg_pt_gp);
 			spin_unlock(&lun->lun_tg_pt_gp_lock);
 
-			return count;
+			goto sync_rcu;
 		}
 		__target_detach_tg_pt_gp(lun, tg_pt_gp);
 		move = 1;
@@ -1958,6 +1963,8 @@ ssize_t core_alua_store_tg_pt_gp_info(
 		tg_pt_gp_new->tg_pt_gp_id);
 
 	core_alua_put_tg_pt_gp_from_name(tg_pt_gp_new);
+sync_rcu:
+	synchronize_rcu();
 	return count;
 }
 
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 2121a323fd6c..d7d31a508dec 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -749,7 +749,7 @@ struct se_lun {
 
 	/* ALUA target port group linkage */
 	struct list_head	lun_tg_pt_gp_link;
-	struct t10_alua_tg_pt_gp *lun_tg_pt_gp;
+	struct t10_alua_tg_pt_gp __rcu *lun_tg_pt_gp;
 	spinlock_t		lun_tg_pt_gp_lock;
 
 	struct se_portal_group	*lun_tpg;
-- 
2.25.1



* [PATCH 5/5] target: perform alua group changes in one step
From: Mike Christie @ 2021-09-30  2:04 UTC
  To: martin.petersen, james.bottomley, linux-scsi, target-devel; +Cc: Mike Christie

When userspace changes a lun's alua group, it will set the lun's group
to NULL and then to the new group. Before the new group is set,
target_alua_state_check will return 0 and allow IO to execute. This
patch skips the NULL stage and just swaps in the new group.
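
A sketch of the window being closed (simplified from the configfs
store path; target_swap_tg_pt_gp is added by this patch):

/* Before: two-step update left the lun with no group. */
rcu_assign_pointer(lun->lun_tg_pt_gp, NULL);
/* ... IO issued here passes target_alua_state_check ... */
rcu_assign_pointer(lun->lun_tg_pt_gp, new_tg_pt_gp);

/* After: detach and attach happen as one swap under the lock. */
spin_lock(&lun->lun_tg_pt_gp_lock);
target_swap_tg_pt_gp(lun, old_tg_pt_gp, new_tg_pt_gp);
spin_unlock(&lun->lun_tg_pt_gp_lock);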

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_alua.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c
index 74944b914b4e..b56ef8af66e7 100644
--- a/drivers/target/target_core_alua.c
+++ b/drivers/target/target_core_alua.c
@@ -1835,8 +1835,6 @@ static void __target_detach_tg_pt_gp(struct se_lun *lun,
 	list_del_init(&lun->lun_tg_pt_gp_link);
 	tg_pt_gp->tg_pt_gp_members--;
 	spin_unlock(&tg_pt_gp->tg_pt_gp_lock);
-
-	rcu_assign_pointer(lun->lun_tg_pt_gp, NULL);
 }
 
 void target_detach_tg_pt_gp(struct se_lun *lun)
@@ -1846,12 +1844,25 @@ void target_detach_tg_pt_gp(struct se_lun *lun)
 	spin_lock(&lun->lun_tg_pt_gp_lock);
 	tg_pt_gp = rcu_dereference_check(lun->lun_tg_pt_gp,
 				lockdep_is_held(&lun->lun_tg_pt_gp_lock));
-	if (tg_pt_gp)
+	if (tg_pt_gp) {
 		__target_detach_tg_pt_gp(lun, tg_pt_gp);
+		rcu_assign_pointer(lun->lun_tg_pt_gp, NULL);
+	}
 	spin_unlock(&lun->lun_tg_pt_gp_lock);
 	synchronize_rcu();
 }
 
+static void target_swap_tg_pt_gp(struct se_lun *lun,
+				 struct t10_alua_tg_pt_gp *old_tg_pt_gp,
+				 struct t10_alua_tg_pt_gp *new_tg_pt_gp)
+{
+	assert_spin_locked(&lun->lun_tg_pt_gp_lock);
+
+	if (old_tg_pt_gp)
+		__target_detach_tg_pt_gp(lun, old_tg_pt_gp);
+	__target_attach_tg_pt_gp(lun, new_tg_pt_gp);
+}
+
 ssize_t core_alua_show_tg_pt_gp_info(struct se_lun *lun, char *page)
 {
 	struct config_item *tg_pt_ci;
@@ -1941,18 +1952,16 @@ ssize_t core_alua_store_tg_pt_gp_info(
 					&tg_pt_gp->tg_pt_gp_group.cg_item),
 				tg_pt_gp->tg_pt_gp_id);
 
-			__target_detach_tg_pt_gp(lun, tg_pt_gp);
-			__target_attach_tg_pt_gp(lun,
+			target_swap_tg_pt_gp(lun, tg_pt_gp,
 					dev->t10_alua.default_tg_pt_gp);
 			spin_unlock(&lun->lun_tg_pt_gp_lock);
 
 			goto sync_rcu;
 		}
-		__target_detach_tg_pt_gp(lun, tg_pt_gp);
 		move = 1;
 	}
 
-	__target_attach_tg_pt_gp(lun, tg_pt_gp_new);
+	target_swap_tg_pt_gp(lun, tg_pt_gp, tg_pt_gp_new);
 	spin_unlock(&lun->lun_tg_pt_gp_lock);
 	pr_debug("Target_Core_ConfigFS: %s %s/tpgt_%hu/%s to ALUA"
 		" Target Port Group: alua/%s, ID: %hu\n", (move) ?
-- 
2.25.1



* Re: [PATCH 1/5] target: fix ordered CMD_T_SENT handling
From: Lee Duncan @ 2021-10-05 18:02 UTC
  To: Mike Christie, martin.petersen, james.bottomley, linux-scsi,
	target-devel

On 9/29/21 7:04 PM, Mike Christie wrote:
> There is a race where target_handle_task_attr has put the cmd on the
> delayed_cmd_list, target_restart_delayed_cmds has then removed it and
> set CMD_T_SENT, but target_execute_cmd then clears that bit.
> 
> This patch moves the clearing of CMD_T_SENT to before the cmd is put
> on the list.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/target/target_core_transport.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 14c6f2bb1b01..e02173a4b7bc 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -2200,6 +2200,10 @@ static bool target_handle_task_attr(struct se_cmd *cmd)
>  	if (atomic_read(&dev->dev_ordered_sync) == 0)
>  		return false;
>  
> +	spin_lock_irq(&cmd->t_state_lock);
> +	cmd->transport_state &= ~CMD_T_SENT;
> +	spin_unlock_irq(&cmd->t_state_lock);
> +
>  	spin_lock(&dev->delayed_cmd_lock);
>  	list_add_tail(&cmd->se_delayed_node, &dev->delayed_cmd_list);
>  	spin_unlock(&dev->delayed_cmd_lock);
> @@ -2228,12 +2232,8 @@ void target_execute_cmd(struct se_cmd *cmd)
>  	if (target_write_prot_action(cmd))
>  		return;
>  
> -	if (target_handle_task_attr(cmd)) {
> -		spin_lock_irq(&cmd->t_state_lock);
> -		cmd->transport_state &= ~CMD_T_SENT;
> -		spin_unlock_irq(&cmd->t_state_lock);
> +	if (target_handle_task_attr(cmd))
>  		return;
> -	}
>  
>  	__target_execute_cmd(cmd, true);
>  }
> 

Reviewed-by: Lee Duncan <lduncan@suse.com>



* Re: [PATCH 0/5] target: fixes and perf improvements
From: Martin K. Petersen @ 2021-10-17  3:08 UTC
  To: Mike Christie; +Cc: martin.petersen, james.bottomley, linux-scsi, target-devel


Mike,

> The following patches apply to Martin's staging tree or Linus's tree.
> The patches' main goal is to take the locks out of the main IO path,
> but for the case of ordered cmds they also fix a handful of bugs.

Applied to 5.16/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH 0/5] target: fixes and perf improvements
From: Martin K. Petersen @ 2021-10-21  3:42 UTC
  To: Mike Christie, james.bottomley, target-devel, linux-scsi
  Cc: Martin K. Petersen

On Wed, 29 Sep 2021 21:04:17 -0500, Mike Christie wrote:

> The following patches apply to Martin's staging tree or Linus's tree.
> The patches' main goal is to take the locks out of the main IO path,
> but for the case of ordered cmds they also fix a handful of bugs.
> 
> For the locks we currently have:
> 
> 1. lun_tg_pt_gp_lock
> 2. delayed_cmd_lock
> 3. dev_reservation_lock
> 
> [...]

Applied to 5.16/scsi-queue, thanks!

[1/5] target: fix ordered CMD_T_SENT handling
      https://git.kernel.org/mkp/scsi/c/945a160794a9
[2/5] target: fix ordered tag handling
      https://git.kernel.org/mkp/scsi/c/ed1227e08099
[3/5] target: fix alua_tg_pt_gps_count tracking
      https://git.kernel.org/mkp/scsi/c/1283c0d1a32b
[4/5] target: replace lun_tg_pt_gp_lock with rcu in IO path
      https://git.kernel.org/mkp/scsi/c/7324f47d4293
[5/5] target: perform alua group changes in one step
      https://git.kernel.org/mkp/scsi/c/f9793d649c29

-- 
Martin K. Petersen	Oracle Linux Engineering

