All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10
@ 2017-02-07 13:17 Nicholas A. Bellinger
  2017-02-07 13:17 ` [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion Nicholas A. Bellinger
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-07 13:17 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, lkml, Nicholas Bellinger

From: Nicholas Bellinger <nab@linux-iscsi.org>

Hi all,

This series contains a handful of bug-fixes that I've been validating
on v4.1.y code for production usage over the past couple of months.

All of these are long-standing issues that I don't think other
folks have been able to hit (or at least not reported), but have
been reproduced by Datera's QA team and/or have been reported by
customers.

Most of the patches are straight-forward fixes have been running
in Datera's nightly automation for weeks to months.

The one exception in patch #1 is a >= v4.2 RCU regression bug-fix
reported by Benjamin Estrabaud a while back, that re-instates
pre RCU conversion logic that kills a bogus BUG_ON() during
dynamic -> explicit se_node_acl conversion.

Please review.

--nab

Nicholas Bellinger (5):
  target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
  target: Use correct SCSI status during EXTENDED_COPY exception
  target: Fix early transport_generic_handle_tmr abort scenario
  target: Fix multi-session dynamic se_node_acl double free OOPs
  target: Fix COMPARE_AND_WRITE ref leak for non GOOD status

 drivers/target/target_core_device.c    | 16 +++++--
 drivers/target/target_core_sbc.c       |  8 +++-
 drivers/target/target_core_tpg.c       |  4 +-
 drivers/target/target_core_transport.c | 86 +++++++++++++++++++++++-----------
 drivers/target/target_core_xcopy.c     |  2 +-
 include/target/target_core_base.h      |  1 +
 6 files changed, 80 insertions(+), 37 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
  2017-02-07 13:17 [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10 Nicholas A. Bellinger
@ 2017-02-07 13:17 ` Nicholas A. Bellinger
  2017-02-07 22:44   ` Christoph Hellwig
  2017-02-07 13:17 ` [PATCH 2/5] target: Use correct SCSI status during EXTENDED_COPY exception Nicholas A. Bellinger
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-07 13:17 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, lkml, Nicholas Bellinger, Benjamin ESTRABAUD

From: Nicholas Bellinger <nab@linux-iscsi.org>

After the v4.2+ RCU conversion to se_node_acl->lun_entry_hlist,
a BUG_ON() was added in core_enable_device_list_for_node() to
detect when the passed *lun does not match the existing
orig->se_lun pointer reference.

However, this scenario can occur happen when a dynamically
generated NodeACL is being converted to an explicit NodeACL,
when the explicit NodeACL contains a different LUN mapping
than the default provided by the WWN endpoint.

So instead of triggering BUG_ON(), go ahead and fail instead
following the original pre RCU conversion logic.

Reported-by: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Cc: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_device.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index 1ebd13e..23e89af 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -345,14 +345,22 @@ int core_enable_device_list_for_node(
 					lockdep_is_held(&nacl->lun_entry_mutex));
 
 		if (orig_lun != lun) {
-			pr_err("Existing orig->se_lun doesn't match new lun"
-			       " for dynamic -> explicit NodeACL conversion:"
-				" %s\n", nacl->initiatorname);
+			pr_warn_ratelimited("Existing orig->se_lun doesn't match"
+				" new lun for dynamic -> explicit NodeACL"
+				" conversion: %s\n", nacl->initiatorname);
+			mutex_unlock(&nacl->lun_entry_mutex);
+			kfree(new);
+			return -EINVAL;
+		}
+		if (orig->se_lun_acl != NULL) {
+			pr_warn_ratelimited("Detected existing explicit"
+				" se_lun_acl->se_lun_group reference for %s"
+				" mapped_lun: %llu, ignoring\n",
+				 nacl->initiatorname, mapped_lun);
 			mutex_unlock(&nacl->lun_entry_mutex);
 			kfree(new);
 			return -EINVAL;
 		}
-		BUG_ON(orig->se_lun_acl != NULL);
 
 		rcu_assign_pointer(new->se_lun, lun);
 		rcu_assign_pointer(new->se_lun_acl, lun_acl);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] target: Use correct SCSI status during EXTENDED_COPY exception
  2017-02-07 13:17 [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10 Nicholas A. Bellinger
  2017-02-07 13:17 ` [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion Nicholas A. Bellinger
@ 2017-02-07 13:17 ` Nicholas A. Bellinger
  2017-02-07 22:39   ` Christoph Hellwig
  2017-02-07 13:17 ` [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario Nicholas A. Bellinger
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-07 13:17 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, lkml, Nicholas Bellinger, Nixon Vincent

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch adds the missing target_complete_cmd() SCSI status
parameter change in target_xcopy_do_work(), that was originally
missing in commit 926317de33.

It correctly propigates up the correct SCSI status during
EXTENDED_COPY exception cases, instead of always using the
hardcoded SAM_STAT_CHECK_CONDITION from original code.

This is required for ESX host environments that expect to
hit SAM_STAT_RESERVATION_CONFLICT for certain scenarios,
and SAM_STAT_CHECK_CONDITION results in non-retriable
status for these cases.

Reported-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Tested-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Cc: Nixon Vincent <nixon.vincent@calsoftinc.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_xcopy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/target_core_xcopy.c b/drivers/target/target_core_xcopy.c
index d828b3b..cac5a20 100644
--- a/drivers/target/target_core_xcopy.c
+++ b/drivers/target/target_core_xcopy.c
@@ -864,7 +864,7 @@ static void target_xcopy_do_work(struct work_struct *work)
 			" CHECK_CONDITION -> sending response\n", rc);
 		ec_cmd->scsi_status = SAM_STAT_CHECK_CONDITION;
 	}
-	target_complete_cmd(ec_cmd, SAM_STAT_CHECK_CONDITION);
+	target_complete_cmd(ec_cmd, ec_cmd->scsi_status);
 }
 
 sense_reason_t target_do_xcopy(struct se_cmd *se_cmd)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario
  2017-02-07 13:17 [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10 Nicholas A. Bellinger
  2017-02-07 13:17 ` [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion Nicholas A. Bellinger
  2017-02-07 13:17 ` [PATCH 2/5] target: Use correct SCSI status during EXTENDED_COPY exception Nicholas A. Bellinger
@ 2017-02-07 13:17 ` Nicholas A. Bellinger
  2017-02-07 22:45   ` Christoph Hellwig
  2017-02-07 13:17 ` [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs Nicholas A. Bellinger
  2017-02-07 13:17 ` [PATCH 5/5] target: Fix COMPARE_AND_WRITE ref leak for non GOOD status Nicholas A. Bellinger
  4 siblings, 1 reply; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-07 13:17 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, lkml, Nicholas Bellinger, Rob Millner

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch fixes a bug where incoming task management requests
can be explicitly aborted during an active LUN_RESET, but who's
struct work_struct are canceled in-flight before execution.

This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
for target_tmr_work() never getting invoked and the aborted TMR
waiting indefinately within transport_wait_for_tasks().

To address this case, perform a CMD_T_ABORTED check early in
transport_generic_handle_tmr(), and invoke the normal path via
transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
blocked waiting for CMD_T_STOP in transport_wait_for_tasks().

Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
into transport_generic_handle_tmr() so the existing check in
core_tmr_drain_tmr_list() avoids attempting abort the incoming
se_tmr_req->task_cmd->work if it has already been queued into
se_device->tmr_wq.

Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_transport.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 1cadc9e..8b69843 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -3110,7 +3110,6 @@ static void target_tmr_work(struct work_struct *work)
 		spin_unlock_irqrestore(&cmd->t_state_lock, flags);
 		goto check_stop;
 	}
-	cmd->t_state = TRANSPORT_ISTATE_PROCESSING;
 	spin_unlock_irqrestore(&cmd->t_state_lock, flags);
 
 	cmd->se_tfo->queue_tm_rsp(cmd);
@@ -3123,11 +3122,25 @@ int transport_generic_handle_tmr(
 	struct se_cmd *cmd)
 {
 	unsigned long flags;
+	bool aborted = false;
 
 	spin_lock_irqsave(&cmd->t_state_lock, flags);
-	cmd->transport_state |= CMD_T_ACTIVE;
+	if (cmd->transport_state & CMD_T_ABORTED) {
+		aborted = true;
+	} else {
+		cmd->t_state = TRANSPORT_ISTATE_PROCESSING;
+		cmd->transport_state |= CMD_T_ACTIVE;
+	}
 	spin_unlock_irqrestore(&cmd->t_state_lock, flags);
 
+	if (aborted) {
+		pr_warn_ratelimited("handle_tmr caught CMD_T_ABORTED TMR %d"
+			"ref_tag: %llu tag: %llu\n", cmd->se_tmr_req->function,
+			cmd->se_tmr_req->ref_task_tag, cmd->tag);
+		transport_cmd_check_stop_to_fabric(cmd);
+		return 0;
+	}
+
 	INIT_WORK(&cmd->work, target_tmr_work);
 	queue_work(cmd->se_dev->tmr_wq, &cmd->work);
 	return 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs
  2017-02-07 13:17 [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10 Nicholas A. Bellinger
                   ` (2 preceding siblings ...)
  2017-02-07 13:17 ` [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario Nicholas A. Bellinger
@ 2017-02-07 13:17 ` Nicholas A. Bellinger
  2017-02-07 23:07   ` Christoph Hellwig
  2017-02-07 13:17 ` [PATCH 5/5] target: Fix COMPARE_AND_WRITE ref leak for non GOOD status Nicholas A. Bellinger
  4 siblings, 1 reply; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-07 13:17 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, lkml, Nicholas Bellinger, Rob Millner

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch addresses a long-standing bug with multi-session
(eg: iscsi-target + iser-target) se_node_acl dynamic free
withini transport_deregister_session().

This bug is caused when a storage endpoint is configured with
demo-mode (generate_node_acls = 1 + cache_dynamic_acls = 1)
initiators, and initiator login creates a new dynamic node acl
and attaches two sessions to it.

After that, demo-mode for the storage instance is disabled via
configfs (generate_node_acls = 0 + cache_dynamic_acls = 0) and
the existing dynamic acl is never converted to an explicit ACL.

The end result is dynamic acl resources are released twice when
the sessions are shutdown in transport_deregister_session().

If the storage instance is not changed to disable demo-mode,
or the dynamic acl is converted to an explict ACL, or there
is only a single session associated with the dynamic ACL,
the bug is not triggered.

To address this big, move the release of dynamic se_node_acl
memory into target_complete_nacl() so it's only freed once
when se_node_acl->acl_kref reaches zero.

Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_tpg.c       |  4 +-
 drivers/target/target_core_transport.c | 69 +++++++++++++++++++++-------------
 include/target/target_core_base.h      |  1 +
 3 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index d99752c..4f32c49 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -364,7 +364,7 @@ void core_tpg_del_initiator_node_acl(struct se_node_acl *acl)
 	mutex_lock(&tpg->acl_node_mutex);
 	if (acl->dynamic_node_acl)
 		acl->dynamic_node_acl = 0;
-	list_del(&acl->acl_list);
+	list_del_init(&acl->acl_list);
 	mutex_unlock(&tpg->acl_node_mutex);
 
 	target_shutdown_sessions(acl);
@@ -540,7 +540,7 @@ int core_tpg_deregister(struct se_portal_group *se_tpg)
 	 * in transport_deregister_session().
 	 */
 	list_for_each_entry_safe(nacl, nacl_tmp, &node_list, acl_list) {
-		list_del(&nacl->acl_list);
+		list_del_init(&nacl->acl_list);
 
 		core_tpg_wait_for_nacl_pr_ref(nacl);
 		core_free_device_list_for_node(nacl, se_tpg);
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 8b69843..bd62dd4 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -457,8 +457,20 @@ static void target_complete_nacl(struct kref *kref)
 {
 	struct se_node_acl *nacl = container_of(kref,
 				struct se_node_acl, acl_kref);
+	struct se_portal_group *se_tpg = nacl->se_tpg;
 
-	complete(&nacl->acl_free_comp);
+	if (!nacl->dynamic_stop) {
+		complete(&nacl->acl_free_comp);
+		return;
+	}
+
+	mutex_lock(&se_tpg->acl_node_mutex);
+	list_del_init(&nacl->acl_list);
+	mutex_unlock(&se_tpg->acl_node_mutex);
+
+	core_tpg_wait_for_nacl_pr_ref(nacl);
+	core_free_device_list_for_node(nacl, se_tpg);
+	kfree(nacl);
 }
 
 void target_put_nacl(struct se_node_acl *nacl)
@@ -499,12 +511,39 @@ void transport_deregister_session_configfs(struct se_session *se_sess)
 void transport_free_session(struct se_session *se_sess)
 {
 	struct se_node_acl *se_nacl = se_sess->se_node_acl;
+
 	/*
 	 * Drop the se_node_acl->nacl_kref obtained from within
 	 * core_tpg_get_initiator_node_acl().
 	 */
 	if (se_nacl) {
+		struct se_portal_group *se_tpg = se_nacl->se_tpg;
+		const struct target_core_fabric_ops *se_tfo = se_tpg->se_tpg_tfo;
+		unsigned long flags;
+
 		se_sess->se_node_acl = NULL;
+
+		/*
+		 * Also determine if we need to drop the extra ->cmd_kref if
+		 * it had been previously dynamically generated, and
+		 * the endpoint is not caching dynamic ACLs.
+		 */
+		mutex_lock(&se_tpg->acl_node_mutex);
+		if (se_nacl->dynamic_node_acl &&
+		    !se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
+			spin_lock_irqsave(&se_nacl->nacl_sess_lock, flags);
+			if (list_empty(&se_nacl->acl_sess_list))
+				se_nacl->dynamic_stop = true;
+			spin_unlock_irqrestore(&se_nacl->nacl_sess_lock, flags);
+
+			if (se_nacl->dynamic_stop)
+				list_del_init(&se_nacl->acl_list);
+		}
+		mutex_unlock(&se_tpg->acl_node_mutex);
+
+		if (se_nacl->dynamic_stop)
+			target_put_nacl(se_nacl);
+
 		target_put_nacl(se_nacl);
 	}
 	if (se_sess->sess_cmd_map) {
@@ -518,16 +557,12 @@ void transport_free_session(struct se_session *se_sess)
 void transport_deregister_session(struct se_session *se_sess)
 {
 	struct se_portal_group *se_tpg = se_sess->se_tpg;
-	const struct target_core_fabric_ops *se_tfo;
-	struct se_node_acl *se_nacl;
 	unsigned long flags;
-	bool drop_nacl = false;
 
 	if (!se_tpg) {
 		transport_free_session(se_sess);
 		return;
 	}
-	se_tfo = se_tpg->se_tpg_tfo;
 
 	spin_lock_irqsave(&se_tpg->session_lock, flags);
 	list_del(&se_sess->sess_list);
@@ -535,33 +570,15 @@ void transport_deregister_session(struct se_session *se_sess)
 	se_sess->fabric_sess_ptr = NULL;
 	spin_unlock_irqrestore(&se_tpg->session_lock, flags);
 
-	/*
-	 * Determine if we need to do extra work for this initiator node's
-	 * struct se_node_acl if it had been previously dynamically generated.
-	 */
-	se_nacl = se_sess->se_node_acl;
-
-	mutex_lock(&se_tpg->acl_node_mutex);
-	if (se_nacl && se_nacl->dynamic_node_acl) {
-		if (!se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
-			list_del(&se_nacl->acl_list);
-			drop_nacl = true;
-		}
-	}
-	mutex_unlock(&se_tpg->acl_node_mutex);
-
-	if (drop_nacl) {
-		core_tpg_wait_for_nacl_pr_ref(se_nacl);
-		core_free_device_list_for_node(se_nacl, se_tpg);
-		se_sess->se_node_acl = NULL;
-		kfree(se_nacl);
-	}
 	pr_debug("TARGET_CORE[%s]: Deregistered fabric_sess\n",
 		se_tpg->se_tpg_tfo->get_fabric_name());
 	/*
 	 * If last kref is dropping now for an explicit NodeACL, awake sleeping
 	 * ->acl_free_comp caller to wakeup configfs se_node_acl->acl_group
 	 * removal context from within transport_free_session() code.
+	 *
+	 * For dynamic ACL, target_put_nacl() uses target_complete_nacl()
+	 * to release all remaining generate_node_acl=1 created ACL resources.
 	 */
 
 	transport_free_session(se_sess);
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 43edf82..da854fb 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -538,6 +538,7 @@ struct se_node_acl {
 	char			initiatorname[TRANSPORT_IQN_LEN];
 	/* Used to signal demo mode created ACL, disabled by default */
 	bool			dynamic_node_acl;
+	bool			dynamic_stop;
 	u32			queue_depth;
 	u32			acl_index;
 	enum target_prot_type	saved_prot_type;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] target: Fix COMPARE_AND_WRITE ref leak for non GOOD status
  2017-02-07 13:17 [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10 Nicholas A. Bellinger
                   ` (3 preceding siblings ...)
  2017-02-07 13:17 ` [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs Nicholas A. Bellinger
@ 2017-02-07 13:17 ` Nicholas A. Bellinger
  2017-02-07 22:51   ` Christoph Hellwig
  4 siblings, 1 reply; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-07 13:17 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, lkml, Nicholas Bellinger, Donald White, Gary Guo

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch addresses a long standing bug where the commit phase
of COMPARE_AND_WRITE would result in a se_cmd->cmd_kref reference
leak if se_cmd->scsi_status returned non SAM_STAT_GOOD.

This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.

To address this bug, compare_and_write_post() has been changed
to drop the incorrect !cmd->scsi_status conditional that was
preventing *post_ret = 1 for being set during non SAM_STAT_GOOD
status.

This patch has been tested with SAM_STAT_CHECK_CONDITION status
from normal target_complete_cmd() callback path, as well as the
incoming __target_execute_cmd() submission failure path when
se_cmd->execute_cmd() returns non zero status.

Reported-by: Donald White <dew@datera.io>
Cc: Donald White <dew@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_sbc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 4879e70..df7b6e9 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -451,6 +451,7 @@ static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success,
 					     int *post_ret)
 {
 	struct se_device *dev = cmd->se_dev;
+	sense_reason_t ret = TCM_NO_SENSE;
 
 	/*
 	 * Only set SCF_COMPARE_AND_WRITE_POST to force a response fall-through
@@ -458,9 +459,12 @@ static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success,
 	 * sent to the backend driver.
 	 */
 	spin_lock_irq(&cmd->t_state_lock);
-	if ((cmd->transport_state & CMD_T_SENT) && !cmd->scsi_status) {
+	if (cmd->transport_state & CMD_T_SENT) {
 		cmd->se_cmd_flags |= SCF_COMPARE_AND_WRITE_POST;
 		*post_ret = 1;
+
+		if (cmd->scsi_status == SAM_STAT_CHECK_CONDITION)
+			ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 	}
 	spin_unlock_irq(&cmd->t_state_lock);
 
@@ -470,7 +474,7 @@ static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success,
 	 */
 	up(&dev->caw_sem);
 
-	return TCM_NO_SENSE;
+	return ret;
 }
 
 static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool success,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] target: Use correct SCSI status during EXTENDED_COPY exception
  2017-02-07 13:17 ` [PATCH 2/5] target: Use correct SCSI status during EXTENDED_COPY exception Nicholas A. Bellinger
@ 2017-02-07 22:39   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2017-02-07 22:39 UTC (permalink / raw)
  To: Nicholas A. Bellinger; +Cc: target-devel, linux-scsi, lkml, Nixon Vincent

Looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
  2017-02-07 13:17 ` [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion Nicholas A. Bellinger
@ 2017-02-07 22:44   ` Christoph Hellwig
  2017-02-08 16:16     ` Nicholas A. Bellinger
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2017-02-07 22:44 UTC (permalink / raw)
  To: Nicholas A. Bellinger; +Cc: target-devel, linux-scsi, lkml, Benjamin ESTRABAUD

On Tue, Feb 07, 2017 at 01:17:46PM +0000, Nicholas A. Bellinger wrote:
> +		if (orig->se_lun_acl != NULL) {
> +			pr_warn_ratelimited("Detected existing explicit"
> +				" se_lun_acl->se_lun_group reference for %s"
> +				" mapped_lun: %llu, ignoring\n",
> +				 nacl->initiatorname, mapped_lun);

The ignoring in the message confused the heck out of me first.  But it 
seems that's just an incorrect leftover from the original message, as the
changelog also says fail instead.  With that fixed up (and maybe the
whole message in a single string literal on a single line):

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario
  2017-02-07 13:17 ` [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario Nicholas A. Bellinger
@ 2017-02-07 22:45   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2017-02-07 22:45 UTC (permalink / raw)
  To: Nicholas A. Bellinger; +Cc: target-devel, linux-scsi, lkml, Rob Millner

On Tue, Feb 07, 2017 at 01:17:48PM +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> This patch fixes a bug where incoming task management requests
> can be explicitly aborted during an active LUN_RESET, but who's
> struct work_struct are canceled in-flight before execution.
> 
> This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
> for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
> for target_tmr_work() never getting invoked and the aborted TMR
> waiting indefinately within transport_wait_for_tasks().
> 
> To address this case, perform a CMD_T_ABORTED check early in
> transport_generic_handle_tmr(), and invoke the normal path via
> transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
> blocked waiting for CMD_T_STOP in transport_wait_for_tasks().
> 
> Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
> into transport_generic_handle_tmr() so the existing check in
> core_tmr_drain_tmr_list() avoids attempting abort the incoming
> se_tmr_req->task_cmd->work if it has already been queued into
> se_device->tmr_wq.
> 
> Reported-by: Rob Millner <rlm@daterainc.com>
> Tested-by: Rob Millner <rlm@daterainc.com>
> Cc: Rob Millner <rlm@daterainc.com>
> Cc: stable@vger.kernel.org # 3.14+
> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
> ---
>  drivers/target/target_core_transport.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 1cadc9e..8b69843 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -3110,7 +3110,6 @@ static void target_tmr_work(struct work_struct *work)
>  		spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>  		goto check_stop;
>  	}
> -	cmd->t_state = TRANSPORT_ISTATE_PROCESSING;
>  	spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>  
>  	cmd->se_tfo->queue_tm_rsp(cmd);
> @@ -3123,11 +3122,25 @@ int transport_generic_handle_tmr(
>  	struct se_cmd *cmd)
>  {
>  	unsigned long flags;
> +	bool aborted = false;
>  
>  	spin_lock_irqsave(&cmd->t_state_lock, flags);
> -	cmd->transport_state |= CMD_T_ACTIVE;
> +	if (cmd->transport_state & CMD_T_ABORTED) {
> +		aborted = true;
> +	} else {
> +		cmd->t_state = TRANSPORT_ISTATE_PROCESSING;
> +		cmd->transport_state |= CMD_T_ACTIVE;
> +	}
>  	spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>  
> +	if (aborted) {
> +		pr_warn_ratelimited("handle_tmr caught CMD_T_ABORTED TMR %d"
> +			"ref_tag: %llu tag: %llu\n", cmd->se_tmr_req->function,
> +			cmd->se_tmr_req->ref_task_tag, cmd->tag);

This seems pretty noisy for something that could happen during normal
operation.  Otherwise:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] target: Fix COMPARE_AND_WRITE ref leak for non GOOD status
  2017-02-07 13:17 ` [PATCH 5/5] target: Fix COMPARE_AND_WRITE ref leak for non GOOD status Nicholas A. Bellinger
@ 2017-02-07 22:51   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2017-02-07 22:51 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: target-devel, linux-scsi, lkml, Donald White, Gary Guo

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs
  2017-02-07 13:17 ` [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs Nicholas A. Bellinger
@ 2017-02-07 23:07   ` Christoph Hellwig
  2017-02-07 23:12     ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2017-02-07 23:07 UTC (permalink / raw)
  To: Nicholas A. Bellinger; +Cc: target-devel, linux-scsi, lkml, Rob Millner

On Tue, Feb 07, 2017 at 01:17:49PM +0000, Nicholas A. Bellinger wrote:
> -	list_del(&acl->acl_list);
> +	list_del_init(&acl->acl_list);

All these list_del_init changes don't make sense to me - the whole target
code never does a list_empty check on ->acl_list.


Looking further I think all nacls should be switched to the direct
free from the kref release callback scheme - there is no real need
to wait for freeing the nacl I think.

Untested patch below:

diff --git a/drivers/target/target_core_internal.h b/drivers/target/target_core_internal.h
index 9ab7090f7c83..96c38f30069d 100644
--- a/drivers/target/target_core_internal.h
+++ b/drivers/target/target_core_internal.h
@@ -152,6 +152,7 @@ void	target_qf_do_work(struct work_struct *work);
 bool	target_check_wce(struct se_device *dev);
 bool	target_check_fua(struct se_device *dev);
 void	__target_execute_cmd(struct se_cmd *, bool);
+void	target_put_nacl(struct se_node_acl *);
 
 /* target_core_stat.c */
 void	target_stat_setup_dev_default_groups(struct se_device *);
diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index d99752c6cd60..08738c87e5b0 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -197,7 +197,6 @@ static struct se_node_acl *target_alloc_node_acl(struct se_portal_group *tpg,
 	INIT_LIST_HEAD(&acl->acl_sess_list);
 	INIT_HLIST_HEAD(&acl->lun_entry_hlist);
 	kref_init(&acl->acl_kref);
-	init_completion(&acl->acl_free_comp);
 	spin_lock_init(&acl->nacl_sess_lock);
 	mutex_init(&acl->lun_entry_mutex);
 	atomic_set(&acl->acl_pr_ref_count, 0);
@@ -370,21 +369,6 @@ void core_tpg_del_initiator_node_acl(struct se_node_acl *acl)
 	target_shutdown_sessions(acl);
 
 	target_put_nacl(acl);
-	/*
-	 * Wait for last target_put_nacl() to complete in target_complete_nacl()
-	 * for active fabric session transport_deregister_session() callbacks.
-	 */
-	wait_for_completion(&acl->acl_free_comp);
-
-	core_tpg_wait_for_nacl_pr_ref(acl);
-	core_free_device_list_for_node(acl, tpg);
-
-	pr_debug("%s_TPG[%hu] - Deleted ACL with TCQ Depth: %d for %s"
-		" Initiator Node: %s\n", tpg->se_tpg_tfo->get_fabric_name(),
-		tpg->se_tpg_tfo->tpg_get_tag(tpg), acl->queue_depth,
-		tpg->se_tpg_tfo->get_fabric_name(), acl->initiatorname);
-
-	kfree(acl);
 }
 
 /*	core_tpg_set_initiator_node_queue_depth():
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 1cadc9eefa21..72718283e553 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -453,19 +453,25 @@ ssize_t target_show_dynamic_sessions(struct se_portal_group *se_tpg, char *page)
 }
 EXPORT_SYMBOL(target_show_dynamic_sessions);
 
-static void target_complete_nacl(struct kref *kref)
+static void target_destroy_nacl(struct kref *kref)
 {
 	struct se_node_acl *nacl = container_of(kref,
 				struct se_node_acl, acl_kref);
+	struct se_portal_group *se_tpg = nacl->se_tpg;
 
-	complete(&nacl->acl_free_comp);
+	mutex_lock(&se_tpg->acl_node_mutex);
+	list_del(&nacl->acl_list);
+	mutex_unlock(&se_tpg->acl_node_mutex);
+
+	core_tpg_wait_for_nacl_pr_ref(nacl);
+	core_free_device_list_for_node(nacl, se_tpg);
+	kfree(nacl);
 }
 
 void target_put_nacl(struct se_node_acl *nacl)
 {
-	kref_put(&nacl->acl_kref, target_complete_nacl);
+	kref_put(&nacl->acl_kref, target_destroy_nacl);
 }
-EXPORT_SYMBOL(target_put_nacl);
 
 void transport_deregister_session_configfs(struct se_session *se_sess)
 {
@@ -499,12 +505,39 @@ EXPORT_SYMBOL(transport_deregister_session_configfs);
 void transport_free_session(struct se_session *se_sess)
 {
 	struct se_node_acl *se_nacl = se_sess->se_node_acl;
+
 	/*
 	 * Drop the se_node_acl->nacl_kref obtained from within
 	 * core_tpg_get_initiator_node_acl().
 	 */
 	if (se_nacl) {
+		struct se_portal_group *se_tpg = se_nacl->se_tpg;
+		const struct target_core_fabric_ops *se_tfo = se_tpg->se_tpg_tfo;
+		unsigned long flags;
+
 		se_sess->se_node_acl = NULL;
+
+		/*
+		 * Also determine if we need to drop the extra ->cmd_kref if
+		 * it had been previously dynamically generated, and
+		 * the endpoint is not caching dynamic ACLs.
+		 */
+		mutex_lock(&se_tpg->acl_node_mutex);
+		if (se_nacl->dynamic_node_acl &&
+		    !se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
+			spin_lock_irqsave(&se_nacl->nacl_sess_lock, flags);
+			if (list_empty(&se_nacl->acl_sess_list))
+				stop = true;
+			spin_unlock_irqrestore(&se_nacl->nacl_sess_lock, flags);
+
+			if (stop)
+				list_del_init(&se_nacl->acl_list);
+		}
+		mutex_unlock(&se_tpg->acl_node_mutex);
+
+		if (stop)
+			target_put_nacl(se_nacl);
+
 		target_put_nacl(se_nacl);
 	}
 	if (se_sess->sess_cmd_map) {
@@ -518,16 +551,12 @@ EXPORT_SYMBOL(transport_free_session);
 void transport_deregister_session(struct se_session *se_sess)
 {
 	struct se_portal_group *se_tpg = se_sess->se_tpg;
-	const struct target_core_fabric_ops *se_tfo;
-	struct se_node_acl *se_nacl;
 	unsigned long flags;
-	bool drop_nacl = false;
 
 	if (!se_tpg) {
 		transport_free_session(se_sess);
 		return;
 	}
-	se_tfo = se_tpg->se_tpg_tfo;
 
 	spin_lock_irqsave(&se_tpg->session_lock, flags);
 	list_del(&se_sess->sess_list);
@@ -535,35 +564,8 @@ void transport_deregister_session(struct se_session *se_sess)
 	se_sess->fabric_sess_ptr = NULL;
 	spin_unlock_irqrestore(&se_tpg->session_lock, flags);
 
-	/*
-	 * Determine if we need to do extra work for this initiator node's
-	 * struct se_node_acl if it had been previously dynamically generated.
-	 */
-	se_nacl = se_sess->se_node_acl;
-
-	mutex_lock(&se_tpg->acl_node_mutex);
-	if (se_nacl && se_nacl->dynamic_node_acl) {
-		if (!se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
-			list_del(&se_nacl->acl_list);
-			drop_nacl = true;
-		}
-	}
-	mutex_unlock(&se_tpg->acl_node_mutex);
-
-	if (drop_nacl) {
-		core_tpg_wait_for_nacl_pr_ref(se_nacl);
-		core_free_device_list_for_node(se_nacl, se_tpg);
-		se_sess->se_node_acl = NULL;
-		kfree(se_nacl);
-	}
 	pr_debug("TARGET_CORE[%s]: Deregistered fabric_sess\n",
 		se_tpg->se_tpg_tfo->get_fabric_name());
-	/*
-	 * If last kref is dropping now for an explicit NodeACL, awake sleeping
-	 * ->acl_free_comp caller to wakeup configfs se_node_acl->acl_group
-	 * removal context from within transport_free_session() code.
-	 */
-
 	transport_free_session(se_sess);
 }
 EXPORT_SYMBOL(transport_deregister_session);
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 43edf82e54ff..edad452c3c25 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -557,7 +557,6 @@ struct se_node_acl {
 	struct config_group	acl_fabric_stat_group;
 	struct list_head	acl_list;
 	struct list_head	acl_sess_list;
-	struct completion	acl_free_comp;
 	struct kref		acl_kref;
 };
 
diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
index 358041bad1da..1c417731b63d 100644
--- a/include/target/target_core_fabric.h
+++ b/include/target/target_core_fabric.h
@@ -125,7 +125,6 @@ void	transport_register_session(struct se_portal_group *,
 		struct se_node_acl *, struct se_session *, void *);
 ssize_t	target_show_dynamic_sessions(struct se_portal_group *, char *);
 void	transport_free_session(struct se_session *);
-void	target_put_nacl(struct se_node_acl *);
 void	transport_deregister_session_configfs(struct se_session *);
 void	transport_deregister_session(struct se_session *);
 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs
  2017-02-07 23:07   ` Christoph Hellwig
@ 2017-02-07 23:12     ` Christoph Hellwig
  2017-02-08  3:46       ` Nicholas A. Bellinger
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2017-02-07 23:12 UTC (permalink / raw)
  To: Nicholas A. Bellinger; +Cc: target-devel, linux-scsi, lkml, Rob Millner

And the real patch after compile fixing it is here of course:

diff --git a/drivers/target/target_core_internal.h b/drivers/target/target_core_internal.h
index 9ab7090f7c83..96c38f30069d 100644
--- a/drivers/target/target_core_internal.h
+++ b/drivers/target/target_core_internal.h
@@ -152,6 +152,7 @@ void	target_qf_do_work(struct work_struct *work);
 bool	target_check_wce(struct se_device *dev);
 bool	target_check_fua(struct se_device *dev);
 void	__target_execute_cmd(struct se_cmd *, bool);
+void	target_put_nacl(struct se_node_acl *);
 
 /* target_core_stat.c */
 void	target_stat_setup_dev_default_groups(struct se_device *);
diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index d99752c6cd60..08738c87e5b0 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -197,7 +197,6 @@ static struct se_node_acl *target_alloc_node_acl(struct se_portal_group *tpg,
 	INIT_LIST_HEAD(&acl->acl_sess_list);
 	INIT_HLIST_HEAD(&acl->lun_entry_hlist);
 	kref_init(&acl->acl_kref);
-	init_completion(&acl->acl_free_comp);
 	spin_lock_init(&acl->nacl_sess_lock);
 	mutex_init(&acl->lun_entry_mutex);
 	atomic_set(&acl->acl_pr_ref_count, 0);
@@ -370,21 +369,6 @@ void core_tpg_del_initiator_node_acl(struct se_node_acl *acl)
 	target_shutdown_sessions(acl);
 
 	target_put_nacl(acl);
-	/*
-	 * Wait for last target_put_nacl() to complete in target_complete_nacl()
-	 * for active fabric session transport_deregister_session() callbacks.
-	 */
-	wait_for_completion(&acl->acl_free_comp);
-
-	core_tpg_wait_for_nacl_pr_ref(acl);
-	core_free_device_list_for_node(acl, tpg);
-
-	pr_debug("%s_TPG[%hu] - Deleted ACL with TCQ Depth: %d for %s"
-		" Initiator Node: %s\n", tpg->se_tpg_tfo->get_fabric_name(),
-		tpg->se_tpg_tfo->tpg_get_tag(tpg), acl->queue_depth,
-		tpg->se_tpg_tfo->get_fabric_name(), acl->initiatorname);
-
-	kfree(acl);
 }
 
 /*	core_tpg_set_initiator_node_queue_depth():
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 1cadc9eefa21..9ebd86a8ef41 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -453,19 +453,25 @@ ssize_t target_show_dynamic_sessions(struct se_portal_group *se_tpg, char *page)
 }
 EXPORT_SYMBOL(target_show_dynamic_sessions);
 
-static void target_complete_nacl(struct kref *kref)
+static void target_destroy_nacl(struct kref *kref)
 {
 	struct se_node_acl *nacl = container_of(kref,
 				struct se_node_acl, acl_kref);
+	struct se_portal_group *se_tpg = nacl->se_tpg;
 
-	complete(&nacl->acl_free_comp);
+	mutex_lock(&se_tpg->acl_node_mutex);
+	list_del(&nacl->acl_list);
+	mutex_unlock(&se_tpg->acl_node_mutex);
+
+	core_tpg_wait_for_nacl_pr_ref(nacl);
+	core_free_device_list_for_node(nacl, se_tpg);
+	kfree(nacl);
 }
 
 void target_put_nacl(struct se_node_acl *nacl)
 {
-	kref_put(&nacl->acl_kref, target_complete_nacl);
+	kref_put(&nacl->acl_kref, target_destroy_nacl);
 }
-EXPORT_SYMBOL(target_put_nacl);
 
 void transport_deregister_session_configfs(struct se_session *se_sess)
 {
@@ -499,12 +505,40 @@ EXPORT_SYMBOL(transport_deregister_session_configfs);
 void transport_free_session(struct se_session *se_sess)
 {
 	struct se_node_acl *se_nacl = se_sess->se_node_acl;
+
 	/*
 	 * Drop the se_node_acl->nacl_kref obtained from within
 	 * core_tpg_get_initiator_node_acl().
 	 */
 	if (se_nacl) {
+		struct se_portal_group *se_tpg = se_nacl->se_tpg;
+		const struct target_core_fabric_ops *se_tfo = se_tpg->se_tpg_tfo;
+		unsigned long flags;
+		bool stop = false;
+
 		se_sess->se_node_acl = NULL;
+
+		/*
+		 * Also determine if we need to drop the extra ->cmd_kref if
+		 * it had been previously dynamically generated, and
+		 * the endpoint is not caching dynamic ACLs.
+		 */
+		mutex_lock(&se_tpg->acl_node_mutex);
+		if (se_nacl->dynamic_node_acl &&
+		    !se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
+			spin_lock_irqsave(&se_nacl->nacl_sess_lock, flags);
+			if (list_empty(&se_nacl->acl_sess_list))
+				stop = true;
+			spin_unlock_irqrestore(&se_nacl->nacl_sess_lock, flags);
+
+			if (stop)
+				list_del_init(&se_nacl->acl_list);
+		}
+		mutex_unlock(&se_tpg->acl_node_mutex);
+
+		if (stop)
+			target_put_nacl(se_nacl);
+
 		target_put_nacl(se_nacl);
 	}
 	if (se_sess->sess_cmd_map) {
@@ -518,16 +552,12 @@ EXPORT_SYMBOL(transport_free_session);
 void transport_deregister_session(struct se_session *se_sess)
 {
 	struct se_portal_group *se_tpg = se_sess->se_tpg;
-	const struct target_core_fabric_ops *se_tfo;
-	struct se_node_acl *se_nacl;
 	unsigned long flags;
-	bool drop_nacl = false;
 
 	if (!se_tpg) {
 		transport_free_session(se_sess);
 		return;
 	}
-	se_tfo = se_tpg->se_tpg_tfo;
 
 	spin_lock_irqsave(&se_tpg->session_lock, flags);
 	list_del(&se_sess->sess_list);
@@ -535,35 +565,8 @@ void transport_deregister_session(struct se_session *se_sess)
 	se_sess->fabric_sess_ptr = NULL;
 	spin_unlock_irqrestore(&se_tpg->session_lock, flags);
 
-	/*
-	 * Determine if we need to do extra work for this initiator node's
-	 * struct se_node_acl if it had been previously dynamically generated.
-	 */
-	se_nacl = se_sess->se_node_acl;
-
-	mutex_lock(&se_tpg->acl_node_mutex);
-	if (se_nacl && se_nacl->dynamic_node_acl) {
-		if (!se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
-			list_del(&se_nacl->acl_list);
-			drop_nacl = true;
-		}
-	}
-	mutex_unlock(&se_tpg->acl_node_mutex);
-
-	if (drop_nacl) {
-		core_tpg_wait_for_nacl_pr_ref(se_nacl);
-		core_free_device_list_for_node(se_nacl, se_tpg);
-		se_sess->se_node_acl = NULL;
-		kfree(se_nacl);
-	}
 	pr_debug("TARGET_CORE[%s]: Deregistered fabric_sess\n",
 		se_tpg->se_tpg_tfo->get_fabric_name());
-	/*
-	 * If last kref is dropping now for an explicit NodeACL, awake sleeping
-	 * ->acl_free_comp caller to wakeup configfs se_node_acl->acl_group
-	 * removal context from within transport_free_session() code.
-	 */
-
 	transport_free_session(se_sess);
 }
 EXPORT_SYMBOL(transport_deregister_session);
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 43edf82e54ff..edad452c3c25 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -557,7 +557,6 @@ struct se_node_acl {
 	struct config_group	acl_fabric_stat_group;
 	struct list_head	acl_list;
 	struct list_head	acl_sess_list;
-	struct completion	acl_free_comp;
 	struct kref		acl_kref;
 };
 
diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
index 358041bad1da..1c417731b63d 100644
--- a/include/target/target_core_fabric.h
+++ b/include/target/target_core_fabric.h
@@ -125,7 +125,6 @@ void	transport_register_session(struct se_portal_group *,
 		struct se_node_acl *, struct se_session *, void *);
 ssize_t	target_show_dynamic_sessions(struct se_portal_group *, char *);
 void	transport_free_session(struct se_session *);
-void	target_put_nacl(struct se_node_acl *);
 void	transport_deregister_session_configfs(struct se_session *);
 void	transport_deregister_session(struct se_session *);
 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs
  2017-02-07 23:12     ` Christoph Hellwig
@ 2017-02-08  3:46       ` Nicholas A. Bellinger
  2017-02-08 16:14         ` Nicholas A. Bellinger
  0 siblings, 1 reply; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-08  3:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: target-devel, linux-scsi, lkml, Rob Millner

On Tue, 2017-02-07 at 15:12 -0800, Christoph Hellwig wrote:
> And the real patch after compile fixing it is here of course:
> 

Getting rid of the extra se_node_acl->acl_free_comp seems to make sense
here..

The only potential issue is if returning from configfs syscall
rmdir /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/acls/$INITIATOR/
before se_node_acl memory is released has implications when the
underlying ../$WWN/$TPGT/ is removed immediately after.

In any event, I'll verify this patch with the original test case and use
it instead if everything looks OK.

> diff --git a/drivers/target/target_core_internal.h b/drivers/target/target_core_internal.h
> index 9ab7090f7c83..96c38f30069d 100644
> --- a/drivers/target/target_core_internal.h
> +++ b/drivers/target/target_core_internal.h
> @@ -152,6 +152,7 @@ void	target_qf_do_work(struct work_struct *work);
>  bool	target_check_wce(struct se_device *dev);
>  bool	target_check_fua(struct se_device *dev);
>  void	__target_execute_cmd(struct se_cmd *, bool);
> +void	target_put_nacl(struct se_node_acl *);
>  
>  /* target_core_stat.c */
>  void	target_stat_setup_dev_default_groups(struct se_device *);
> diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
> index d99752c6cd60..08738c87e5b0 100644
> --- a/drivers/target/target_core_tpg.c
> +++ b/drivers/target/target_core_tpg.c
> @@ -197,7 +197,6 @@ static struct se_node_acl *target_alloc_node_acl(struct se_portal_group *tpg,
>  	INIT_LIST_HEAD(&acl->acl_sess_list);
>  	INIT_HLIST_HEAD(&acl->lun_entry_hlist);
>  	kref_init(&acl->acl_kref);
> -	init_completion(&acl->acl_free_comp);
>  	spin_lock_init(&acl->nacl_sess_lock);
>  	mutex_init(&acl->lun_entry_mutex);
>  	atomic_set(&acl->acl_pr_ref_count, 0);
> @@ -370,21 +369,6 @@ void core_tpg_del_initiator_node_acl(struct se_node_acl *acl)
>  	target_shutdown_sessions(acl);
>  
>  	target_put_nacl(acl);
> -	/*
> -	 * Wait for last target_put_nacl() to complete in target_complete_nacl()
> -	 * for active fabric session transport_deregister_session() callbacks.
> -	 */
> -	wait_for_completion(&acl->acl_free_comp);
> -
> -	core_tpg_wait_for_nacl_pr_ref(acl);
> -	core_free_device_list_for_node(acl, tpg);
> -
> -	pr_debug("%s_TPG[%hu] - Deleted ACL with TCQ Depth: %d for %s"
> -		" Initiator Node: %s\n", tpg->se_tpg_tfo->get_fabric_name(),
> -		tpg->se_tpg_tfo->tpg_get_tag(tpg), acl->queue_depth,
> -		tpg->se_tpg_tfo->get_fabric_name(), acl->initiatorname);
> -
> -	kfree(acl);
>  }
>  
>  /*	core_tpg_set_initiator_node_queue_depth():
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 1cadc9eefa21..9ebd86a8ef41 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -453,19 +453,25 @@ ssize_t target_show_dynamic_sessions(struct se_portal_group *se_tpg, char *page)
>  }
>  EXPORT_SYMBOL(target_show_dynamic_sessions);
>  
> -static void target_complete_nacl(struct kref *kref)
> +static void target_destroy_nacl(struct kref *kref)
>  {
>  	struct se_node_acl *nacl = container_of(kref,
>  				struct se_node_acl, acl_kref);
> +	struct se_portal_group *se_tpg = nacl->se_tpg;
>  
> -	complete(&nacl->acl_free_comp);
> +	mutex_lock(&se_tpg->acl_node_mutex);
> +	list_del(&nacl->acl_list);
> +	mutex_unlock(&se_tpg->acl_node_mutex);
> +
> +	core_tpg_wait_for_nacl_pr_ref(nacl);
> +	core_free_device_list_for_node(nacl, se_tpg);
> +	kfree(nacl);
>  }
>  
>  void target_put_nacl(struct se_node_acl *nacl)
>  {
> -	kref_put(&nacl->acl_kref, target_complete_nacl);
> +	kref_put(&nacl->acl_kref, target_destroy_nacl);
>  }
> -EXPORT_SYMBOL(target_put_nacl);
>  
>  void transport_deregister_session_configfs(struct se_session *se_sess)
>  {
> @@ -499,12 +505,40 @@ EXPORT_SYMBOL(transport_deregister_session_configfs);
>  void transport_free_session(struct se_session *se_sess)
>  {
>  	struct se_node_acl *se_nacl = se_sess->se_node_acl;
> +
>  	/*
>  	 * Drop the se_node_acl->nacl_kref obtained from within
>  	 * core_tpg_get_initiator_node_acl().
>  	 */
>  	if (se_nacl) {
> +		struct se_portal_group *se_tpg = se_nacl->se_tpg;
> +		const struct target_core_fabric_ops *se_tfo = se_tpg->se_tpg_tfo;
> +		unsigned long flags;
> +		bool stop = false;
> +
>  		se_sess->se_node_acl = NULL;
> +
> +		/*
> +		 * Also determine if we need to drop the extra ->cmd_kref if
> +		 * it had been previously dynamically generated, and
> +		 * the endpoint is not caching dynamic ACLs.
> +		 */
> +		mutex_lock(&se_tpg->acl_node_mutex);
> +		if (se_nacl->dynamic_node_acl &&
> +		    !se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
> +			spin_lock_irqsave(&se_nacl->nacl_sess_lock, flags);
> +			if (list_empty(&se_nacl->acl_sess_list))
> +				stop = true;
> +			spin_unlock_irqrestore(&se_nacl->nacl_sess_lock, flags);
> +
> +			if (stop)
> +				list_del_init(&se_nacl->acl_list);
> +		}
> +		mutex_unlock(&se_tpg->acl_node_mutex);
> +
> +		if (stop)
> +			target_put_nacl(se_nacl);
> +
>  		target_put_nacl(se_nacl);
>  	}
>  	if (se_sess->sess_cmd_map) {
> @@ -518,16 +552,12 @@ EXPORT_SYMBOL(transport_free_session);
>  void transport_deregister_session(struct se_session *se_sess)
>  {
>  	struct se_portal_group *se_tpg = se_sess->se_tpg;
> -	const struct target_core_fabric_ops *se_tfo;
> -	struct se_node_acl *se_nacl;
>  	unsigned long flags;
> -	bool drop_nacl = false;
>  
>  	if (!se_tpg) {
>  		transport_free_session(se_sess);
>  		return;
>  	}
> -	se_tfo = se_tpg->se_tpg_tfo;
>  
>  	spin_lock_irqsave(&se_tpg->session_lock, flags);
>  	list_del(&se_sess->sess_list);
> @@ -535,35 +565,8 @@ void transport_deregister_session(struct se_session *se_sess)
>  	se_sess->fabric_sess_ptr = NULL;
>  	spin_unlock_irqrestore(&se_tpg->session_lock, flags);
>  
> -	/*
> -	 * Determine if we need to do extra work for this initiator node's
> -	 * struct se_node_acl if it had been previously dynamically generated.
> -	 */
> -	se_nacl = se_sess->se_node_acl;
> -
> -	mutex_lock(&se_tpg->acl_node_mutex);
> -	if (se_nacl && se_nacl->dynamic_node_acl) {
> -		if (!se_tfo->tpg_check_demo_mode_cache(se_tpg)) {
> -			list_del(&se_nacl->acl_list);
> -			drop_nacl = true;
> -		}
> -	}
> -	mutex_unlock(&se_tpg->acl_node_mutex);
> -
> -	if (drop_nacl) {
> -		core_tpg_wait_for_nacl_pr_ref(se_nacl);
> -		core_free_device_list_for_node(se_nacl, se_tpg);
> -		se_sess->se_node_acl = NULL;
> -		kfree(se_nacl);
> -	}
>  	pr_debug("TARGET_CORE[%s]: Deregistered fabric_sess\n",
>  		se_tpg->se_tpg_tfo->get_fabric_name());
> -	/*
> -	 * If last kref is dropping now for an explicit NodeACL, awake sleeping
> -	 * ->acl_free_comp caller to wakeup configfs se_node_acl->acl_group
> -	 * removal context from within transport_free_session() code.
> -	 */
> -
>  	transport_free_session(se_sess);
>  }
>  EXPORT_SYMBOL(transport_deregister_session);
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index 43edf82e54ff..edad452c3c25 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -557,7 +557,6 @@ struct se_node_acl {
>  	struct config_group	acl_fabric_stat_group;
>  	struct list_head	acl_list;
>  	struct list_head	acl_sess_list;
> -	struct completion	acl_free_comp;
>  	struct kref		acl_kref;
>  };
>  
> diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h
> index 358041bad1da..1c417731b63d 100644
> --- a/include/target/target_core_fabric.h
> +++ b/include/target/target_core_fabric.h
> @@ -125,7 +125,6 @@ void	transport_register_session(struct se_portal_group *,
>  		struct se_node_acl *, struct se_session *, void *);
>  ssize_t	target_show_dynamic_sessions(struct se_portal_group *, char *);
>  void	transport_free_session(struct se_session *);
> -void	target_put_nacl(struct se_node_acl *);
>  void	transport_deregister_session_configfs(struct se_session *);
>  void	transport_deregister_session(struct se_session *);
>  
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs
  2017-02-08  3:46       ` Nicholas A. Bellinger
@ 2017-02-08 16:14         ` Nicholas A. Bellinger
  0 siblings, 0 replies; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-08 16:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: target-devel, linux-scsi, lkml, Rob Millner

On Tue, 2017-02-07 at 19:46 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2017-02-07 at 15:12 -0800, Christoph Hellwig wrote:
> > And the real patch after compile fixing it is here of course:
> > 
> 
> Getting rid of the extra se_node_acl->acl_free_comp seems to make sense
> here..
> 
> The only potential issue is if returning from configfs syscall
> rmdir /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/acls/$INITIATOR/
> before se_node_acl memory is released has implications when the
> underlying ../$WWN/$TPGT/ is removed immediately after.
> 
> In any event, I'll verify this patch with the original test case and use
> it instead if everything looks OK.

Ok, I remember why se_node_acl->acl_free_comp is needed now..

The issue is when se_node_acl is being explicitly removed from configfs
while se_sessions associated with se_node_acl are still active.

The se_node_acl->acl_group configfs_group_operations->drop_item()
callback must block until all associated se_sessions have done their
target_put_nacl() and completed se_node_acl->acl_free_comp, otherwise
there is nothing preventing parent config_groups of se_node_acl from
being removed while the se_sessions are still active.

That said, dropping the unnecessary list_del_init() usage, but otherwise
keeping the patch as-is.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
  2017-02-07 22:44   ` Christoph Hellwig
@ 2017-02-08 16:16     ` Nicholas A. Bellinger
  0 siblings, 0 replies; 15+ messages in thread
From: Nicholas A. Bellinger @ 2017-02-08 16:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: target-devel, linux-scsi, lkml, Benjamin ESTRABAUD

On Tue, 2017-02-07 at 14:44 -0800, Christoph Hellwig wrote:
> On Tue, Feb 07, 2017 at 01:17:46PM +0000, Nicholas A. Bellinger wrote:
> > +		if (orig->se_lun_acl != NULL) {
> > +			pr_warn_ratelimited("Detected existing explicit"
> > +				" se_lun_acl->se_lun_group reference for %s"
> > +				" mapped_lun: %llu, ignoring\n",
> > +				 nacl->initiatorname, mapped_lun);
> 
> The ignoring in the message confused the heck out of me first.  But it 
> seems that's just an incorrect leftover from the original message, as the
> changelog also says fail instead.  With that fixed up (and maybe the
> whole message in a single string literal on a single line):
> 

Fixed up the message to use 'failed'.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-02-08 16:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-07 13:17 [PATCH 0/5] target: Miscellaneous bug-fixes for >= v4.10 Nicholas A. Bellinger
2017-02-07 13:17 ` [PATCH 1/5] target: Don't BUG_ON during NodeACL dynamic -> explicit conversion Nicholas A. Bellinger
2017-02-07 22:44   ` Christoph Hellwig
2017-02-08 16:16     ` Nicholas A. Bellinger
2017-02-07 13:17 ` [PATCH 2/5] target: Use correct SCSI status during EXTENDED_COPY exception Nicholas A. Bellinger
2017-02-07 22:39   ` Christoph Hellwig
2017-02-07 13:17 ` [PATCH 3/5] target: Fix early transport_generic_handle_tmr abort scenario Nicholas A. Bellinger
2017-02-07 22:45   ` Christoph Hellwig
2017-02-07 13:17 ` [PATCH 4/5] target: Fix multi-session dynamic se_node_acl double free OOPs Nicholas A. Bellinger
2017-02-07 23:07   ` Christoph Hellwig
2017-02-07 23:12     ` Christoph Hellwig
2017-02-08  3:46       ` Nicholas A. Bellinger
2017-02-08 16:14         ` Nicholas A. Bellinger
2017-02-07 13:17 ` [PATCH 5/5] target: Fix COMPARE_AND_WRITE ref leak for non GOOD status Nicholas A. Bellinger
2017-02-07 22:51   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.