All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free
@ 2015-11-15  0:21 Nicholas A. Bellinger
  2015-11-15  0:21 ` [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure Nicholas A. Bellinger
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Nicholas A. Bellinger @ 2015-11-15  0:21 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, Nicholas Bellinger

From: Nicholas Bellinger <nab@linux-iscsi.org>

Hi all,

Here are two patches to address issues encountered over the
last month while stress testing with ESX hosts.

This first is hopefully the last regression around iscsi-target
changes over the last releases to create kthreads on-demand
during login negotiation.  It addresses a case that would end
up leaving left-over iscsi_target_rx_thread() in uninterruptible
sleep, if the failure occured in iscsi_target_do_tx_login_io()
attempting to send the last login response PDU.

The second is a COMPARE_AND_WRITE use-after-free bug, that
is difficult to hit for normal backends, but with just the
right scheduling delays will result in OOPsen.  The problem
centers around the use of SCF_COMPARE_AND_WRITE_POST flag
checking in target_complete_ok_work() to determine the
first or second phase processing of COMPARE_AND_WRITE.

That is, there is nothing that prevents the CAW callbacks
in target_complete_ok_work() from completing in reverse order,
so the dependency on checking cmd->se_cmd_flags is incorrect.
To address this, allow cmd->transport_complete_callback() to
propigate up 'post_ret' to target_complete_ok_work(), and
avoid se_cmd dereference after ->transport_complete_callback().

Both patches are straight-forward fixes, and have been verified
extensively on Linux + ESX hosts the last weeks.

--nab

Nicholas Bellinger (2):
  iscsi-target: Fix rx_login_comp hang after login failure
  target: Fix race for SCF_COMPARE_AND_WRITE_POST checking

 drivers/target/iscsi/iscsi_target.c      | 13 ++++++++++++-
 drivers/target/iscsi/iscsi_target_nego.c |  1 +
 drivers/target/target_core_sbc.c         | 13 +++++++++----
 drivers/target/target_core_transport.c   | 14 ++++++++------
 include/target/target_core_base.h        |  2 +-
 5 files changed, 31 insertions(+), 12 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure
  2015-11-15  0:21 [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Nicholas A. Bellinger
@ 2015-11-15  0:21 ` Nicholas A. Bellinger
  2015-11-16 11:27   ` Sagi Grimberg
  2015-11-15  0:21 ` [PATCH 2/2] target: Fix race for SCF_COMPARE_AND_WRITE_POST checking Nicholas A. Bellinger
  2015-12-18 13:05 ` [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Martin Svec
  2 siblings, 1 reply; 6+ messages in thread
From: Nicholas A. Bellinger @ 2015-11-15  0:21 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, Nicholas Bellinger, Sagi Grimberg

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch addresses a case where iscsi_target_do_tx_login_io()
fails sending the last login response PDU, after the RX/TX
threads have already been started.

The case centers around iscsi_target_rx_thread() not invoking
allow_signal(SIGINT) before the send_sig(SIGINT, ...) occurs
from the failure path, resulting in RX thread hanging
indefinately on iscsi_conn->rx_login_comp.

Note this bug is a regression introduced by:

  commit e54198657b65625085834847ab6271087323ffea
  Author: Nicholas Bellinger <nab@linux-iscsi.org>
  Date:   Wed Jul 22 23:14:19 2015 -0700

      iscsi-target: Fix iscsit_start_kthreads failure OOPs

To address this bug, complete ->rx_login_complete for good
measure in the failure path, and immediately return from
RX thread context if connection state did not actually reach
full feature phase (TARG_CONN_STATE_LOGGED_IN).

Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/iscsi/iscsi_target.c      | 13 ++++++++++++-
 drivers/target/iscsi/iscsi_target_nego.c |  1 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 342a07c..72204fb 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4074,6 +4074,17 @@ reject:
 	return iscsit_add_reject(conn, ISCSI_REASON_BOOKMARK_NO_RESOURCES, buf);
 }
 
+static bool iscsi_target_check_conn_state(struct iscsi_conn *conn)
+{
+	bool ret;
+
+	spin_lock_bh(&conn->state_lock);
+	ret = (conn->conn_state != TARG_CONN_STATE_LOGGED_IN);
+	spin_unlock_bh(&conn->state_lock);
+
+	return ret;
+}
+
 int iscsi_target_rx_thread(void *arg)
 {
 	int ret, rc;
@@ -4091,7 +4102,7 @@ int iscsi_target_rx_thread(void *arg)
 	 * incoming iscsi/tcp socket I/O, and/or failing the connection.
 	 */
 	rc = wait_for_completion_interruptible(&conn->rx_login_comp);
-	if (rc < 0)
+	if (rc < 0 || iscsi_target_check_conn_state(conn))
 		return 0;
 
 	if (conn->conn_transport->transport_type == ISCSI_INFINIBAND) {
diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
index 5c964c0..9fc9117 100644
--- a/drivers/target/iscsi/iscsi_target_nego.c
+++ b/drivers/target/iscsi/iscsi_target_nego.c
@@ -388,6 +388,7 @@ err:
 	if (login->login_complete) {
 		if (conn->rx_thread && conn->rx_thread_active) {
 			send_sig(SIGINT, conn->rx_thread, 1);
+			complete(&conn->rx_login_comp);
 			kthread_stop(conn->rx_thread);
 		}
 		if (conn->tx_thread && conn->tx_thread_active) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
  2015-11-15  0:21 [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Nicholas A. Bellinger
  2015-11-15  0:21 ` [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure Nicholas A. Bellinger
@ 2015-11-15  0:21 ` Nicholas A. Bellinger
  2015-12-18 13:05 ` [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Martin Svec
  2 siblings, 0 replies; 6+ messages in thread
From: Nicholas A. Bellinger @ 2015-11-15  0:21 UTC (permalink / raw)
  To: target-devel; +Cc: linux-scsi, Nicholas Bellinger, Sagi Grimberg

From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch addresses a race + use after free where the first
stage of COMPARE_AND_WRITE in compare_and_write_callback()
is rescheduled after the backend sends the secondary WRITE,
resulting in second stage compare_and_write_post() callback
completing in target_complete_ok_work() before the first
can return.

Because current code depends on checking se_cmd->se_cmd_flags
after return from se_cmd->transport_complete_callback(),
this results in first stage having SCF_COMPARE_AND_WRITE_POST
set, which incorrectly falls through into second stage CAW
processing code, eventually triggering a NULL pointer
dereference due to use after free.

To address this bug, pass in a new *post_ret parameter into
se_cmd->transport_complete_callback(), and depend upon this
value instead of ->se_cmd_flags to determine when to return
or fall through into ->queue_status() code for CAW.

Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_sbc.c       | 13 +++++++++----
 drivers/target/target_core_transport.c | 14 ++++++++------
 include/target/target_core_base.h      |  2 +-
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 0b4b2a6..ae24d0f 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -371,7 +371,8 @@ sbc_setup_write_same(struct se_cmd *cmd, unsigned char *flags, struct sbc_ops *o
 	return 0;
 }
 
-static sense_reason_t xdreadwrite_callback(struct se_cmd *cmd, bool success)
+static sense_reason_t xdreadwrite_callback(struct se_cmd *cmd, bool success,
+					   int *post_ret)
 {
 	unsigned char *buf, *addr;
 	struct scatterlist *sg;
@@ -437,7 +438,8 @@ sbc_execute_rw(struct se_cmd *cmd)
 			       cmd->data_direction);
 }
 
-static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success)
+static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success,
+					     int *post_ret)
 {
 	struct se_device *dev = cmd->se_dev;
 
@@ -447,8 +449,10 @@ static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success)
 	 * sent to the backend driver.
 	 */
 	spin_lock_irq(&cmd->t_state_lock);
-	if ((cmd->transport_state & CMD_T_SENT) && !cmd->scsi_status)
+	if ((cmd->transport_state & CMD_T_SENT) && !cmd->scsi_status) {
 		cmd->se_cmd_flags |= SCF_COMPARE_AND_WRITE_POST;
+		*post_ret = 1;
+	}
 	spin_unlock_irq(&cmd->t_state_lock);
 
 	/*
@@ -460,7 +464,8 @@ static sense_reason_t compare_and_write_post(struct se_cmd *cmd, bool success)
 	return TCM_NO_SENSE;
 }
 
-static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool success)
+static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool success,
+						 int *post_ret)
 {
 	struct se_device *dev = cmd->se_dev;
 	struct scatterlist *write_sg = NULL, *sg;
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 5bacc7b..010b8c4 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1658,7 +1658,7 @@ bool target_stop_cmd(struct se_cmd *cmd, unsigned long *flags)
 void transport_generic_request_failure(struct se_cmd *cmd,
 		sense_reason_t sense_reason)
 {
-	int ret = 0;
+	int ret = 0, post_ret = 0;
 
 	pr_debug("-----[ Storage Engine Exception for cmd: %p ITT: 0x%08llx"
 		" CDB: 0x%02x\n", cmd, cmd->tag, cmd->t_task_cdb[0]);
@@ -1680,7 +1680,7 @@ void transport_generic_request_failure(struct se_cmd *cmd,
 	 */
 	if ((cmd->se_cmd_flags & SCF_COMPARE_AND_WRITE) &&
 	     cmd->transport_complete_callback)
-		cmd->transport_complete_callback(cmd, false);
+		cmd->transport_complete_callback(cmd, false, &post_ret);
 
 	switch (sense_reason) {
 	case TCM_NON_EXISTENT_LUN:
@@ -2068,11 +2068,13 @@ static void target_complete_ok_work(struct work_struct *work)
 	 */
 	if (cmd->transport_complete_callback) {
 		sense_reason_t rc;
+		bool caw = (cmd->se_cmd_flags & SCF_COMPARE_AND_WRITE);
+		bool zero_dl = !(cmd->data_length);
+		int post_ret = 0;
 
-		rc = cmd->transport_complete_callback(cmd, true);
-		if (!rc && !(cmd->se_cmd_flags & SCF_COMPARE_AND_WRITE_POST)) {
-			if ((cmd->se_cmd_flags & SCF_COMPARE_AND_WRITE) &&
-			    !cmd->data_length)
+		rc = cmd->transport_complete_callback(cmd, true, &post_ret);
+		if (!rc && !post_ret) {
+			if (caw && zero_dl)
 				goto queue_rsp;
 
 			return;
diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
index 8b9c727..61a0393 100644
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -474,7 +474,7 @@ struct se_cmd {
 	struct completion	cmd_wait_comp;
 	const struct target_core_fabric_ops *se_tfo;
 	sense_reason_t		(*execute_cmd)(struct se_cmd *);
-	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool);
+	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool, int *);
 	void			*protocol_data;
 
 	unsigned char		*t_task_cdb;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure
  2015-11-15  0:21 ` [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure Nicholas A. Bellinger
@ 2015-11-16 11:27   ` Sagi Grimberg
  0 siblings, 0 replies; 6+ messages in thread
From: Sagi Grimberg @ 2015-11-16 11:27 UTC (permalink / raw)
  To: Nicholas A. Bellinger, target-devel
  Cc: linux-scsi, Nicholas Bellinger, Sagi Grimberg



On 15/11/2015 02:21, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <nab@linux-iscsi.org>
>
> This patch addresses a case where iscsi_target_do_tx_login_io()
> fails sending the last login response PDU, after the RX/TX
> threads have already been started.
>
> The case centers around iscsi_target_rx_thread() not invoking
> allow_signal(SIGINT) before the send_sig(SIGINT, ...) occurs
> from the failure path, resulting in RX thread hanging
> indefinately on iscsi_conn->rx_login_comp.
>
> Note this bug is a regression introduced by:
>
>    commit e54198657b65625085834847ab6271087323ffea
>    Author: Nicholas Bellinger <nab@linux-iscsi.org>
>    Date:   Wed Jul 22 23:14:19 2015 -0700
>
>        iscsi-target: Fix iscsit_start_kthreads failure OOPs
>
> To address this bug, complete ->rx_login_complete for good
> measure in the failure path, and immediately return from
> RX thread context if connection state did not actually reach
> full feature phase (TARG_CONN_STATE_LOGGED_IN).

Hey Nic,

Will it make better sense to start the rx/tx threads after
the login was completed (i.e. the conn is in state
TARG_CONN_STATE_LOGGED_IN) instead?

Sagi.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free
  2015-11-15  0:21 [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Nicholas A. Bellinger
  2015-11-15  0:21 ` [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure Nicholas A. Bellinger
  2015-11-15  0:21 ` [PATCH 2/2] target: Fix race for SCF_COMPARE_AND_WRITE_POST checking Nicholas A. Bellinger
@ 2015-12-18 13:05 ` Martin Svec
  2015-12-21  7:19   ` Nicholas A. Bellinger
  2 siblings, 1 reply; 6+ messages in thread
From: Martin Svec @ 2015-12-18 13:05 UTC (permalink / raw)
  To: Nicholas A. Bellinger, target-devel; +Cc: linux-scsi, Nicholas Bellinger

Hello Nick,

should these patches be included in 4.1.x LTS series too?:

Nicholas Bellinger (2):
  iscsi-target: Fix rx_login_comp hang after login failure
  target: Fix race for SCF_COMPARE_AND_WRITE_POST checking

They are marked as #v3.10+ and #v3.12+ but I still don't see them in 4.1.15.

Thank you,

Martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free
  2015-12-18 13:05 ` [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Martin Svec
@ 2015-12-21  7:19   ` Nicholas A. Bellinger
  0 siblings, 0 replies; 6+ messages in thread
From: Nicholas A. Bellinger @ 2015-12-21  7:19 UTC (permalink / raw)
  To: Martin Svec; +Cc: Nicholas A. Bellinger, target-devel, linux-scsi

On Fri, 2015-12-18 at 14:05 +0100, Martin Svec wrote:
> Hello Nick,
> 
> should these patches be included in 4.1.x LTS series too?:
> 
> Nicholas Bellinger (2):
>   iscsi-target: Fix rx_login_comp hang after login failure
>   target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
> 
> They are marked as #v3.10+ and #v3.12+ but I still don't see them in 4.1.15.
> 

FYI, Kamal has applied both to v3.19.y + v3.13.y stable code.

Me thinks Greg-KH stable backlog is higher than usual due to holidays,
etc.  I'm sure he will be getting back to it soon.  ;)

--nab

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-12-21  7:19 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-15  0:21 [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Nicholas A. Bellinger
2015-11-15  0:21 ` [PATCH 1/2] iscsi-target: Fix rx_login_comp hang after login failure Nicholas A. Bellinger
2015-11-16 11:27   ` Sagi Grimberg
2015-11-15  0:21 ` [PATCH 2/2] target: Fix race for SCF_COMPARE_AND_WRITE_POST checking Nicholas A. Bellinger
2015-12-18 13:05 ` [PATCH 0/2] target: kthread login failure hung task + CAW use-after-free Martin Svec
2015-12-21  7:19   ` Nicholas A. Bellinger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.