All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi: target: put lun_ref at end of tmr processing
@ 2020-05-12 16:17 ` Bodo Stroesser
  0 siblings, 0 replies; 6+ messages in thread
From: Bodo Stroesser @ 2020-05-12 16:17 UTC (permalink / raw)
  To: martin.petersen, bvanassche, mchristi, linux-scsi, target-devel, bly
  Cc: Bodo Stroesser

Testing with Loopback I found, that after a Loopback LUN
has executed a TMR, I can no longer unlink the LUN.
The rm command hangs in transport_clear_lun_ref() at
wait_for_completion(&lun->lun_shutdown_comp)
The reason is, that transport_lun_remove_cmd() is not
called at the end of target_tmr_work().

It seems, that in other fabrics this call happens implicitly
when the fabric drivers call transport_generic_free_cmd()
during their ->queue_tm_rsp().

Unfortunately Loopback seems to not comply to the common way
of calling transport_generic_free_cmd() from ->queue_*().
Instead it calls transport_generic_free_cmd() from its
  ->check_stop_free() only.

But the ->check_stop_free() is called by
transport_cmd_check_stop_to_fabric() after it has reset the
se_cmd->se_lun pointer.
Therefore the following transport_generic_free_cmd() skips the
transport_lun_remove_cmd().

So this patch re-adds the transport_lun_remove_cmd() at the end
of target_tmr_work(), which was removed during commit
2c9fa49e100f962af988f1c0529231bf14905cda
"scsi: target/core: Make ABORT and LUN RESET handling synchronous"

For fabrics using transport_generic_free_cmd() in the usual way
the double call to transport_lun_remove_cmd() doesn't harm, as
transport_lun_remove_cmd() checks for this situation and does
not release lun_ref twice.

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Tested-by: Bryant G. Ly <bryangly@gmail.com>
---
 drivers/target/target_core_transport.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 594b724bbf79..264a822c0bfa 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -3350,6 +3350,7 @@ static void target_tmr_work(struct work_struct *work)
 
 	cmd->se_tfo->queue_tm_rsp(cmd);
 
+	transport_lun_remove_cmd(cmd);
 	transport_cmd_check_stop_to_fabric(cmd);
 	return;
 
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] scsi: target: put lun_ref at end of tmr processing
@ 2020-05-12 16:17 ` Bodo Stroesser
  0 siblings, 0 replies; 6+ messages in thread
From: Bodo Stroesser @ 2020-05-12 16:17 UTC (permalink / raw)
  To: martin.petersen, bvanassche, mchristi, linux-scsi, target-devel, bly
  Cc: Bodo Stroesser

Testing with Loopback I found, that after a Loopback LUN
has executed a TMR, I can no longer unlink the LUN.
The rm command hangs in transport_clear_lun_ref() at
wait_for_completion(&lun->lun_shutdown_comp)
The reason is, that transport_lun_remove_cmd() is not
called at the end of target_tmr_work().

It seems, that in other fabrics this call happens implicitly
when the fabric drivers call transport_generic_free_cmd()
during their ->queue_tm_rsp().

Unfortunately Loopback seems to not comply to the common way
of calling transport_generic_free_cmd() from ->queue_*().
Instead it calls transport_generic_free_cmd() from its
  ->check_stop_free() only.

But the ->check_stop_free() is called by
transport_cmd_check_stop_to_fabric() after it has reset the
se_cmd->se_lun pointer.
Therefore the following transport_generic_free_cmd() skips the
transport_lun_remove_cmd().

So this patch re-adds the transport_lun_remove_cmd() at the end
of target_tmr_work(), which was removed during commit
2c9fa49e100f962af988f1c0529231bf14905cda
"scsi: target/core: Make ABORT and LUN RESET handling synchronous"

For fabrics using transport_generic_free_cmd() in the usual way
the double call to transport_lun_remove_cmd() doesn't harm, as
transport_lun_remove_cmd() checks for this situation and does
not release lun_ref twice.

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Tested-by: Bryant G. Ly <bryangly@gmail.com>
---
 drivers/target/target_core_transport.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 594b724bbf79..264a822c0bfa 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -3350,6 +3350,7 @@ static void target_tmr_work(struct work_struct *work)
 
 	cmd->se_tfo->queue_tm_rsp(cmd);
 
+	transport_lun_remove_cmd(cmd);
 	transport_cmd_check_stop_to_fabric(cmd);
 	return;
 
-- 
2.12.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
  2020-05-12 16:17 ` Bodo Stroesser
@ 2020-05-12 23:05   ` Bart Van Assche
  -1 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2020-05-12 23:05 UTC (permalink / raw)
  To: Bodo Stroesser, martin.petersen, mchristi, linux-scsi, target-devel, bly

On 2020-05-12 09:17, Bodo Stroesser wrote:
> Testing with Loopback I found, that after a Loopback LUN
> has executed a TMR, I can no longer unlink the LUN.
> The rm command hangs in transport_clear_lun_ref() at
> wait_for_completion(&lun->lun_shutdown_comp)
> The reason is, that transport_lun_remove_cmd() is not
> called at the end of target_tmr_work().
> 
> It seems, that in other fabrics this call happens implicitly
> when the fabric drivers call transport_generic_free_cmd()
> during their ->queue_tm_rsp().
> 
> Unfortunately Loopback seems to not comply to the common way
> of calling transport_generic_free_cmd() from ->queue_*().
> Instead it calls transport_generic_free_cmd() from its
>   ->check_stop_free() only.
> 
> But the ->check_stop_free() is called by
> transport_cmd_check_stop_to_fabric() after it has reset the
> se_cmd->se_lun pointer.
> Therefore the following transport_generic_free_cmd() skips the
> transport_lun_remove_cmd().
> 
> So this patch re-adds the transport_lun_remove_cmd() at the end
> of target_tmr_work(), which was removed during commit
> 2c9fa49e100f962af988f1c0529231bf14905cda
> "scsi: target/core: Make ABORT and LUN RESET handling synchronous"
> 
> For fabrics using transport_generic_free_cmd() in the usual way
> the double call to transport_lun_remove_cmd() doesn't harm, as
> transport_lun_remove_cmd() checks for this situation and does
> not release lun_ref twice.
> 
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> Tested-by: Bryant G. Ly <bryangly@gmail.com>

Please add Fixes: ... and Cc: stable tags. Anyway:

Reviewed-by: Bart van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
@ 2020-05-12 23:05   ` Bart Van Assche
  0 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2020-05-12 23:05 UTC (permalink / raw)
  To: Bodo Stroesser, martin.petersen, mchristi, linux-scsi, target-devel, bly

On 2020-05-12 09:17, Bodo Stroesser wrote:
> Testing with Loopback I found, that after a Loopback LUN
> has executed a TMR, I can no longer unlink the LUN.
> The rm command hangs in transport_clear_lun_ref() at
> wait_for_completion(&lun->lun_shutdown_comp)
> The reason is, that transport_lun_remove_cmd() is not
> called at the end of target_tmr_work().
> 
> It seems, that in other fabrics this call happens implicitly
> when the fabric drivers call transport_generic_free_cmd()
> during their ->queue_tm_rsp().
> 
> Unfortunately Loopback seems to not comply to the common way
> of calling transport_generic_free_cmd() from ->queue_*().
> Instead it calls transport_generic_free_cmd() from its
>   ->check_stop_free() only.
> 
> But the ->check_stop_free() is called by
> transport_cmd_check_stop_to_fabric() after it has reset the
> se_cmd->se_lun pointer.
> Therefore the following transport_generic_free_cmd() skips the
> transport_lun_remove_cmd().
> 
> So this patch re-adds the transport_lun_remove_cmd() at the end
> of target_tmr_work(), which was removed during commit
> 2c9fa49e100f962af988f1c0529231bf14905cda
> "scsi: target/core: Make ABORT and LUN RESET handling synchronous"
> 
> For fabrics using transport_generic_free_cmd() in the usual way
> the double call to transport_lun_remove_cmd() doesn't harm, as
> transport_lun_remove_cmd() checks for this situation and does
> not release lun_ref twice.
> 
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> Tested-by: Bryant G. Ly <bryangly@gmail.com>

Please add Fixes: ... and Cc: stable tags. Anyway:

Reviewed-by: Bart van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
  2020-05-12 16:17 ` Bodo Stroesser
  (?)
  (?)
@ 2020-05-20 15:50 ` Pavel Zakharov
  -1 siblings, 0 replies; 6+ messages in thread
From: Pavel Zakharov @ 2020-05-20 15:50 UTC (permalink / raw)
  To: target-devel

Hopefully I didn’t mess-up the recipients/Subject list as I just subscribed to target-devel and so following-up to an email I found in the archive.

Bodo, I happened to have run into the same issue and was just about to submit a similar patch when I randomly stumbled on yours.
I think we can improve on that patch by moving the call to transport_lun_remove_cmd() into transport_cmd_check_stop_to_fabric().

Here’s the patch I was about to propose. Let me know what you think.
Note that we could also do this simplification as a separate patch later if you think it’s better.

---
When ABORT and LUN RESET were made synchronous, a call to
transport_lun_remove_cmd() was acidentally removed. This caused a lun_ref
to be leaked every time an ABORT or LUN RESET was processed in
target_tmr_work().

Given that transport_lun_remove_cmd() should always be called before
transport_cmd_check_stop_to_fabric(), move that call into this function.
This ensures that the lun_ref is released before setting .se_lun to NULL.

Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
---
 drivers/target/target_core_transport.c | 36 ++++++++++++--------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 594b724bbf79..ec578822709a 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -653,6 +653,17 @@ static void target_remove_from_state_list(struct se_cmd *cmd)
 	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
 }

+static void transport_lun_remove_cmd(struct se_cmd *cmd)
+{
+	struct se_lun *lun = cmd->se_lun;
+
+	if (!lun)
+		return;
+
+	if (cmpxchg(&cmd->lun_ref_active, true, false))
+		percpu_ref_put(&lun->lun_ref);
+}
+
 /*
  * This function is called by the target core after the target core has
  * finished processing a SCSI command or SCSI TMF. Both the regular command
@@ -664,6 +675,11 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
 {
 	unsigned long flags;

+	/*
+	 * Remove cmd ref from lun if needed before clearing the se_lun.
+	 */
+	transport_lun_remove_cmd(cmd);
+
 	target_remove_from_state_list(cmd);

 	/*
@@ -698,17 +714,6 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
 	return cmd->se_tfo->check_stop_free(cmd);
 }

-static void transport_lun_remove_cmd(struct se_cmd *cmd)
-{
-	struct se_lun *lun = cmd->se_lun;
-
-	if (!lun)
-		return;
-
-	if (cmpxchg(&cmd->lun_ref_active, true, false))
-		percpu_ref_put(&lun->lun_ref);
-}
-
 static void target_complete_failure_work(struct work_struct *work)
 {
 	struct se_cmd *cmd = container_of(work, struct se_cmd, work);
@@ -799,8 +804,6 @@ static void target_handle_abort(struct se_cmd *cmd)

 	WARN_ON_ONCE(kref_read(&cmd->cmd_kref) == 0);

-	transport_lun_remove_cmd(cmd);
-
 	transport_cmd_check_stop_to_fabric(cmd);
 }

@@ -1726,7 +1729,6 @@ static void target_complete_tmr_failure(struct work_struct *work)
 	se_cmd->se_tmr_req->response = TMR_LUN_DOES_NOT_EXIST;
 	se_cmd->se_tfo->queue_tm_rsp(se_cmd);

-	transport_lun_remove_cmd(se_cmd);
 	transport_cmd_check_stop_to_fabric(se_cmd);
 }

@@ -1918,7 +1920,6 @@ void transport_generic_request_failure(struct se_cmd *cmd,
 		goto queue_full;

 check_stop:
-	transport_lun_remove_cmd(cmd);
 	transport_cmd_check_stop_to_fabric(cmd);
 	return;

@@ -2216,7 +2217,6 @@ static void transport_complete_qf(struct se_cmd *cmd)
 		transport_handle_queue_full(cmd, cmd->se_dev, ret, false);
 		return;
 	}
-	transport_lun_remove_cmd(cmd);
 	transport_cmd_check_stop_to_fabric(cmd);
 }

@@ -2311,7 +2311,6 @@ static void target_complete_ok_work(struct work_struct *work)
 		if (ret)
 			goto queue_full;

-		transport_lun_remove_cmd(cmd);
 		transport_cmd_check_stop_to_fabric(cmd);
 		return;
 	}
@@ -2337,7 +2336,6 @@ static void target_complete_ok_work(struct work_struct *work)
 			if (ret)
 				goto queue_full;

-			transport_lun_remove_cmd(cmd);
 			transport_cmd_check_stop_to_fabric(cmd);
 			return;
 		}
@@ -2373,7 +2371,6 @@ static void target_complete_ok_work(struct work_struct *work)
 			if (ret)
 				goto queue_full;

-			transport_lun_remove_cmd(cmd);
 			transport_cmd_check_stop_to_fabric(cmd);
 			return;
 		}
@@ -2409,7 +2406,6 @@ static void target_complete_ok_work(struct work_struct *work)
 		break;
 	}

-	transport_lun_remove_cmd(cmd);
 	transport_cmd_check_stop_to_fabric(cmd);
 	return;

--
2.21.1 (Apple Git-122.3)

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
  2020-05-12 16:17 ` Bodo Stroesser
                   ` (2 preceding siblings ...)
  (?)
@ 2020-05-20 17:14 ` Bodo Stroesser
  -1 siblings, 0 replies; 6+ messages in thread
From: Bodo Stroesser @ 2020-05-20 17:14 UTC (permalink / raw)
  To: target-devel

On 05/20/20 17:50, Pavel Zakharov wrote:
> Hopefully I didn’t mess-up the recipients/Subject list as I just subscribed to target-devel and so following-up to an email I found in the archive.
> 
> Bodo, I happened to have run into the same issue and was just about to submit a similar patch when I randomly stumbled on yours.
> I think we can improve on that patch by moving the call to transport_lun_remove_cmd() into transport_cmd_check_stop_to_fabric().

Yeah, I also thought about doing this. But then decided to better
add the one lost line for fixing only.

> 
> Here’s the patch I was about to propose. Let me know what you think.
> Note that we could also do this simplification as a separate patch later if you think it’s better.

I think it is a good idea to do this simplification in a separate patch.


> 
> ---
> When ABORT and LUN RESET were made synchronous, a call to
> transport_lun_remove_cmd() was acidentally removed. This caused a lun_ref
> to be leaked every time an ABORT or LUN RESET was processed in
> target_tmr_work().
> 
> Given that transport_lun_remove_cmd() should always be called before
> transport_cmd_check_stop_to_fabric(), move that call into this function.
> This ensures that the lun_ref is released before setting .se_lun to NULL.
> 
> Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
> ---
>   drivers/target/target_core_transport.c | 36 ++++++++++++--------------
>   1 file changed, 16 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 594b724bbf79..ec578822709a 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -653,6 +653,17 @@ static void target_remove_from_state_list(struct se_cmd *cmd)
>   	spin_unlock_irqrestore(&dev->execute_task_lock, flags);
>   }
> 
> +static void transport_lun_remove_cmd(struct se_cmd *cmd)
> +{
> +	struct se_lun *lun = cmd->se_lun;
> +
> +	if (!lun)
> +		return;
> +
> +	if (cmpxchg(&cmd->lun_ref_active, true, false))
> +		percpu_ref_put(&lun->lun_ref);
> +}
> +
>   /*
>    * This function is called by the target core after the target core has
>    * finished processing a SCSI command or SCSI TMF. Both the regular command
> @@ -664,6 +675,11 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
>   {
>   	unsigned long flags;
> 
> +	/*
> +	 * Remove cmd ref from lun if needed before clearing the se_lun.
> +	 */
> +	transport_lun_remove_cmd(cmd);
> +
>   	target_remove_from_state_list(cmd);
> 
>   	/*
> @@ -698,17 +714,6 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
>   	return cmd->se_tfo->check_stop_free(cmd);
>   }
> 
> -static void transport_lun_remove_cmd(struct se_cmd *cmd)
> -{
> -	struct se_lun *lun = cmd->se_lun;
> -
> -	if (!lun)
> -		return;
> -
> -	if (cmpxchg(&cmd->lun_ref_active, true, false))
> -		percpu_ref_put(&lun->lun_ref);
> -}
> -
>   static void target_complete_failure_work(struct work_struct *work)
>   {
>   	struct se_cmd *cmd = container_of(work, struct se_cmd, work);
> @@ -799,8 +804,6 @@ static void target_handle_abort(struct se_cmd *cmd)
> 
>   	WARN_ON_ONCE(kref_read(&cmd->cmd_kref) == 0);
> 
> -	transport_lun_remove_cmd(cmd);
> -
>   	transport_cmd_check_stop_to_fabric(cmd);
>   }
> 
> @@ -1726,7 +1729,6 @@ static void target_complete_tmr_failure(struct work_struct *work)
>   	se_cmd->se_tmr_req->response = TMR_LUN_DOES_NOT_EXIST;
>   	se_cmd->se_tfo->queue_tm_rsp(se_cmd);
> 
> -	transport_lun_remove_cmd(se_cmd);
>   	transport_cmd_check_stop_to_fabric(se_cmd);
>   }
> 
> @@ -1918,7 +1920,6 @@ void transport_generic_request_failure(struct se_cmd *cmd,
>   		goto queue_full;
> 
>   check_stop:
> -	transport_lun_remove_cmd(cmd);
>   	transport_cmd_check_stop_to_fabric(cmd);
>   	return;
> 
> @@ -2216,7 +2217,6 @@ static void transport_complete_qf(struct se_cmd *cmd)
>   		transport_handle_queue_full(cmd, cmd->se_dev, ret, false);
>   		return;
>   	}
> -	transport_lun_remove_cmd(cmd);
>   	transport_cmd_check_stop_to_fabric(cmd);
>   }
> 
> @@ -2311,7 +2311,6 @@ static void target_complete_ok_work(struct work_struct *work)
>   		if (ret)
>   			goto queue_full;
> 
> -		transport_lun_remove_cmd(cmd);
>   		transport_cmd_check_stop_to_fabric(cmd);
>   		return;
>   	}
> @@ -2337,7 +2336,6 @@ static void target_complete_ok_work(struct work_struct *work)
>   			if (ret)
>   				goto queue_full;
> 
> -			transport_lun_remove_cmd(cmd);
>   			transport_cmd_check_stop_to_fabric(cmd);
>   			return;
>   		}
> @@ -2373,7 +2371,6 @@ static void target_complete_ok_work(struct work_struct *work)
>   			if (ret)
>   				goto queue_full;
> 
> -			transport_lun_remove_cmd(cmd);
>   			transport_cmd_check_stop_to_fabric(cmd);
>   			return;
>   		}
> @@ -2409,7 +2406,6 @@ static void target_complete_ok_work(struct work_struct *work)
>   		break;
>   	}
> 
> -	transport_lun_remove_cmd(cmd);
>   	transport_cmd_check_stop_to_fabric(cmd);
>   	return;
> 
> --
> 2.21.1 (Apple Git-122.3)
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-05-20 17:14 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-12 16:17 [PATCH] scsi: target: put lun_ref at end of tmr processing Bodo Stroesser
2020-05-12 16:17 ` Bodo Stroesser
2020-05-12 23:05 ` Bart Van Assche
2020-05-12 23:05   ` Bart Van Assche
2020-05-20 15:50 ` Pavel Zakharov
2020-05-20 17:14 ` Bodo Stroesser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.