* [PATCH] scsi: target: put lun_ref at end of tmr processing
@ 2020-05-12 16:17 ` Bodo Stroesser
0 siblings, 0 replies; 6+ messages in thread
From: Bodo Stroesser @ 2020-05-12 16:17 UTC (permalink / raw)
To: martin.petersen, bvanassche, mchristi, linux-scsi, target-devel, bly
Cc: Bodo Stroesser
Testing with Loopback I found, that after a Loopback LUN
has executed a TMR, I can no longer unlink the LUN.
The rm command hangs in transport_clear_lun_ref() at
wait_for_completion(&lun->lun_shutdown_comp)
The reason is, that transport_lun_remove_cmd() is not
called at the end of target_tmr_work().
It seems, that in other fabrics this call happens implicitly
when the fabric drivers call transport_generic_free_cmd()
during their ->queue_tm_rsp().
Unfortunately Loopback seems to not comply to the common way
of calling transport_generic_free_cmd() from ->queue_*().
Instead it calls transport_generic_free_cmd() from its
->check_stop_free() only.
But the ->check_stop_free() is called by
transport_cmd_check_stop_to_fabric() after it has reset the
se_cmd->se_lun pointer.
Therefore the following transport_generic_free_cmd() skips the
transport_lun_remove_cmd().
So this patch re-adds the transport_lun_remove_cmd() at the end
of target_tmr_work(), which was removed during commit
2c9fa49e100f962af988f1c0529231bf14905cda
"scsi: target/core: Make ABORT and LUN RESET handling synchronous"
For fabrics using transport_generic_free_cmd() in the usual way
the double call to transport_lun_remove_cmd() doesn't harm, as
transport_lun_remove_cmd() checks for this situation and does
not release lun_ref twice.
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Tested-by: Bryant G. Ly <bryangly@gmail.com>
---
drivers/target/target_core_transport.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 594b724bbf79..264a822c0bfa 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -3350,6 +3350,7 @@ static void target_tmr_work(struct work_struct *work)
cmd->se_tfo->queue_tm_rsp(cmd);
+ transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
--
2.12.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH] scsi: target: put lun_ref at end of tmr processing
@ 2020-05-12 16:17 ` Bodo Stroesser
0 siblings, 0 replies; 6+ messages in thread
From: Bodo Stroesser @ 2020-05-12 16:17 UTC (permalink / raw)
To: martin.petersen, bvanassche, mchristi, linux-scsi, target-devel, bly
Cc: Bodo Stroesser
Testing with Loopback I found, that after a Loopback LUN
has executed a TMR, I can no longer unlink the LUN.
The rm command hangs in transport_clear_lun_ref() at
wait_for_completion(&lun->lun_shutdown_comp)
The reason is, that transport_lun_remove_cmd() is not
called at the end of target_tmr_work().
It seems, that in other fabrics this call happens implicitly
when the fabric drivers call transport_generic_free_cmd()
during their ->queue_tm_rsp().
Unfortunately Loopback seems to not comply to the common way
of calling transport_generic_free_cmd() from ->queue_*().
Instead it calls transport_generic_free_cmd() from its
->check_stop_free() only.
But the ->check_stop_free() is called by
transport_cmd_check_stop_to_fabric() after it has reset the
se_cmd->se_lun pointer.
Therefore the following transport_generic_free_cmd() skips the
transport_lun_remove_cmd().
So this patch re-adds the transport_lun_remove_cmd() at the end
of target_tmr_work(), which was removed during commit
2c9fa49e100f962af988f1c0529231bf14905cda
"scsi: target/core: Make ABORT and LUN RESET handling synchronous"
For fabrics using transport_generic_free_cmd() in the usual way
the double call to transport_lun_remove_cmd() doesn't harm, as
transport_lun_remove_cmd() checks for this situation and does
not release lun_ref twice.
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Tested-by: Bryant G. Ly <bryangly@gmail.com>
---
drivers/target/target_core_transport.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 594b724bbf79..264a822c0bfa 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -3350,6 +3350,7 @@ static void target_tmr_work(struct work_struct *work)
cmd->se_tfo->queue_tm_rsp(cmd);
+ transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
--
2.12.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
2020-05-12 16:17 ` Bodo Stroesser
@ 2020-05-12 23:05 ` Bart Van Assche
-1 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2020-05-12 23:05 UTC (permalink / raw)
To: Bodo Stroesser, martin.petersen, mchristi, linux-scsi, target-devel, bly
On 2020-05-12 09:17, Bodo Stroesser wrote:
> Testing with Loopback I found, that after a Loopback LUN
> has executed a TMR, I can no longer unlink the LUN.
> The rm command hangs in transport_clear_lun_ref() at
> wait_for_completion(&lun->lun_shutdown_comp)
> The reason is, that transport_lun_remove_cmd() is not
> called at the end of target_tmr_work().
>
> It seems, that in other fabrics this call happens implicitly
> when the fabric drivers call transport_generic_free_cmd()
> during their ->queue_tm_rsp().
>
> Unfortunately Loopback seems to not comply to the common way
> of calling transport_generic_free_cmd() from ->queue_*().
> Instead it calls transport_generic_free_cmd() from its
> ->check_stop_free() only.
>
> But the ->check_stop_free() is called by
> transport_cmd_check_stop_to_fabric() after it has reset the
> se_cmd->se_lun pointer.
> Therefore the following transport_generic_free_cmd() skips the
> transport_lun_remove_cmd().
>
> So this patch re-adds the transport_lun_remove_cmd() at the end
> of target_tmr_work(), which was removed during commit
> 2c9fa49e100f962af988f1c0529231bf14905cda
> "scsi: target/core: Make ABORT and LUN RESET handling synchronous"
>
> For fabrics using transport_generic_free_cmd() in the usual way
> the double call to transport_lun_remove_cmd() doesn't harm, as
> transport_lun_remove_cmd() checks for this situation and does
> not release lun_ref twice.
>
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> Tested-by: Bryant G. Ly <bryangly@gmail.com>
Please add Fixes: ... and Cc: stable tags. Anyway:
Reviewed-by: Bart van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
@ 2020-05-12 23:05 ` Bart Van Assche
0 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2020-05-12 23:05 UTC (permalink / raw)
To: Bodo Stroesser, martin.petersen, mchristi, linux-scsi, target-devel, bly
On 2020-05-12 09:17, Bodo Stroesser wrote:
> Testing with Loopback I found, that after a Loopback LUN
> has executed a TMR, I can no longer unlink the LUN.
> The rm command hangs in transport_clear_lun_ref() at
> wait_for_completion(&lun->lun_shutdown_comp)
> The reason is, that transport_lun_remove_cmd() is not
> called at the end of target_tmr_work().
>
> It seems, that in other fabrics this call happens implicitly
> when the fabric drivers call transport_generic_free_cmd()
> during their ->queue_tm_rsp().
>
> Unfortunately Loopback seems to not comply to the common way
> of calling transport_generic_free_cmd() from ->queue_*().
> Instead it calls transport_generic_free_cmd() from its
> ->check_stop_free() only.
>
> But the ->check_stop_free() is called by
> transport_cmd_check_stop_to_fabric() after it has reset the
> se_cmd->se_lun pointer.
> Therefore the following transport_generic_free_cmd() skips the
> transport_lun_remove_cmd().
>
> So this patch re-adds the transport_lun_remove_cmd() at the end
> of target_tmr_work(), which was removed during commit
> 2c9fa49e100f962af988f1c0529231bf14905cda
> "scsi: target/core: Make ABORT and LUN RESET handling synchronous"
>
> For fabrics using transport_generic_free_cmd() in the usual way
> the double call to transport_lun_remove_cmd() doesn't harm, as
> transport_lun_remove_cmd() checks for this situation and does
> not release lun_ref twice.
>
> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
> Tested-by: Bryant G. Ly <bryangly@gmail.com>
Please add Fixes: ... and Cc: stable tags. Anyway:
Reviewed-by: Bart van Assche <bvanassche@acm.org>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
2020-05-12 16:17 ` Bodo Stroesser
(?)
(?)
@ 2020-05-20 15:50 ` Pavel Zakharov
-1 siblings, 0 replies; 6+ messages in thread
From: Pavel Zakharov @ 2020-05-20 15:50 UTC (permalink / raw)
To: target-devel
Hopefully I didn’t mess-up the recipients/Subject list as I just subscribed to target-devel and so following-up to an email I found in the archive.
Bodo, I happened to have run into the same issue and was just about to submit a similar patch when I randomly stumbled on yours.
I think we can improve on that patch by moving the call to transport_lun_remove_cmd() into transport_cmd_check_stop_to_fabric().
Here’s the patch I was about to propose. Let me know what you think.
Note that we could also do this simplification as a separate patch later if you think it’s better.
---
When ABORT and LUN RESET were made synchronous, a call to
transport_lun_remove_cmd() was acidentally removed. This caused a lun_ref
to be leaked every time an ABORT or LUN RESET was processed in
target_tmr_work().
Given that transport_lun_remove_cmd() should always be called before
transport_cmd_check_stop_to_fabric(), move that call into this function.
This ensures that the lun_ref is released before setting .se_lun to NULL.
Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
---
drivers/target/target_core_transport.c | 36 ++++++++++++--------------
1 file changed, 16 insertions(+), 20 deletions(-)
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 594b724bbf79..ec578822709a 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -653,6 +653,17 @@ static void target_remove_from_state_list(struct se_cmd *cmd)
spin_unlock_irqrestore(&dev->execute_task_lock, flags);
}
+static void transport_lun_remove_cmd(struct se_cmd *cmd)
+{
+ struct se_lun *lun = cmd->se_lun;
+
+ if (!lun)
+ return;
+
+ if (cmpxchg(&cmd->lun_ref_active, true, false))
+ percpu_ref_put(&lun->lun_ref);
+}
+
/*
* This function is called by the target core after the target core has
* finished processing a SCSI command or SCSI TMF. Both the regular command
@@ -664,6 +675,11 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
{
unsigned long flags;
+ /*
+ * Remove cmd ref from lun if needed before clearing the se_lun.
+ */
+ transport_lun_remove_cmd(cmd);
+
target_remove_from_state_list(cmd);
/*
@@ -698,17 +714,6 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
return cmd->se_tfo->check_stop_free(cmd);
}
-static void transport_lun_remove_cmd(struct se_cmd *cmd)
-{
- struct se_lun *lun = cmd->se_lun;
-
- if (!lun)
- return;
-
- if (cmpxchg(&cmd->lun_ref_active, true, false))
- percpu_ref_put(&lun->lun_ref);
-}
-
static void target_complete_failure_work(struct work_struct *work)
{
struct se_cmd *cmd = container_of(work, struct se_cmd, work);
@@ -799,8 +804,6 @@ static void target_handle_abort(struct se_cmd *cmd)
WARN_ON_ONCE(kref_read(&cmd->cmd_kref) == 0);
- transport_lun_remove_cmd(cmd);
-
transport_cmd_check_stop_to_fabric(cmd);
}
@@ -1726,7 +1729,6 @@ static void target_complete_tmr_failure(struct work_struct *work)
se_cmd->se_tmr_req->response = TMR_LUN_DOES_NOT_EXIST;
se_cmd->se_tfo->queue_tm_rsp(se_cmd);
- transport_lun_remove_cmd(se_cmd);
transport_cmd_check_stop_to_fabric(se_cmd);
}
@@ -1918,7 +1920,6 @@ void transport_generic_request_failure(struct se_cmd *cmd,
goto queue_full;
check_stop:
- transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
@@ -2216,7 +2217,6 @@ static void transport_complete_qf(struct se_cmd *cmd)
transport_handle_queue_full(cmd, cmd->se_dev, ret, false);
return;
}
- transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
}
@@ -2311,7 +2311,6 @@ static void target_complete_ok_work(struct work_struct *work)
if (ret)
goto queue_full;
- transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
}
@@ -2337,7 +2336,6 @@ static void target_complete_ok_work(struct work_struct *work)
if (ret)
goto queue_full;
- transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
}
@@ -2373,7 +2371,6 @@ static void target_complete_ok_work(struct work_struct *work)
if (ret)
goto queue_full;
- transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
}
@@ -2409,7 +2406,6 @@ static void target_complete_ok_work(struct work_struct *work)
break;
}
- transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
--
2.21.1 (Apple Git-122.3)
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] scsi: target: put lun_ref at end of tmr processing
2020-05-12 16:17 ` Bodo Stroesser
` (2 preceding siblings ...)
(?)
@ 2020-05-20 17:14 ` Bodo Stroesser
-1 siblings, 0 replies; 6+ messages in thread
From: Bodo Stroesser @ 2020-05-20 17:14 UTC (permalink / raw)
To: target-devel
On 05/20/20 17:50, Pavel Zakharov wrote:
> Hopefully I didn’t mess-up the recipients/Subject list as I just subscribed to target-devel and so following-up to an email I found in the archive.
>
> Bodo, I happened to have run into the same issue and was just about to submit a similar patch when I randomly stumbled on yours.
> I think we can improve on that patch by moving the call to transport_lun_remove_cmd() into transport_cmd_check_stop_to_fabric().
Yeah, I also thought about doing this. But then decided to better
add the one lost line for fixing only.
>
> Here’s the patch I was about to propose. Let me know what you think.
> Note that we could also do this simplification as a separate patch later if you think it’s better.
I think it is a good idea to do this simplification in a separate patch.
>
> ---
> When ABORT and LUN RESET were made synchronous, a call to
> transport_lun_remove_cmd() was acidentally removed. This caused a lun_ref
> to be leaked every time an ABORT or LUN RESET was processed in
> target_tmr_work().
>
> Given that transport_lun_remove_cmd() should always be called before
> transport_cmd_check_stop_to_fabric(), move that call into this function.
> This ensures that the lun_ref is released before setting .se_lun to NULL.
>
> Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
> ---
> drivers/target/target_core_transport.c | 36 ++++++++++++--------------
> 1 file changed, 16 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 594b724bbf79..ec578822709a 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -653,6 +653,17 @@ static void target_remove_from_state_list(struct se_cmd *cmd)
> spin_unlock_irqrestore(&dev->execute_task_lock, flags);
> }
>
> +static void transport_lun_remove_cmd(struct se_cmd *cmd)
> +{
> + struct se_lun *lun = cmd->se_lun;
> +
> + if (!lun)
> + return;
> +
> + if (cmpxchg(&cmd->lun_ref_active, true, false))
> + percpu_ref_put(&lun->lun_ref);
> +}
> +
> /*
> * This function is called by the target core after the target core has
> * finished processing a SCSI command or SCSI TMF. Both the regular command
> @@ -664,6 +675,11 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
> {
> unsigned long flags;
>
> + /*
> + * Remove cmd ref from lun if needed before clearing the se_lun.
> + */
> + transport_lun_remove_cmd(cmd);
> +
> target_remove_from_state_list(cmd);
>
> /*
> @@ -698,17 +714,6 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
> return cmd->se_tfo->check_stop_free(cmd);
> }
>
> -static void transport_lun_remove_cmd(struct se_cmd *cmd)
> -{
> - struct se_lun *lun = cmd->se_lun;
> -
> - if (!lun)
> - return;
> -
> - if (cmpxchg(&cmd->lun_ref_active, true, false))
> - percpu_ref_put(&lun->lun_ref);
> -}
> -
> static void target_complete_failure_work(struct work_struct *work)
> {
> struct se_cmd *cmd = container_of(work, struct se_cmd, work);
> @@ -799,8 +804,6 @@ static void target_handle_abort(struct se_cmd *cmd)
>
> WARN_ON_ONCE(kref_read(&cmd->cmd_kref) == 0);
>
> - transport_lun_remove_cmd(cmd);
> -
> transport_cmd_check_stop_to_fabric(cmd);
> }
>
> @@ -1726,7 +1729,6 @@ static void target_complete_tmr_failure(struct work_struct *work)
> se_cmd->se_tmr_req->response = TMR_LUN_DOES_NOT_EXIST;
> se_cmd->se_tfo->queue_tm_rsp(se_cmd);
>
> - transport_lun_remove_cmd(se_cmd);
> transport_cmd_check_stop_to_fabric(se_cmd);
> }
>
> @@ -1918,7 +1920,6 @@ void transport_generic_request_failure(struct se_cmd *cmd,
> goto queue_full;
>
> check_stop:
> - transport_lun_remove_cmd(cmd);
> transport_cmd_check_stop_to_fabric(cmd);
> return;
>
> @@ -2216,7 +2217,6 @@ static void transport_complete_qf(struct se_cmd *cmd)
> transport_handle_queue_full(cmd, cmd->se_dev, ret, false);
> return;
> }
> - transport_lun_remove_cmd(cmd);
> transport_cmd_check_stop_to_fabric(cmd);
> }
>
> @@ -2311,7 +2311,6 @@ static void target_complete_ok_work(struct work_struct *work)
> if (ret)
> goto queue_full;
>
> - transport_lun_remove_cmd(cmd);
> transport_cmd_check_stop_to_fabric(cmd);
> return;
> }
> @@ -2337,7 +2336,6 @@ static void target_complete_ok_work(struct work_struct *work)
> if (ret)
> goto queue_full;
>
> - transport_lun_remove_cmd(cmd);
> transport_cmd_check_stop_to_fabric(cmd);
> return;
> }
> @@ -2373,7 +2371,6 @@ static void target_complete_ok_work(struct work_struct *work)
> if (ret)
> goto queue_full;
>
> - transport_lun_remove_cmd(cmd);
> transport_cmd_check_stop_to_fabric(cmd);
> return;
> }
> @@ -2409,7 +2406,6 @@ static void target_complete_ok_work(struct work_struct *work)
> break;
> }
>
> - transport_lun_remove_cmd(cmd);
> transport_cmd_check_stop_to_fabric(cmd);
> return;
>
> --
> 2.21.1 (Apple Git-122.3)
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-05-20 17:14 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-12 16:17 [PATCH] scsi: target: put lun_ref at end of tmr processing Bodo Stroesser
2020-05-12 16:17 ` Bodo Stroesser
2020-05-12 23:05 ` Bart Van Assche
2020-05-12 23:05 ` Bart Van Assche
2020-05-20 15:50 ` Pavel Zakharov
2020-05-20 17:14 ` Bodo Stroesser
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.