All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Christie <michael.christie@oracle.com>
To: Maurizio Lombardi <mlombard@redhat.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	bvanassche@acm.org, m.lombardi85@gmail.com
Subject: Re: [PATCH 2/2] target: iscsi: fix a race condition when aborting a task
Date: Tue, 10 Nov 2020 23:08:04 +0000	[thread overview]
Message-ID: <4e336e71-2739-186d-1f8a-5a2f406aebdb@oracle.com> (raw)
In-Reply-To: <840cb2fe-5642-78d0-e700-d3652021cb5d@redhat.com>

On 11/10/20 3:29 PM, Maurizio Lombardi wrote:
> 
> 
> Dne 28. 10. 20 v 21:37 Mike Christie napsal(a):
>>>
>>> Possible solutions that I can think of:
>>>
>>> - Make iscsit_release_commands_from_conn() wait for the abort task to finish
>>
>> Yeah you could set a completion in there then have aborted_task do the complete() call maybe?
>>
> 
> We could do something like this, what do you think?
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 067074ef50818..ffd3dbc53a42f 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -490,13 +490,16 @@ EXPORT_SYMBOL(iscsit_queue_rsp);
>   
>   void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>   {
> +	struct se_cmd *se_cmd = cmd->se_cmd.se_tfo ? &cmd->se_cmd : NULL;
> +
>   	spin_lock_bh(&conn->cmd_lock);
> -	if (!list_empty(&cmd->i_conn_node) &&
> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
> +	if (!list_empty(&cmd->i_conn_node))
>   		list_del_init(&cmd->i_conn_node);
>   	spin_unlock_bh(&conn->cmd_lock);
>   
>   	__iscsit_free_cmd(cmd, true);
> +	if (se_cmd && se_cmd->abrt_task_compl)
> +		complete(se_cmd->abrt_task_compl);
>   }
>   EXPORT_SYMBOL(iscsit_aborted_task);
>   
> @@ -4080,6 +4083,7 @@ int iscsi_target_rx_thread(void *arg)
>   
>   static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>   {
> +	DECLARE_COMPLETION_ONSTACK(compl);
>   	LIST_HEAD(tmp_list);
>   	struct iscsi_cmd *cmd = NULL, *cmd_tmp = NULL;
>   	struct iscsi_session *sess = conn->sess;
> @@ -4096,8 +4100,24 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>   
>   		if (se_cmd->se_tfo != NULL) {
>   			spin_lock_irq(&se_cmd->t_state_lock);
> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
> +				/*
> +				 * LIO's abort path owns the cleanup for this,
> +				 * so put it back on the list and let
> +				 * aborted_task handle it.
> +				 */
> +				list_move_tail(&cmd->i_conn_node, &conn->conn_cmd_list);
> +				WARN_ON_ONCE(se_cmd->abrt_task_compl);
> +				se_cmd->abrt_task_compl = &compl;
> +			}
>   			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>   			spin_unlock_irq(&se_cmd->t_state_lock);
> +
> +			if (se_cmd->abrt_task_compl) {
> +				spin_unlock_bh(&conn->cmd_lock);
> +				wait_for_completion(&compl);
> +				spin_lock_bh(&conn->cmd_lock);

You can still hit your freed conn race I think. The aborted_task callout 
is not the last time we are referencing the iscsi cmd and conn. That 
code path still has to do the release_cmd callout for example. Once we 
get past this wait_for_completion the abort code path could be still 
running. If this iscsit_release_commands_from_conn completes first then 
we can hit your case where we free the conn before the abort code path 
has done release_cmd and we could hit that cmd->conn->sess reference.

I think if you do the complete in iscsit_release_cmd then we would not 
hit that issue.

There might be a second issue though. What happens if aborted_task ran 
first and deleted the cmd from the conn_cmd_list. It would then be 
running while iscsit_release_commands_from_conn is running. We would 
then not do the wait_for_completion above.


> +			}
>   		}
>   	}
>   	spin_unlock_bh(&conn->cmd_lock);
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index db53a0d649da7..5611e6c00f18c 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1391,6 +1391,7 @@ void transport_init_se_cmd(
>   	init_completion(&cmd->t_transport_stop_comp);
>   	cmd->free_compl = NULL;
>   	cmd->abrt_compl = NULL;
> +	cmd->abrt_task_compl = NULL;
>   	spin_lock_init(&cmd->t_state_lock);
>   	INIT_WORK(&cmd->work, NULL);
>   	kref_init(&cmd->cmd_kref);
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index 549947d407cfd..25cc451930281 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -491,6 +491,7 @@ struct se_cmd {
>   	struct list_head	se_cmd_list;
>   	struct completion	*free_compl;
>   	struct completion	*abrt_compl;
> +	struct completion	*abrt_task_compl;

This should be on the iscsi cmd since only iscsi uses it.

>   	const struct target_core_fabric_ops *se_tfo;
>   	sense_reason_t		(*execute_cmd)(struct se_cmd *);
>   	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool, int *);
> 

WARNING: multiple messages have this Message-ID (diff)
From: Mike Christie <michael.christie@oracle.com>
To: Maurizio Lombardi <mlombard@redhat.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	bvanassche@acm.org, m.lombardi85@gmail.com
Subject: Re: [PATCH 2/2] target: iscsi: fix a race condition when aborting a task
Date: Tue, 10 Nov 2020 17:08:04 -0600	[thread overview]
Message-ID: <4e336e71-2739-186d-1f8a-5a2f406aebdb@oracle.com> (raw)
In-Reply-To: <840cb2fe-5642-78d0-e700-d3652021cb5d@redhat.com>

On 11/10/20 3:29 PM, Maurizio Lombardi wrote:
> 
> 
> Dne 28. 10. 20 v 21:37 Mike Christie napsal(a):
>>>
>>> Possible solutions that I can think of:
>>>
>>> - Make iscsit_release_commands_from_conn() wait for the abort task to finish
>>
>> Yeah you could set a completion in there then have aborted_task do the complete() call maybe?
>>
> 
> We could do something like this, what do you think?
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 067074ef50818..ffd3dbc53a42f 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -490,13 +490,16 @@ EXPORT_SYMBOL(iscsit_queue_rsp);
>   
>   void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>   {
> +	struct se_cmd *se_cmd = cmd->se_cmd.se_tfo ? &cmd->se_cmd : NULL;
> +
>   	spin_lock_bh(&conn->cmd_lock);
> -	if (!list_empty(&cmd->i_conn_node) &&
> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
> +	if (!list_empty(&cmd->i_conn_node))
>   		list_del_init(&cmd->i_conn_node);
>   	spin_unlock_bh(&conn->cmd_lock);
>   
>   	__iscsit_free_cmd(cmd, true);
> +	if (se_cmd && se_cmd->abrt_task_compl)
> +		complete(se_cmd->abrt_task_compl);
>   }
>   EXPORT_SYMBOL(iscsit_aborted_task);
>   
> @@ -4080,6 +4083,7 @@ int iscsi_target_rx_thread(void *arg)
>   
>   static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>   {
> +	DECLARE_COMPLETION_ONSTACK(compl);
>   	LIST_HEAD(tmp_list);
>   	struct iscsi_cmd *cmd = NULL, *cmd_tmp = NULL;
>   	struct iscsi_session *sess = conn->sess;
> @@ -4096,8 +4100,24 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>   
>   		if (se_cmd->se_tfo != NULL) {
>   			spin_lock_irq(&se_cmd->t_state_lock);
> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
> +				/*
> +				 * LIO's abort path owns the cleanup for this,
> +				 * so put it back on the list and let
> +				 * aborted_task handle it.
> +				 */
> +				list_move_tail(&cmd->i_conn_node, &conn->conn_cmd_list);
> +				WARN_ON_ONCE(se_cmd->abrt_task_compl);
> +				se_cmd->abrt_task_compl = &compl;
> +			}
>   			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>   			spin_unlock_irq(&se_cmd->t_state_lock);
> +
> +			if (se_cmd->abrt_task_compl) {
> +				spin_unlock_bh(&conn->cmd_lock);
> +				wait_for_completion(&compl);
> +				spin_lock_bh(&conn->cmd_lock);

You can still hit your freed conn race I think. The aborted_task callout 
is not the last time we are referencing the iscsi cmd and conn. That 
code path still has to do the release_cmd callout for example. Once we 
get past this wait_for_completion the abort code path could be still 
running. If this iscsit_release_commands_from_conn completes first then 
we can hit your case where we free the conn before the abort code path 
has done release_cmd and we could hit that cmd->conn->sess reference.

I think if you do the complete in iscsit_release_cmd then we would not 
hit that issue.

There might be a second issue though. What happens if aborted_task ran 
first and deleted the cmd from the conn_cmd_list. It would then be 
running while iscsit_release_commands_from_conn is running. We would 
then not do the wait_for_completion above.


> +			}
>   		}
>   	}
>   	spin_unlock_bh(&conn->cmd_lock);
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index db53a0d649da7..5611e6c00f18c 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1391,6 +1391,7 @@ void transport_init_se_cmd(
>   	init_completion(&cmd->t_transport_stop_comp);
>   	cmd->free_compl = NULL;
>   	cmd->abrt_compl = NULL;
> +	cmd->abrt_task_compl = NULL;
>   	spin_lock_init(&cmd->t_state_lock);
>   	INIT_WORK(&cmd->work, NULL);
>   	kref_init(&cmd->cmd_kref);
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index 549947d407cfd..25cc451930281 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -491,6 +491,7 @@ struct se_cmd {
>   	struct list_head	se_cmd_list;
>   	struct completion	*free_compl;
>   	struct completion	*abrt_compl;
> +	struct completion	*abrt_task_compl;

This should be on the iscsi cmd since only iscsi uses it.

>   	const struct target_core_fabric_ops *se_tfo;
>   	sense_reason_t		(*execute_cmd)(struct se_cmd *);
>   	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool, int *);
> 


  reply	other threads:[~2020-11-10 23:08 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07 14:53 [PATCH 0/2] fix race conditions with task aborts Maurizio Lombardi
2020-10-07 14:53 ` Maurizio Lombardi
2020-10-07 14:53 ` [PATCH 1/2] target: iscsi: prevent a race condition in iscsit_unmap_cmd() Maurizio Lombardi
2020-10-07 14:53   ` Maurizio Lombardi
2020-10-08  2:15   ` Bart Van Assche
2020-10-08  2:15     ` Bart Van Assche
2020-10-08  9:42     ` Maurizio Lombardi
2020-10-08  9:42       ` Maurizio Lombardi
2020-10-07 14:53 ` [PATCH 2/2] target: iscsi: fix a race condition when aborting a task Maurizio Lombardi
2020-10-07 14:53   ` Maurizio Lombardi
2020-10-22  2:42   ` Mike Christie
2020-10-22  2:42     ` Mike Christie
2020-10-27 13:49     ` Maurizio Lombardi
2020-10-27 13:49       ` Maurizio Lombardi
2020-10-27 17:54       ` Mike Christie
2020-10-27 17:54         ` Mike Christie
2020-10-27 20:03         ` Michael Christie
2020-10-27 20:03           ` Michael Christie
2020-10-28 17:09           ` Maurizio Lombardi
2020-10-28 17:09             ` Maurizio Lombardi
2020-10-28 20:37             ` Mike Christie
2020-10-28 20:37               ` Mike Christie
2020-11-10 21:29               ` Maurizio Lombardi
2020-11-10 21:29                 ` Maurizio Lombardi
2020-11-10 23:08                 ` Mike Christie [this message]
2020-11-10 23:08                   ` Mike Christie
2020-11-11  2:16             ` Mike Christie
2020-11-11  2:16               ` Mike Christie
2020-11-11 14:58               ` Maurizio Lombardi
2020-11-11 14:58                 ` Maurizio Lombardi
2020-11-11 15:37                 ` Michael Christie
2020-11-11 15:37                   ` Michael Christie
2020-11-11 15:48                   ` Maurizio Lombardi
2020-11-11 15:48                     ` Maurizio Lombardi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4e336e71-2739-186d-1f8a-5a2f406aebdb@oracle.com \
    --to=michael.christie@oracle.com \
    --cc=bvanassche@acm.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=m.lombardi85@gmail.com \
    --cc=martin.petersen@oracle.com \
    --cc=mlombard@redhat.com \
    --cc=target-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.