From: Mike Christie <michael.christie@oracle.com>
To: Maurizio Lombardi <mlombard@redhat.com>, "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org, bvanassche@acm.org, m.lombardi85@gmail.com
Subject: Re: [PATCH 2/2] target: iscsi: fix a race condition when aborting a task
Date: Tue, 10 Nov 2020 23:08:04 +0000
Message-ID: <4e336e71-2739-186d-1f8a-5a2f406aebdb@oracle.com>
In-Reply-To: <840cb2fe-5642-78d0-e700-d3652021cb5d@redhat.com>

On 11/10/20 3:29 PM, Maurizio Lombardi wrote:
>
>
> On 28. 10. 20 at 21:37, Mike Christie wrote:
>>>
>>> Possible solutions that I can think of:
>>>
>>> - Make iscsit_release_commands_from_conn() wait for the abort task to finish
>>
>> Yeah you could set a completion in there then have aborted_task do the complete() call maybe?
>>
>
> We could do something like this, what do you think?
>
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 067074ef50818..ffd3dbc53a42f 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -490,13 +490,16 @@ EXPORT_SYMBOL(iscsit_queue_rsp);
>
>  void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>  {
> +	struct se_cmd *se_cmd = cmd->se_cmd.se_tfo ? &cmd->se_cmd : NULL;
> +
>  	spin_lock_bh(&conn->cmd_lock);
> -	if (!list_empty(&cmd->i_conn_node) &&
> -	    !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP))
> +	if (!list_empty(&cmd->i_conn_node))
>  		list_del_init(&cmd->i_conn_node);
>  	spin_unlock_bh(&conn->cmd_lock);
>
>  	__iscsit_free_cmd(cmd, true);
> +	if (se_cmd && se_cmd->abrt_task_compl)
> +		complete(se_cmd->abrt_task_compl);
>  }
>  EXPORT_SYMBOL(iscsit_aborted_task);
>
> @@ -4080,6 +4083,7 @@ int iscsi_target_rx_thread(void *arg)
>
>  static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>  {
> +	DECLARE_COMPLETION_ONSTACK(compl);
>  	LIST_HEAD(tmp_list);
>  	struct iscsi_cmd *cmd = NULL, *cmd_tmp = NULL;
>  	struct iscsi_session *sess = conn->sess;
> @@ -4096,8 +4100,24 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>
>  		if (se_cmd->se_tfo != NULL) {
>  			spin_lock_irq(&se_cmd->t_state_lock);
> +			if (se_cmd->transport_state & CMD_T_ABORTED) {
> +				/*
> +				 * LIO's abort path owns the cleanup for this,
> +				 * so put it back on the list and let
> +				 * aborted_task handle it.
> +				 */
> +				list_move_tail(&cmd->i_conn_node, &conn->conn_cmd_list);
> +				WARN_ON_ONCE(se_cmd->abrt_task_compl);
> +				se_cmd->abrt_task_compl = &compl;
> +			}
>  			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>  			spin_unlock_irq(&se_cmd->t_state_lock);
> +
> +			if (se_cmd->abrt_task_compl) {
> +				spin_unlock_bh(&conn->cmd_lock);
> +				wait_for_completion(&compl);
> +				spin_lock_bh(&conn->cmd_lock);

You can still hit your freed conn race I think. The aborted_task callout is not the last time we are referencing the iscsi cmd and conn. That code path still has to do the release_cmd callout for example. Once we get past this wait_for_completion the abort code path could be still running. If this iscsit_release_commands_from_conn completes first then we can hit your case where we free the conn before the abort code path has done release_cmd and we could hit that cmd->conn->sess reference.

I think if you do the complete in iscsit_release_cmd then we would not hit that issue.

There might be a second issue though. What happens if aborted_task ran first and deleted the cmd from the conn_cmd_list? It would then be running while iscsit_release_commands_from_conn is running. We would then not do the wait_for_completion above.

> +			}
>  		}
>  	}
>  	spin_unlock_bh(&conn->cmd_lock);
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index db53a0d649da7..5611e6c00f18c 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1391,6 +1391,7 @@ void transport_init_se_cmd(
>  	init_completion(&cmd->t_transport_stop_comp);
>  	cmd->free_compl = NULL;
>  	cmd->abrt_compl = NULL;
> +	cmd->abrt_task_compl = NULL;
>  	spin_lock_init(&cmd->t_state_lock);
>  	INIT_WORK(&cmd->work, NULL);
>  	kref_init(&cmd->cmd_kref);
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index 549947d407cfd..25cc451930281 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -491,6 +491,7 @@ struct se_cmd {
>  	struct list_head se_cmd_list;
>  	struct completion *free_compl;
>  	struct completion *abrt_compl;
> +	struct completion *abrt_task_compl;

This should be on the iscsi cmd since only iscsi uses it.

>  	const struct target_core_fabric_ops *se_tfo;
>  	sense_reason_t (*execute_cmd)(struct se_cmd *);
>  	sense_reason_t (*transport_complete_callback)(struct se_cmd *, bool, int *);
>
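The ordering point in the reply above (signal the completion from the release step, not from aborted_task) can be illustrated with a small userspace sketch. This is not kernel code and not the eventual fix: the completion is a pthread-based stand-in for the kernel's struct completion, and demo_conn, demo_cmd, demo_aborted_task(), demo_release_cmd() and run_demo() are hypothetical names made up for the illustration. It assumes the teardown side always reaches the wait, so it does not cover the second issue raised above, where aborted_task has already removed the command from the list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct completion (assumption). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical miniature versions of the connection and command. */
struct demo_conn {
	bool alive;		/* false stands in for "conn has been freed" */
};

struct demo_cmd {
	struct demo_conn *conn;
	struct completion *abrt_task_compl;
	bool release_saw_live_conn;
};

/* Abort callout: per-command cleanup only, deliberately no complete(). */
static void demo_aborted_task(struct demo_cmd *cmd)
{
	assert(cmd->conn != NULL);
}

/* Release callout: the abort path's last use of cmd->conn, then signal. */
static void demo_release_cmd(struct demo_cmd *cmd)
{
	cmd->release_saw_live_conn = cmd->conn->alive;
	if (cmd->abrt_task_compl)
		complete(cmd->abrt_task_compl);
}

static void *abort_path(void *arg)
{
	struct demo_cmd *cmd = arg;

	demo_aborted_task(cmd);
	demo_release_cmd(cmd);
	return NULL;
}

/* Returns 1 if release ran against a live conn, 0 if not, -1 on error. */
int run_demo(void)
{
	struct demo_conn conn = { .alive = true };
	struct completion abort_done;
	struct demo_cmd cmd = { .conn = &conn, .abrt_task_compl = &abort_done };
	pthread_t tid;

	init_completion(&abort_done);
	if (pthread_create(&tid, NULL, abort_path, &cmd) != 0)
		return -1;

	/* Teardown side: do not "free" the conn until release has signalled. */
	wait_for_completion(&abort_done);
	conn.alive = false;

	pthread_join(tid, NULL);
	return cmd.release_saw_live_conn ? 1 : 0;
}
```

Because the signal comes after the last dereference of cmd->conn, the waiter cannot mark the connection dead while the abort path still needs it. Had complete() been called from demo_aborted_task() instead, the waiter could run ahead of demo_release_cmd(), which is the window described in the reply.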