From: Mike Christie <michael.christie@oracle.com> To: Maurizio Lombardi <mlombard@redhat.com>, martin.petersen@oracle.com Cc: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org, bvanassche@acm.org Subject: Re: [PATCH 2/2] target: iscsi: fix a race condition when aborting a task Date: Tue, 27 Oct 2020 17:54:10 +0000 [thread overview] Message-ID: <184667b1-032b-c36f-d1e7-5cfef961c763@oracle.com> (raw) In-Reply-To: <1852a8bd-3edc-5c49-fa51-9afe52f125a8@redhat.com> On 10/27/20 8:49 AM, Maurizio Lombardi wrote: > Hello Mike, > > Dne 22. 10. 20 v 4:42 Mike Christie napsal(a): >> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do: >> >> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP. >> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do: >> >> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd >> >> The cmd is now freed. >> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released. >> >> > > Thanks for the review! > > There are definitely some problems with task aborts and commands' refcounting * > but this is a different bug than the one this patch is trying to solve (a race to list_del_init()); > unless you are saying that abort tasks should never be executed when the connection > is going down and we have to prevent such cases from happening at all. Yeah, I think if we prevent the race then we fix the refcount issue and your issue. Here is a patch that is only compile tested: From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001 From: Mike Christie <michael.christie@oracle.com> Date: Tue, 27 Oct 2020 12:30:53 -0500 Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race The abort and cmd stop paths can race where: 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP. 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do: target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd The cmd is now freed. 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released. In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the CMD_T_ABORTED if the driver is not cleaning up the cmd because of a session shutdown. However, iscsit_release_commands_from_conn only sets the CMD_T_FABRIC_STOP and does not check to see if the abort path has claimed completion ownership of the command. This adds a check in iscsit_release_commands_from_conn so only the abort or fabric stop path cleanup the command. --- drivers/target/iscsi/iscsi_target.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index f77e5ee..85027d3 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd) void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd) { spin_lock_bh(&conn->cmd_lock); - if (!list_empty(&cmd->i_conn_node) && - !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP)) + if (!list_empty(&cmd->i_conn_node)) list_del_init(&cmd->i_conn_node); spin_unlock_bh(&conn->cmd_lock); @@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn) if (se_cmd->se_tfo != NULL) { spin_lock_irq(&se_cmd->t_state_lock); + if (se_cmd->transport_state & CMD_T_ABORTED) { + /* + * LIO's abort path owns the cleanup for this, + * so put it back on the list and let + * aborted_task handle it. + */ + list_add_tail(&cmd->i_conn_node, + &conn->conn_cmd_list); + continue; + } se_cmd->transport_state |= CMD_T_FABRIC_STOP; spin_unlock_irq(&se_cmd->t_state_lock); } -- 1.8.3.1
WARNING: multiple messages have this Message-ID (diff)
From: Mike Christie <michael.christie@oracle.com> To: Maurizio Lombardi <mlombard@redhat.com>, martin.petersen@oracle.com Cc: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org, bvanassche@acm.org Subject: Re: [PATCH 2/2] target: iscsi: fix a race condition when aborting a task Date: Tue, 27 Oct 2020 12:54:10 -0500 [thread overview] Message-ID: <184667b1-032b-c36f-d1e7-5cfef961c763@oracle.com> (raw) In-Reply-To: <1852a8bd-3edc-5c49-fa51-9afe52f125a8@redhat.com> On 10/27/20 8:49 AM, Maurizio Lombardi wrote: > Hello Mike, > > Dne 22. 10. 20 v 4:42 Mike Christie napsal(a): >> If we free the cmd from the abort path, then for your conn stop plus abort race case, could we do: >> >> 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP. >> 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do: >> >> target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd >> >> The cmd is now freed. >> 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released. >> >> > > Thanks for the review! > > There are definitely some problems with task aborts and commands' refcounting * > but this is a different bug than the one this patch is trying to solve (a race to list_del_init()); > unless you are saying that abort tasks should never be executed when the connection > is going down and we have to prevent such cases from happening at all. Yeah, I think if we prevent the race then we fix the refcount issue and your issue. Here is a patch that is only compile tested: From 209709bcedd9a6ce6003e6bb86f3ebf547dca6af Mon Sep 17 00:00:00 2001 From: Mike Christie <michael.christie@oracle.com> Date: Tue, 27 Oct 2020 12:30:53 -0500 Subject: [PATCH] iscsi target: fix cmd abort vs fabric stop race The abort and cmd stop paths can race where: 1. thread1 runs iscsit_release_commands_from_conn and sets CMD_T_FABRIC_STOP. 2. thread2 runs iscsit_aborted_task and then does __iscsit_free_cmd. It then returns from the aborted_task callout and we finish target_handle_abort and do: target_handle_abort -> transport_cmd_check_stop_to_fabric -> lio_check_stop_free -> target_put_sess_cmd The cmd is now freed. 3. thread1 now finishes iscsit_release_commands_from_conn and runs iscsit_free_cmd while accessing a command we just released. In __target_check_io_state we check for CMD_T_FABRIC_STOP and set the CMD_T_ABORTED if the driver is not cleaning up the cmd because of a session shutdown. However, iscsit_release_commands_from_conn only sets the CMD_T_FABRIC_STOP and does not check to see if the abort path has claimed completion ownership of the command. This adds a check in iscsit_release_commands_from_conn so only the abort or fabric stop path cleanup the command. --- drivers/target/iscsi/iscsi_target.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index f77e5ee..85027d3 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -483,8 +483,7 @@ int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd) void iscsit_aborted_task(struct iscsi_conn *conn, struct iscsi_cmd *cmd) { spin_lock_bh(&conn->cmd_lock); - if (!list_empty(&cmd->i_conn_node) && - !(cmd->se_cmd.transport_state & CMD_T_FABRIC_STOP)) + if (!list_empty(&cmd->i_conn_node)) list_del_init(&cmd->i_conn_node); spin_unlock_bh(&conn->cmd_lock); @@ -4088,6 +4087,16 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn) if (se_cmd->se_tfo != NULL) { spin_lock_irq(&se_cmd->t_state_lock); + if (se_cmd->transport_state & CMD_T_ABORTED) { + /* + * LIO's abort path owns the cleanup for this, + * so put it back on the list and let + * aborted_task handle it. + */ + list_add_tail(&cmd->i_conn_node, + &conn->conn_cmd_list); + continue; + } se_cmd->transport_state |= CMD_T_FABRIC_STOP; spin_unlock_irq(&se_cmd->t_state_lock); } -- 1.8.3.1
next prev parent reply other threads:[~2020-10-27 17:54 UTC|newest] Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-10-07 14:53 [PATCH 0/2] fix race conditions with task aborts Maurizio Lombardi 2020-10-07 14:53 ` Maurizio Lombardi 2020-10-07 14:53 ` [PATCH 1/2] target: iscsi: prevent a race condition in iscsit_unmap_cmd() Maurizio Lombardi 2020-10-07 14:53 ` Maurizio Lombardi 2020-10-08 2:15 ` Bart Van Assche 2020-10-08 2:15 ` Bart Van Assche 2020-10-08 9:42 ` Maurizio Lombardi 2020-10-08 9:42 ` Maurizio Lombardi 2020-10-07 14:53 ` [PATCH 2/2] target: iscsi: fix a race condition when aborting a task Maurizio Lombardi 2020-10-07 14:53 ` Maurizio Lombardi 2020-10-22 2:42 ` Mike Christie 2020-10-22 2:42 ` Mike Christie 2020-10-27 13:49 ` Maurizio Lombardi 2020-10-27 13:49 ` Maurizio Lombardi 2020-10-27 17:54 ` Mike Christie [this message] 2020-10-27 17:54 ` Mike Christie 2020-10-27 20:03 ` Michael Christie 2020-10-27 20:03 ` Michael Christie 2020-10-28 17:09 ` Maurizio Lombardi 2020-10-28 17:09 ` Maurizio Lombardi 2020-10-28 20:37 ` Mike Christie 2020-10-28 20:37 ` Mike Christie 2020-11-10 21:29 ` Maurizio Lombardi 2020-11-10 21:29 ` Maurizio Lombardi 2020-11-10 23:08 ` Mike Christie 2020-11-10 23:08 ` Mike Christie 2020-11-11 2:16 ` Mike Christie 2020-11-11 2:16 ` Mike Christie 2020-11-11 14:58 ` Maurizio Lombardi 2020-11-11 14:58 ` Maurizio Lombardi 2020-11-11 15:37 ` Michael Christie 2020-11-11 15:37 ` Michael Christie 2020-11-11 15:48 ` Maurizio Lombardi 2020-11-11 15:48 ` Maurizio Lombardi
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=184667b1-032b-c36f-d1e7-5cfef961c763@oracle.com \ --to=michael.christie@oracle.com \ --cc=bvanassche@acm.org \ --cc=linux-scsi@vger.kernel.org \ --cc=martin.petersen@oracle.com \ --cc=mlombard@redhat.com \ --cc=target-devel@vger.kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.