All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Christie <michael.christie@oracle.com>
To: Dmitry Bogdanov <d.bogdanov@yadro.com>
Cc: mlombard@redhat.com, martin.petersen@oracle.com,
	mgurtovoy@nvidia.com, sagi@grimberg.me,
	linux-scsi@vger.kernel.org, target-devel@vger.kernel.org
Subject: Re: [PATCH v2 11/13] scsi: target: Treat CMD_T_FABRIC_STOP like CMD_T_STOP
Date: Tue, 17 Jan 2023 12:45:56 -0600	[thread overview]
Message-ID: <5574fcc5-49ea-0cb4-bb95-cb011f21fc8c@oracle.com> (raw)
In-Reply-To: <20230117115257.GC31614@yadro.com>

On 1/17/23 05:52, Dmitry Bogdanov wrote:
> On Fri, Jan 13, 2023 at 05:15:12PM +0300, Dmitry Bogdanov wrote:
>> On Wed, Jan 11, 2023 at 09:08:30PM -0600, Mike Christie wrote:
>>>
>>> iscsit will set CMD_T_FABRIC_STOP on running commands when its transport
>>> connection is down and it can't send/recv IO (tx/rx threads are killed
>>> or the cleanup thread is run from the one thats up). It will then loop
>>> over running commands and wait for LIO core to complete them or clean
>>> them up if they were on an internal queue waiting to be sent or ackd.
>>>
>>> Currently, CMD_T_FABRIC_STOP only stops TMRs from operating on the
>>> command but for isert we need to prevent LIO core from calling into
>>> iscsit callouts when the connection is being brought down. If LIO core
>>> queues commands to iscsit and it ends up adding to an internal queue
>>> instead of passing back to the driver then we can end up hanging waiting
>>> on command completion that never occurs because it's stuck on the internal
>>> list (the tx thread is stopped at this time, so it will never loop over
>>> the response list and call into isert). We also want to sync up on a
>>> point where we no longer call into isert so it can cleanup it's structs.
>>>
>>> This has LIO core treat CMD_T_FABRIC_STOP like CMD_T_STOP during
>>> command execution and also fixes the locking around the
>>> target_cmd_interrupted calls so we don't have a case where a command
>>> is marked CMD_T_COMPLETE and CMD_T_STOP|CMD_T_FABRIC_STOP at the same
>>> time.
>>>
>>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
>>> ---
>>>  drivers/target/target_core_sbc.c       |  2 +-
>>>  drivers/target/target_core_transport.c | 27 +++++++++++++++-----------
>>>  2 files changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
>>> index 7536ca797606..56136613767f 100644
>>> --- a/drivers/target/target_core_sbc.c
>>> +++ b/drivers/target/target_core_sbc.c
>>> @@ -459,7 +459,7 @@ static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool succes
>>>                  * we don't have to perform the write operation.
>>>                  */
>>>                 WARN_ON(!(cmd->transport_state &
>>> -                       (CMD_T_ABORTED | CMD_T_STOP)));
>>> +                       (CMD_T_ABORTED | CMD_T_STOP | CMD_T_FABRIC_STOP)));
>>>                 goto out;
>>>         }
>>>         /*
>>> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
>>> index cb3fdc81ba3b..02a9476945dc 100644
>>> --- a/drivers/target/target_core_transport.c
>>> +++ b/drivers/target/target_core_transport.c
>>> @@ -737,8 +737,8 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
>>>          * Determine if frontend context caller is requesting the stopping of
>>>          * this command for frontend exceptions.
>>>          */
>>> -       if (cmd->transport_state & CMD_T_STOP) {
>>> -               pr_debug("%s:%d CMD_T_STOP for ITT: 0x%08llx\n",
>>> +       if (cmd->transport_state & (CMD_T_STOP | CMD_T_FABRIC_STOP)) {
>>> +               pr_debug("%s:%d CMD_T_STOP|CMD_T_FABRIC_STOP for ITT: 0x%08llx\n",
>>>                         __func__, __LINE__, cmd->tag);
>>>
>>>                 spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>>> @@ -889,7 +889,7 @@ static bool target_cmd_interrupted(struct se_cmd *cmd)
>>>                 INIT_WORK(&cmd->work, target_abort_work);
>>>                 queue_work(target_completion_wq, &cmd->work);
>>>                 return true;
>>> -       } else if (cmd->transport_state & CMD_T_STOP) {
>>> +       } else if (cmd->transport_state & (CMD_T_STOP | CMD_T_FABRIC_STOP)) {
>>>                 if (cmd->transport_complete_callback)
>>>                         cmd->transport_complete_callback(cmd, false, &post_ret);
>>>                 complete_all(&cmd->t_transport_stop_comp);
>>> @@ -907,13 +907,15 @@ void target_complete_cmd_with_sense(struct se_cmd *cmd, u8 scsi_status,
>>>         int success, cpu;
>>>         unsigned long flags;
>>>
>>> -       if (target_cmd_interrupted(cmd))
>>> +       spin_lock_irqsave(&cmd->t_state_lock, flags);
>>
>> That leads to a hardlock because
>> target_cmd_interrupted() => cmd->transport_complete_callback() also tooks
>> cmd->t_state_lock.

We shouldn't lock up because for the failure cases the transport_complete_callback
functions don't take the lock. They just cleanup and return.

> But the taking the lock for read+write of cmd->t*_state is absolutelly right!
> It would be great if you manage to move transport_complete_callback()
> into other thread/job.

I'm not sure why we want to do that since none of the transport_complete_callback
sleep on failure. It seems to just add more complexity and we only have the 2
transport_complete_callback uses now, so it seems overkill.




  reply	other threads:[~2023-01-17 19:52 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12  3:08 [PATCH v2 00/13] target task management fixes Mike Christie
2023-01-12  3:08 ` [PATCH v2 01/13] scsi: target: Move sess cmd counter to new struct Mike Christie
2023-01-12  3:08 ` [PATCH v2 02/13] scsi: target: Move cmd counter allocation Mike Christie
2023-01-12  3:08 ` [PATCH v2 03/13] scsi: target: Pass in cmd counter to use during cmd setup Mike Christie
2023-01-12  3:08 ` [PATCH v2 04/13] scsi: target: iscsit/isert: Alloc per conn cmd counter Mike Christie
2023-01-12  3:08 ` [PATCH v2 05/13] scsi: target: iscsit: stop/wait on cmds during conn close Mike Christie
2023-01-12  3:08 ` [PATCH v2 06/13] scsi: target: iscsit: Fix TAS handling during conn cleanup Mike Christie
2023-01-12  3:08 ` [PATCH v2 07/13] scsi: target: Fix multiple LUN_RESET handling Mike Christie
2023-01-12  3:08 ` [PATCH v2 08/13] scsi: target: drop tas arg from __transport_wait_for_tasks Mike Christie
2023-01-12  3:08 ` [PATCH v2 09/13] scsi: target: iscsit: Fix isert disconnect handling during login Mike Christie
2023-01-13 14:08   ` Dmitry Bogdanov
2023-01-18  2:14     ` Mike Christie
2023-01-23  9:53       ` Sagi Grimberg
2023-01-24  2:40         ` michael.christie
2023-01-12  3:08 ` [PATCH v2 10/13] scsi: target: iscsit: Add helper to check when cmd has failed Mike Christie
2023-01-12  3:08 ` [PATCH v2 11/13] scsi: target: Treat CMD_T_FABRIC_STOP like CMD_T_STOP Mike Christie
2023-01-13 14:15   ` Dmitry Bogdanov
2023-01-17 11:52     ` Dmitry Bogdanov
2023-01-17 18:45       ` Mike Christie [this message]
2023-01-17 19:55         ` Dmitry Bogdanov
2023-01-18 19:12           ` Mike Christie
2023-01-12  3:08 ` [PATCH v2 12/13] scsi: target: iscsit: Cleanup isert commands at conn closure Mike Christie
2023-01-12  3:08 ` [PATCH v2 13/13] IB/isert: Fix hang in target_wait_for_cmds Mike Christie
2023-01-23  9:50   ` Sagi Grimberg
2023-01-12  3:13 ` [PATCH v2 00/13] target task management fixes Mike Christie
2023-01-23  9:54   ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5574fcc5-49ea-0cb4-bb95-cb011f21fc8c@oracle.com \
    --to=michael.christie@oracle.com \
    --cc=d.bogdanov@yadro.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=mlombard@redhat.com \
    --cc=sagi@grimberg.me \
    --cc=target-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.