target-devel.vger.kernel.org archive mirror
From: Dmitry Bogdanov <d.bogdanov@yadro.com>
To: Mike Christie <michael.christie@oracle.com>
Cc: <mlombard@redhat.com>, <martin.petersen@oracle.com>,
	<mgurtovoy@nvidia.com>, <sagi@grimberg.me>,
	<linux-scsi@vger.kernel.org>, <target-devel@vger.kernel.org>
Subject: Re: [PATCH 13/18] scsi: target: Fix multiple LUN_RESET handling
Date: Wed, 15 Mar 2023 22:11:41 +0300
Message-ID: <20230315191141.GF1031@yadro.com>
In-Reply-To: <50afe378-c0e8-7914-377e-ae8c91f82455@oracle.com>

On Wed, Mar 15, 2023 at 11:44:48AM -0500, Mike Christie wrote:
> 
> On 3/15/23 11:13 AM, Dmitry Bogdanov wrote:
> > On Thu, Mar 09, 2023 at 04:33:07PM -0600, Mike Christie wrote:
> >>
> >> This fixes a bug where an initiator thinks a LUN_RESET has cleaned
> >> up running commands when it hasn't. The bug was added in:
> >>
> >> commit 51ec502a3266 ("target: Delete tmr from list before processing")
> >>
> >> The problem occurs when:
> >>
> >> 1. We have N IO cmds running in the target layer spread over 2 sessions.
> >> 2. The initiator sends a LUN_RESET for each session.
> >> 3. session1's LUN_RESET loops over all the running commands from both
> >> sessions and moves them to its local drain_task_list.
> >> 4. session2's LUN_RESET does not see the LUN_RESET from session1 because
> >> the commit above has it remove itself. session2 also does not see any
> >> commands since the other reset moved them off the state lists.
> > >> 5. session2's LUN_RESET will then complete with a successful response.
> > >> 6. session2's initiator believes the running commands on its session are
> >> now cleaned up due to the successful response and cleans up the running
> >> commands from its side. It then restarts them.
> >> 7. The commands do eventually complete on the backend and the target
> >> starts to return aborted task statuses for them. The initiator will
> > >> either throw an invalid ITT error or might accidentally look up a new task
> >> if the ITT has been reallocated already.
> >>
> >> This fixes the bug by reverting the patch, and also serializes the
> >> execution of LUN_RESETs and Preempt and Aborts. The latter is necessary
> >> because it turns out the commit accidentally fixed a bug where if there
> >> are 2 LUN RESETs executing they can see each other on the dev_tmr_list,
> >> put the other one on their local drain list, then end up waiting on each
> >> other resulting in a deadlock.
> >
> > If LUN_RESET is not in TMR list anymore there is no need to serialize
> > core_tmr_drain_tmr_list.
> 
> Ah shoot yeah I miswrote that. I meant I needed the serialization for my
> bug not yours.

I still don't get why you are wrapping core_tmr_drain_*_list in a mutex.
general_tmr_list has only aborts now, and they do not wait for other aborts.

> 
> >>
> >>         if (cmd->transport_state & CMD_T_ABORTED)
> >> @@ -3596,6 +3597,22 @@ static void target_tmr_work(struct work_struct *work)
> >>                         target_dev_ua_allocate(dev, 0x29,
> >>                                                ASCQ_29H_BUS_DEVICE_RESET_FUNCTION_OCCURRED);
> >>                 }
> >> +
> >> +               /*
> >> +                * If this is the last reset the device can be freed after we
> >> +                * run transport_cmd_check_stop_to_fabric. Figure out if there
> >> +                * are other resets that need to be scheduled while we know we
> >> +                * have a refcount on the device.
> >> +                */
> >> +               spin_lock_irq(&dev->se_tmr_lock);
> >
> > > tmr->tmr_list is removed from the list at the very end of the se_cmd
> > > lifecycle, so any number of LUN_RESETs can be in lun_reset_tmr_list. And
> > > all of them can be finished but not yet removed from the list.
> 
> Don't we remove it from the list a little later in this function when
> we call transport_lun_remove_cmd?

OMG, yes, of course, you are right. I mixed something up.

But I still have concerns:
transport_lookup_tmr_lun (where the LUN_RESET is added to the list) and
transport_generic_handle_tmr (where the LUN_RESET is scheduled for handling)
are not serialized. And below you can start the second LUN_RESET while
transport_generic_handle_tmr has not yet been called for it. The _handle_tmr
call could be delayed long enough for that second LUN_RESET to be handled
and the flag to be cleared. _handle_tmr will then start the work again.
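
To make the ordering concrete, here is a userspace sketch of that
interleaving, stepped through sequentially rather than with real threads.
Everything in it (struct fake_dev, struct fake_tmr, lookup_tmr_lun,
handle_tmr, finish_reset, replay_race) is a made-up stand-in and not the
target core code. It only assumes that the patch starts the next queued
reset when the running one finishes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and TMR state; not kernel code. */
struct fake_tmr {
	bool on_list;   /* set by the lookup step */
	int  runs;      /* how many times the work was executed */
};

struct fake_dev {
	bool resetting;            /* models DF_RESETTING_LUN */
	struct fake_tmr *queued;   /* models lun_reset_tmr_list (one entry) */
};

/* Models transport_lookup_tmr_lun: the reset becomes visible on the list. */
static void lookup_tmr_lun(struct fake_dev *dev, struct fake_tmr *tmr)
{
	tmr->on_list = true;
	dev->queued = tmr;
}

/* Models the work function: run the reset, unlink it, clear the flag. */
static void run_reset(struct fake_dev *dev, struct fake_tmr *tmr)
{
	dev->resetting = true;
	tmr->runs++;
	tmr->on_list = false;
	if (dev->queued == tmr)
		dev->queued = NULL;
	dev->resetting = false;
}

/* Models transport_generic_handle_tmr: start the work unless one is running. */
static void handle_tmr(struct fake_dev *dev, struct fake_tmr *tmr)
{
	if (dev->resetting)
		return;
	run_reset(dev, tmr);
}

/* Models the end of the first reset picking up whatever is on the list. */
static void finish_reset(struct fake_dev *dev)
{
	dev->resetting = false;
	if (dev->queued)
		run_reset(dev, dev->queued);
}

/* Replays the interleaving and returns how often reset 2 was executed. */
static int replay_race(void)
{
	struct fake_dev dev = { .resetting = true, .queued = NULL }; /* reset 1 running */
	struct fake_tmr reset2 = { .on_list = false, .runs = 0 };

	lookup_tmr_lun(&dev, &reset2);  /* reset 2 is on the list ...          */
	finish_reset(&dev);             /* ... reset 1 ends and starts reset 2 */
	handle_tmr(&dev, &reset2);      /* delayed _handle_tmr starts it again */
	return reset2.runs;
}
```

In this model replay_race() reports that the second reset ran twice,
which is the double start described above.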

Is it worth doing that list management? Is it not enough to just wrap the
call to core_tmr_lun_reset() in target_tmr_work in a mutex?
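
A minimal userspace sketch of that simpler alternative, using pthreads in
place of the kernel mutex API. lun_reset_mutex, fake_lun_reset, tmr_work
and run_resets are hypothetical names; fake_lun_reset merely stands in for
core_tmr_lun_reset(), and the counters exist only to check that two resets
never overlap.

```c
#include <pthread.h>

/* Hypothetical per-device state; lun_reset_mutex is not an existing field. */
struct fake_dev {
	pthread_mutex_t lun_reset_mutex;
	int active;        /* resets currently inside the critical section */
	int max_active;    /* highest overlap ever observed */
	int completed;     /* resets that ran to completion */
};

/* Stand-in for core_tmr_lun_reset(): records how many resets overlap. */
static void fake_lun_reset(struct fake_dev *dev)
{
	dev->active++;
	if (dev->active > dev->max_active)
		dev->max_active = dev->active;
	dev->completed++;
	dev->active--;
}

/* Stand-in for the LUN_RESET branch of target_tmr_work. */
static void *tmr_work(void *arg)
{
	struct fake_dev *dev = arg;

	pthread_mutex_lock(&dev->lun_reset_mutex);
	fake_lun_reset(dev);    /* only one reset at a time gets here */
	pthread_mutex_unlock(&dev->lun_reset_mutex);
	return NULL;
}

/* Runs up to 16 concurrent resets; returns the maximum overlap seen. */
static int run_resets(struct fake_dev *dev, int n)
{
	pthread_t threads[16];
	int created = 0;
	int i;

	if (n > 16)
		n = 16;
	pthread_mutex_init(&dev->lun_reset_mutex, NULL);
	dev->active = dev->max_active = dev->completed = 0;

	for (i = 0; i < n; i++) {
		if (pthread_create(&threads[created], NULL, tmr_work, dev) == 0)
			created++;
	}
	for (i = 0; i < created; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_destroy(&dev->lun_reset_mutex);
	return dev->max_active;
}
```

With the mutex held around the reset, no list or flag is needed to keep
the resets from claiming each other's commands; each one simply waits its
turn.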


> >
> > You may delete lun_reset here with nulling tmr->tmr_dev:
> > +                     list_del_init(&cmd->se_tmr_req->tmr_list);
> > +                     cmd->se_tmr_req->tmr_dev = NULL;
> >
> > Then the check below will be just
> > +                     if (!list_empty(&dev->lun_reset_tmr_list))
> 
> I could go either way on this. Normally it's best to just have the one
> place where we handle something like the deletion and clearing. If I'm
> correct then it's already done a little later in this function so we
> are ok.
> 
> On the other hand, yeah my test is kind of gross.
> 
> 
> >>
> >> +       spin_lock_irqsave(&dev->se_tmr_lock, flags);
> >> +       if (cmd->se_tmr_req->function == TMR_LUN_RESET) {
> >> +               /*
> >> +                * We only allow one reset to execute at a time to prevent
> >> +                * one reset waiting on another, and to make sure one reset
> >> +                * does not claim all the cmds causing the other reset to
> >> +                * return early.
> >> +                */
> >> +               if (dev->dev_flags & DF_RESETTING_LUN) {
> >> +                       spin_unlock_irqrestore(&dev->se_tmr_lock, flags);
> >> +                       goto done;
> >> +               }
> >> +
> >> +               dev->dev_flags |= DF_RESETTING_LUN;
> >
> > Not a good choice of flag variable. It is used at configuration time and
> > not under a lock. Configfs file dev/alias can be changed at any time
> > and could race with LUN_RESET.
> 
> I didn't see any places where one place can overwrite other flags. Are
> you just saying in general it could happen. If so, would you also not
> want dev->transport_flags to be used then?

Yes, in general, bit setting is not atomic; a write of one bit can
clear another bit that is being written in parallel.
Better to have a separate variable used only under the lock.
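
The lost update can be replayed deterministically in userspace by writing
out the load and store halves of each `|=` by hand. FLAG_ALIAS and
FLAG_RESETTING are made-up bit names, not the real DF_* values, and the
second helper assumes the reset state lives in its own field that is only
touched with the TMR lock held.

```c
#include <stdbool.h>

#define FLAG_ALIAS     0x1u   /* hypothetical bit set from configfs */
#define FLAG_RESETTING 0x2u   /* hypothetical bit set by LUN_RESET */

/*
 * Replays two unlocked "flags |= bit" operations whose load and store
 * steps interleave. Returns the final value of the shared word.
 */
static unsigned int replay_shared_word(void)
{
	unsigned int flags = 0;
	unsigned int cfg_copy, tmr_copy;

	cfg_copy = flags;                     /* configfs writer loads 0 */
	tmr_copy = flags;                     /* reset path loads 0 too  */
	flags = tmr_copy | FLAG_RESETTING;    /* reset path stores 0x2   */
	flags = cfg_copy | FLAG_ALIAS;        /* configfs stores 0x1     */
	return flags;                         /* FLAG_RESETTING is lost  */
}

/* Hypothetical layout: reset state kept apart from the config flags. */
struct fake_dev {
	unsigned int flags;   /* written at configuration time, unlocked */
	bool resetting;       /* written only with the TMR lock held */
};

/* Same interleaving, but the reset path never touches dev.flags. */
static struct fake_dev replay_separate_field(void)
{
	struct fake_dev dev = { 0, false };
	unsigned int cfg_copy;

	cfg_copy = dev.flags;                 /* configfs writer loads 0 */
	dev.resetting = true;                 /* reset path, under lock  */
	dev.flags = cfg_copy | FLAG_ALIAS;    /* configfs stores 0x1     */
	return dev;
}
```

In the first replay the resetting bit vanishes because the configfs store
writes back a stale copy of the word; in the second, both pieces of state
survive because they no longer share a word.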



