From: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
To: "Elliott, Robert (Servers)" <elliott@hpe.com>
Cc: "martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"sathya.prakash@broadcom.com" <sathya.prakash@broadcom.com>,
	"suganath-prabu.subramani@broadcom.com" 
	<suganath-prabu.subramani@broadcom.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"amit@kernel.org" <amit@kernel.org>
Subject: Re: [PATCH] mpt3sas: Fix kernel panic observed on soft HBA unplug
Date: Mon, 16 Mar 2020 11:45:15 +0530
Message-ID: <CAK=zhgqWJs+Wbmgy9xp6WDRp2w5e+5BGD+R5mck-dVh5oOUQ0g@mail.gmail.com>
In-Reply-To: <DF4PR8401MB12415ADC9760286F3930DBE4ABFB0@DF4PR8401MB1241.NAMPRD84.PROD.OUTLOOK.COM>

On Sat, Mar 14, 2020 at 7:56 AM Elliott, Robert (Servers)
<elliott@hpe.com> wrote:
>
>
>
> > -----Original Message-----
> > From: linux-scsi-owner@vger.kernel.org <linux-scsi-
> > owner@vger.kernel.org> On Behalf Of Sreekanth Reddy
> > Sent: Wednesday, March 11, 2020 5:37 AM
> > To: martin.petersen@oracle.com
> > Cc: linux-scsi@vger.kernel.org; sathya.prakash@broadcom.com; suganath-
> > prabu.subramani@broadcom.com; stable@vger.kernel.org; amit@kernel.org;
> > Sreekanth Reddy <sreekanth.reddy@broadcom.com>
> > Subject: [PATCH] mpt3sas: Fix kernel panic observed on soft HBA unplug
> >
> > Generic protection fault type kernel panic is observed when user
> > performs soft(ordered) HBA unplug operation while IOs are running
> > on drives connected to HBA.
> >
> > When user performs ordered HBA removal operation then kernel calls
> > PCI device's .remove() call back function where driver is flushing out
> > all the outstanding SCSI IO commands with DID_NO_CONNECT host byte and
> > also un-maps sg buffers allocated for these IO commands.
> > But in the ordered HBA removal case (unlike a real HBA hot unplug)
> > HBA device is still alive and hence HBA hardware is performing the
> > DMA operations to those buffers on the system memory which are already
> > unmapped while flushing out the outstanding SCSI IO commands
> > and this leads to Kernel panic.
> >
> > Fix:
> > Don't flush out the outstanding IOs from .remove() path in case of
> > ordered HBA removal since HBA will be still alive in this case and
> > it can complete the outstanding IOs. Flush out the outstanding IOs
> > only in case of physical HBA hot unplug where there won't be any
> > communication with the HBA.
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
> > ---
> >  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > index 778d5e6..04a40af 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -9908,8 +9908,8 @@ static void scsih_remove(struct pci_dev *pdev)
> >
> >       ioc->remove_host = 1;
> >
> > -     mpt3sas_wait_for_commands_to_complete(ioc);
>
> Immediately removing the driver with IOs pending seems dangerous.
>
> That function includes a timeout to avoid hanging forever, which
> is reasonable (avoid hanging during system shutdown). Perhaps the
> kernel panic was happening because that function timed out?
>
> Reporting a warning or error and doing special handling might be
> appropriate if that occurs. That should be rare, though; the normal
> case should be to cleanly finish any outstanding commands.
>
> > -     _scsih_flush_running_cmds(ioc);
> > +     if (!pci_device_is_present(pdev))
> > +             _scsih_flush_running_cmds(ioc);
>
> If that branch is not taken, then it proceeds to remove the driver
> with IOs pending. That'll wipe out all sorts of ioc structures
> and things like interrupt handler code, leaving memory mapped forever
> (no code left to call scsi_dma_unmap). That might be better than
> a kernel panic, but still not good.

In the unload path, the driver calls the sas_remove_host() API before
releasing the resources. sas_remove_host() waits for all the
outstanding IOs to complete, so the driver is indirectly waiting for
the outstanding IOs to be processed before releasing the HBA
resources. Only in the cases where the HBA is inaccessible (e.g. the
HBA unplug case) does the driver flush out the outstanding commands,
which avoids the SCSI error handling overhead and lets the driver
unload operation complete quickly.
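
To make the ordering concrete, here is a minimal userspace sketch of
the two removal cases. It is not the driver code: struct fake_hba and
the helpers flush_running_cmds(), wait_for_outstanding() and
fake_remove() are hypothetical stand-ins, and the "present" flag only
models what pci_device_is_present(pdev) reports in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state (assumption, not the
 * real mpt3sas adapter structure). */
struct fake_hba {
	bool present;     /* models pci_device_is_present(pdev) */
	int outstanding;  /* IOs still owned by the hardware */
	int flushed;      /* IOs failed by the driver (DID_NO_CONNECT) */
	int completed;    /* IOs completed normally by the hardware */
};

/* Models _scsih_flush_running_cmds(): the driver itself completes
 * every outstanding IO with an error. */
static void flush_running_cmds(struct fake_hba *hba)
{
	hba->flushed += hba->outstanding;
	hba->outstanding = 0;
}

/* Models the wait done during host removal: a live HBA finishes its
 * own IOs, so nothing is unmapped while DMA may still be in flight. */
static void wait_for_outstanding(struct fake_hba *hba)
{
	while (hba->outstanding > 0) {
		hba->outstanding--;
		hba->completed++;
	}
}

/* Models the .remove() ordering discussed in this thread. */
static void fake_remove(struct fake_hba *hba)
{
	/* Flush only when the HBA is gone and cannot complete anything. */
	if (!hba->present)
		flush_running_cmds(hba);

	/* Otherwise the hardware drains the IOs before teardown. */
	wait_for_outstanding(hba);

	/* Resources would be released only past this point. */
	assert(hba->outstanding == 0);
}
```

In this model a soft (ordered) unplug leaves "flushed" at zero and all
IOs end up in "completed", while a physical hot unplug moves every
outstanding IO to "flushed" without waiting on the hardware.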

>
>


Thread overview: 10+ messages
2020-03-11 10:36 [PATCH] mpt3sas: Fix kernel panic observed on soft HBA unplug Sreekanth Reddy
2020-03-11 11:04 ` Amit Shah
2020-03-11 11:25   ` Sreekanth Reddy
2020-03-11 11:49     ` Sreekanth Reddy
2020-03-11 14:48       ` Amit Shah
2020-03-14  2:25       ` Elliott, Robert (Servers)
2020-03-14  2:25 ` Elliott, Robert (Servers)
2020-03-16  6:15   ` Sreekanth Reddy [this message]
2020-03-27  1:50     ` Martin K. Petersen
2020-03-27 10:35       ` Sreekanth Reddy
