linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
       [not found] ` <1537935759-14754-2-git-send-email-suganath-prabu.subramani@broadcom.com>
@ 2018-09-26 21:32   ` Bjorn Helgaas
  2018-09-27  7:03     ` Lukas Wunner
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2018-09-26 21:32 UTC (permalink / raw)
  To: Suganath Prabu S
  Cc: linux-scsi, linux-pci, lukas, andy.shevchenko, Sathya.Prakash,
	sreekanth.reddy, linux-kernel

[+cc LKML]

On Wed, Sep 26, 2018 at 09:52:34AM +0530, Suganath Prabu S wrote:
> * Driver uses "pci_device_is_present" to check whether
> If Hot unplugged:
> the outstanding IOs with 'DID_NO_CONNECT' before removing the drives
> attached to the HBA.

This sentence needs a verb.

> "DID_NO_CONNECT" status and free the smid, if driver detects that
> HBA is hot unplugged.

This sentence also needs a verb.

> * In the hard reset flush out all the outstanding IOs even if diag reset
> fails and also if driver detects that HBA is hot unplugged.
> 
> v1 change set:
> ==============
> unlock mutex before goto "out_unlocked",
> if active reset is in progress.
> 
> v2 change set:
> ==============
> 1) Use pci_device_is_present instead of
> mpt3sas_base_pci_device_is_unplugged.
> 2) As suggested by Lukas, removed using
> watchdog thread for checking hba hot unplug(Patch 02 of V1).
> Added Hot unplug checks in scan finish and reset paths.
> 
> v3 Change Set:
> =============
> Simplified function "mpt3sas_base_pci_device_is_available" and
> made inline.
> 
> v4 Changes:
> ===========
> Dont split strings in print statement.
> 
> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.c  | 39 ++++++++++++++++++++++++++++
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  3 ++-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 50 ++++++++++++++++++++++++++++++++----
>  3 files changed, 86 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index 59d7844..c880e72 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
> @@ -543,6 +543,20 @@ static int mpt3sas_remove_dead_ioc_func(void *arg)
>  }
>  
>  /**
> + * mpt3sas_base_pci_device_is_available - check whether pci device is
> + *			available for any transactions with FW
> + *
> + * @ioc: per adapter object
> + *
> + * Return 1 if pci device state is up and running else return 0.
> + */
> +inline bool
> +mpt3sas_base_pci_device_is_available(struct MPT3SAS_ADAPTER *ioc)
> +{
> +	return !ioc->pci_error_recovery && pci_device_is_present(ioc->pdev);
> +}
> +
> +/**
>   * _base_fault_reset_work - workq handling ioc fault conditions
>   * @work: input argument, used to derive ioc
>   *
> @@ -6122,6 +6136,11 @@ _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
>  
>  	count = 0;
>  	do {
> +		if (!pci_device_is_present(ioc->pdev)) {
> +			ioc->remove_host = 1;
> +			pr_err(MPT3SAS_FMT "Hba Hot unplugged\n", ioc->name);

You capitalized as "HBA" above.

> +			goto out;
> +		}
>  		/* Write magic sequence to WriteSequence register
>  		 * Loop until in diagnostic mode
>  		 */
> @@ -6853,6 +6872,13 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
>  
>  	ioc->pending_io_count = 0;
>  
> +	if (!mpt3sas_base_pci_device_is_available(ioc)) {
> +		pr_err(MPT3SAS_FMT
> +		    "%s: pci error recovery reset or pci device unplug occurred\n",
> +		    ioc->name, __func__);
> +		return;
> +	}
> +
>  	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);

This is a good example of why I don't like pci_device_is_present(): it
is fundamentally racy and gives a false sense of security.  Here we
*think* we're making the code safer, but in fact we could have this
sequence:

  mpt3sas_base_pci_device_is_available()    # returns true
  # device is removed
  ioc_state = mpt3sas_base_get_iocstate()

In this case the readl() inside mpt3sas_base_get_iocstate() will
probably return 0xffffffff data, and we assume that's valid and
continue on our merry way, pretending that "ioc_state" makes sense
when it really doesn't.

>  	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
>  		return;

Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-09-26 21:32   ` [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available Bjorn Helgaas
@ 2018-09-27  7:03     ` Lukas Wunner
  2018-09-27 18:47       ` Bjorn Helgaas
  0 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2018-09-27  7:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Suganath Prabu S, linux-scsi, linux-pci, andy.shevchenko,
	Sathya.Prakash, sreekanth.reddy, linux-kernel

On Wed, Sep 26, 2018 at 04:32:41PM -0500, Bjorn Helgaas wrote:
> On Wed, Sep 26, 2018 at 09:52:34AM +0530, Suganath Prabu S wrote:
> > @@ -6853,6 +6872,13 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
> >  
> >  	ioc->pending_io_count = 0;
> >  
> > +	if (!mpt3sas_base_pci_device_is_available(ioc)) {
> > +		pr_err(MPT3SAS_FMT
> > +		    "%s: pci error recovery reset or pci device unplug occurred\n",
> > +		    ioc->name, __func__);
> > +		return;
> > +	}
> > +
> >  	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> 
> This is a good example of why I don't like pci_device_is_present(): it
> is fundamentally racy and gives a false sense of security.  Here we
> *think* we're making the code safer, but in fact we could have this
> sequence:
> 
>   mpt3sas_base_pci_device_is_available()    # returns true
>   # device is removed
>   ioc_state = mpt3sas_base_get_iocstate()
> 
> In this case the readl() inside mpt3sas_base_get_iocstate() will
> probably return 0xffffffff data, and we assume that's valid and
> continue on our merry way, pretending that "ioc_state" makes sense
> when it really doesn't.

The function does the following:

	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
		return;

where MPI2_IOC_STATE_MASK is 0xF0000000 and MPI2_IOC_STATE_OPERATIONAL
is 0x20000000.  If the device is removed after the call to
mpt3sas_base_pci_device_is_available(), the result of the bitwise "and"
operation would be 0xF0000000, which is unequal to 0x20000000.
Hence this looks safe.

I agree that pci_device_is_present() (and the pci_dev_is_disconnected()
it calls) must be used judiciously, but here it seems to have been done
correctly.

One thing to be aware of is that a return value of "true" from
pci_dev_is_disconnected() is definitive and can be trusted.
On the other hand a return value of "false" is more like a fuzzy
"likely not disconnected, but can't give any guarantees".
So the boolean return value is kind of the problem here.
Boolean logic doesn't really fit these "definitive if true,
not definitive if false" semantics.

However being able to get the definitive answer in the disconnected
case is valuable:  pciehp is the only entity that can determine
surprise removal authoritatively and unambiguously (albeit with
a latency).  All the other tools that we have at our disposal don't
have that quality:  E.g. checking the Vendor ID is ambiguous because
it returns a valid value if a device was quickly replaced with another
one.  Also, all ones may be returned in the case of an Uncorrectable
Error, but the device may revert to valid responses if the error can
be recovered.  (Please correct me if I'm wrong.)

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-09-27  7:03     ` Lukas Wunner
@ 2018-09-27 18:47       ` Bjorn Helgaas
  2018-09-27 19:09         ` Lukas Wunner
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2018-09-27 18:47 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Suganath Prabu S, linux-scsi, linux-pci, andy.shevchenko,
	Sathya.Prakash, sreekanth.reddy, linux-kernel,
	Benjamin Herrenschmidt, Russell Currey, Sam Bobroff,
	Oliver O'Halloran

[+cc Ben, Russell, Sam, Oliver in case they have pertinent experience from
powerpc error handling; thread begins at
https://lore.kernel.org/linux-pci/1537935759-14754-1-git-send-email-suganath-prabu.subramani@broadcom.com/]

On Thu, Sep 27, 2018 at 09:03:27AM +0200, Lukas Wunner wrote:
> On Wed, Sep 26, 2018 at 04:32:41PM -0500, Bjorn Helgaas wrote:
> > On Wed, Sep 26, 2018 at 09:52:34AM +0530, Suganath Prabu S wrote:
> > > @@ -6853,6 +6872,13 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
> > >  
> > >  	ioc->pending_io_count = 0;
> > >  
> > > +	if (!mpt3sas_base_pci_device_is_available(ioc)) {
> > > +		pr_err(MPT3SAS_FMT
> > > +		    "%s: pci error recovery reset or pci device unplug occurred\n",
> > > +		    ioc->name, __func__);
> > > +		return;
> > > +	}
> > > +
> > >  	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> > 
> > This is a good example of why I don't like pci_device_is_present(): it
> > is fundamentally racy and gives a false sense of security.  Here we
> > *think* we're making the code safer, but in fact we could have this
> > sequence:
> > 
> >   mpt3sas_base_pci_device_is_available()    # returns true
> >   # device is removed
> >   ioc_state = mpt3sas_base_get_iocstate()
> > 
> > In this case the readl() inside mpt3sas_base_get_iocstate() will
> > probably return 0xffffffff data, and we assume that's valid and
> > continue on our merry way, pretending that "ioc_state" makes sense
> > when it really doesn't.
> 
> The function does the following:
> 
> 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> 	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
> 		return;
> 
> where MPI2_IOC_STATE_MASK is 0xF0000000 and MPI2_IOC_STATE_OPERATIONAL
> is 0x20000000.  If the device is removed after the call to
> mpt3sas_base_pci_device_is_available(), the result of the bitwise "and"
> operation would be 0xF0000000, which is unequal to 0x20000000.
> Hence this looks safe.

I agree this particular case is technically safe, but figuring that
out requires an unreasonable amount of analysis.  And there's no hint
in the code that we need to be concerned about whether the readl()
returns valid data, so the need for the analysis won't even occur to
most readers.

I don't feel good about encouraging this style of adding an explicit
test for whether the device is available, followed by a completely
implicit test that accidentally happens to correctly handle a device
that was removed after the explicit test.

If we instead added a test for ~0 after the readl(), we would avoid
the race and give the reader a clue that *any* read from the device
can potentially fail without advance warning.

> I agree that pci_device_is_present() (and the pci_dev_is_disconnected()
> it calls) must be used judiciously, but here it seems to have been done
> correctly.
> 
> One thing to be aware of is that a return value of "true" from
> pci_dev_is_disconnected() is definitive and can be trusted.
> On the other hand a return value of "false" is more like a fuzzy
> "likely not disconnected, but can't give any guarantees".
> So the boolean return value is kind of the problem here.
> Boolean logic doesn't really fit these "definitive if true,
> not definitive if false" semantics.
> 
> However being able to get the definitive answer in the disconnected
> case is valuable:  pciehp is the only entity that can determine
> surprise removal authoritatively and unambiguously (albeit with
> a latency).  All the other tools that we have at our disposal don't
> have that quality:  E.g. checking the Vendor ID is ambiguous because
> it returns a valid value if a device was quickly replaced with another
> one.  Also, all ones may be returned in the case of an Uncorrectable
> Error, but the device may revert to valid responses if the error can
> be recovered.  (Please correct me if I'm wrong.)

I think everything you said above is true, but I'm not yet convinced
that it's being applied usefully in mpt3sas.

  bool pci_dev_is_disconnected(pdev)       # "true" is definitive
  {
    return test_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
  }

  bool pci_device_is_present(pdev)         # "false" is definitive
  {
    if (pci_dev_is_disconnected(pdev))
      return false;
    return pci_bus_read_dev_vendor_id(...);
  }

  mpt3sas_base_pci_device_is_available(ioc)  # "false" is definitive
  {
    return !ioc->pci_error_recovery && pci_device_is_present(ioc->pdev);
  }

  mpt3sas_wait_for_commands_to_complete(...)
  {
    ...
 +  if (!mpt3sas_base_pci_device_is_available(ioc))
 +    return;

    # In the definitive case, we returned above.  If we get here, we
    # think the device *might* be present.  The readl() inside
    # mpt3sas_base_get_iocstate() might fail (returning ~0 data).
    # It happens that (~0 & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL
    # so we will return below if the device was removed after the
    # check above.

    ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
    if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
      return;
    ...


I think this code is unreasonably complicated.  The multiple layers
and negations make it very difficult to figure out what's definitive.

I'm not sure how mpt3sas benefits from adding
mpt3sas_base_pci_device_is_available() here, other than the fact that
it avoids an MMIO read to the device in most cases.  I think it would
be simpler and better to make mpt3sas_base_get_iocstate() smarter
about handling ~0 values from the readl().

Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-09-27 18:47       ` Bjorn Helgaas
@ 2018-09-27 19:09         ` Lukas Wunner
  2018-10-01  6:57           ` Suganath Prabu Subramani
  0 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2018-09-27 19:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Suganath Prabu S, linux-scsi, linux-pci, andy.shevchenko,
	Sathya.Prakash, sreekanth.reddy, linux-kernel,
	Benjamin Herrenschmidt, Russell Currey, Sam Bobroff,
	Oliver O'Halloran

On Thu, Sep 27, 2018 at 01:47:46PM -0500, Bjorn Helgaas wrote:
>   mpt3sas_wait_for_commands_to_complete(...)
>   {
>     ...
>  +  if (!mpt3sas_base_pci_device_is_available(ioc))
>  +    return;
> 
>     # In the definitive case, we returned above.  If we get here, we
>     # think the device *might* be present.  The readl() inside
>     # mpt3sas_base_get_iocstate() might fail (returning ~0 data).
>     # It happens that (~0 & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL
>     # so we will return below if the device was removed after the
>     # check above.
> 
>     ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
>     if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
>       return;
>     ...
> 
> 
> I think this code is unreasonably complicated.  The multiple layers
> and negations make it very difficult to figure out what's definitive.

I agree there's room for improvement.


> I'm not sure how mpt3sas benefits from adding
> mpt3sas_base_pci_device_is_available() here, other than the fact that
> it avoids an MMIO read to the device in most cases.  I think it would
> be simpler and better to make mpt3sas_base_get_iocstate() smarter
> about handling ~0 values from the readl().

Avoiding an MMIO read when it's known to run into a Completion Timeout
buys about 17 ms.  If the function is executed many times (I don't know
if that's the case here, I'm referring to the general case), or if bailing
out of it early avoids significant amounts of further I/O, then checking
for disconnectedness makes sense.

The 17 ms number is from this talk:
https://www.youtube.com/watch?v=GJ6B0xzgvlM&t=832

It's the typical Completion Timeout on an Intel chip, but it can be up to
50 ms in the default setting or up to 64 s if so configured in the Device
Control 2 register (PCIe r4.0 sec 7.8.16).

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-09-27 19:09         ` Lukas Wunner
@ 2018-10-01  6:57           ` Suganath Prabu Subramani
  2018-10-01 20:40             ` Bjorn Helgaas
  0 siblings, 1 reply; 12+ messages in thread
From: Suganath Prabu Subramani @ 2018-10-01  6:57 UTC (permalink / raw)
  To: lukas
  Cc: helgaas, linux-scsi, linux-pci, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel, benh, ruscur, sbobroff, oohall

Hi Lucas and Bjorn,
Thanks for reviewing and providing useful comments.

On Fri, Sep 28, 2018 at 12:40 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Thu, Sep 27, 2018 at 01:47:46PM -0500, Bjorn Helgaas wrote:
> >   mpt3sas_wait_for_commands_to_complete(...)
> >   {
> >     ...
> >  +  if (!mpt3sas_base_pci_device_is_available(ioc))
> >  +    return;
> >
> >     # In the definitive case, we returned above.  If we get here, we
> >     # think the device *might* be present.  The readl() inside
> >     # mpt3sas_base_get_iocstate() might fail (returning ~0 data).
> >     # It happens that (~0 & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL
> >     # so we will return below if the device was removed after the
> >     # check above.
> >
> >     ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> >     if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
> >       return;
> >     ...
> >
> >
> > I think this code is unreasonably complicated.  The multiple layers
> > and negations make it very difficult to figure out what's definitive.
>
> I agree there's room for improvement.
>
>
> > I'm not sure how mpt3sas benefits from adding
> > mpt3sas_base_pci_device_is_available() here, other than the fact that
> > it avoids an MMIO read to the device in most cases.  I think it would
> > be simpler and better to make mpt3sas_base_get_iocstate() smarter
> > about handling ~0 values from the readl().
>
> Avoiding an MMIO read when it's known to run into a Completion Timeout
> buys about 17 ms.  If the function is executed many times (I don't know
> if that's the case here, I'm referring to the general case), or if bailing
> out of it early avoids significant amounts of further I/O, then checking
> for disconnectedness makes sense.
>
> The 17 ms number is from this talk:
> https://www.youtube.com/watch?v=GJ6B0xzgvlM&t=832
>
> It's the typical Completion Timeout on an Intel chip, but it can be up to
> 50 ms in the default setting or up to 64 s if so configured in the Device
> Control 2 register (PCIe r4.0 sec 7.8.16).
>

This function is called only during controller reset, system shutdown
and driver unload operations.
For this case we can remove mpt3sas_base_pci_device_is_available() check,
since we can make the device removal from readl in
mpt3sas_base_get_iocstate() as you suggested.
But we need mpt3sas_base_pci_device_is_available() in other places
like while submitting the
IO/TMs to the HBA firmware etc where driver is not checking for the
IOC state (through readl() operation)
 to gracefully handle the HBA hot-unplug operation.

> Thanks,
>
> Lukas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/6] mpt3sas: Introdude _scsih_get_shost_and_ioc.
       [not found]     ` <CA+RiK66toWULgXRa+nZVOEom70ntQW8PTO7N_8qFn2AvXB_8QA@mail.gmail.com>
@ 2018-10-01 13:40       ` Bjorn Helgaas
  2018-10-08  6:47         ` Suganath Prabu Subramani
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2018-10-01 13:40 UTC (permalink / raw)
  To: Suganath Prabu Subramani
  Cc: linux-scsi, linux-pci, lukas, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel

[+cc LKML]

On Mon, Oct 01, 2018 at 12:57:01PM +0530, Suganath Prabu Subramani wrote:
> On Thu, Sep 27, 2018 at 2:39 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Sep 26, 2018 at 09:52:36AM +0530, Suganath Prabu S wrote:
> > > The code for getting shost and IOC is redundant so
> > > moved that to function "scsih_get_shost_and_ioc".
> > > Also checks for NULL are added to IOC and shost.
> > >
> > > Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
> > > ---
> > >  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 98 ++++++++++++++++++++++++++++++------
> > >  1 file changed, 82 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > index 566a550..f6e92eb 100644
> > > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > @@ -9809,6 +9809,35 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc)
> > >  }
> > >
> > >  /**
> > > + * _scsih_get_shost_and_ioc - get shost and ioc
> > > + *                   and verify whether they are NULL or not
> > > + * @pdev: PCI device struct
> > > + * @shost: address of scsi host pointer
> > > + * @ioc: address of HBA adapter pointer
> > > + *
> > > + * Return zero if *shost and *ioc are not NULL otherwise return error number.
> > > + */
> > > +static int
> > > +_scsih_get_shost_and_ioc(struct pci_dev *pdev,
> > > +     struct Scsi_Host **shost, struct MPT3SAS_ADAPTER **ioc)
> > > +{
> > > +     *shost = pci_get_drvdata(pdev);
> > > +     if (*shost == NULL) {
> > > +             dev_err(&pdev->dev, "pdev's driver data is null\n");
> > > +             return -ENXIO;
> > > +     }
> > > +
> > > +     *ioc = shost_priv(*shost);
> > > +     if (*ioc == NULL) {
> > > +             dev_err(&pdev->dev, "shost's private data is null\n");
> > > +             return -ENXIO;
> >
> > I think it's better to omit NULL pointer checks like these because
> > there should not be a path where we can execute this code when these
> > pointers are NULL.
> >
> > If there *is* such a path, I think that's a serious bug and it's
> > better to oops when we try to dereference the NULL pointer.  If we
> > just return an error code, it's likely the bug will be ignored and
> > never fixed.
> >
> We have added the ioc and shost checks, because of kernel panic in
> below scenario.
> Have 3 HBA's in system and perform below operation.
> 1) Run “poweroff”.
> 2) Immediate hot unplug HBA.
> We have observed, kernel's shutdown process has removed all the 3
> HBA devices smoothly, and also user have unplugged the HBA device
> during this time. PCI subsystem's hot-plug thread is also trying to
> remove the hot plugged PCI device which is already removed/cleaned
> by the shutdown process. (Which is not expected for the already
> removed device) Due to this kernel panic is observed. And we are not
> sure whether it has to fixed from pciehp layer, so we added sanity
> checks in driver.

This is a great example.  scsih_remove() is the mpt3sas_driver.remove
method.  It sounds like it's getting called twice for the same HBA.
I think that's a PCI core defect and we should fix it there instead of
trying to work around it in every driver.

The trace below is from v4.4.55.  Can you reproduce the same problem
with a current tree, e.g., v4.19-rc?  There have been many changes
since v4.4 that may have fixed this problem.

> [ 1745.605510] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000a98
> [ 1745.606554] IP: [<ffffffffa054c750>] scsih_remove+0x20/0x2d0 [mpt3sas]
> [ 1745.607609] PGD 0
> [ 1745.608621] Oops: 0000 [#1] SMP
> [ 1745.622989] CPU: 0 PID: 668 Comm: kworker/0:2 Tainted: G
> O    4.4.55-1.el7.elrepo.x86_64 #1
> [ 1745.624800] Hardware name: PRO-MNU65930231
> PRO-NME69559126/BRD-PRO55212588, BIOS 0.51e 05/08/2017
> [ 1745.626673] Workqueue: pciehp-3 pciehp_power_thread
> [ 1745.628566] task: ffff881fe50dd880 ti: ffff881fe88e4000 task.ti:
> ffff881fe88e4000
> [ 1745.630530] RIP: 0010:[<ffffffffa054c750>]  [<ffffffffa054c750>]
> scsih_remove+0x20/0x2d0 [mpt3sas]
> [ 1745.632577] RSP: 0018:ffff881fe88e7c98  EFLAGS: 00010292
> [ 1745.634639] RAX: 0000000000000001 RBX: ffff881feef5c000 RCX: 0000000000000000
> [ 1745.636718] RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffff881feef5c000
> [ 1745.638832] RBP: ffff881fe88e7cc8 R08: 0000000000000000 R09: 0000000180220002
> [ 1745.640972] R10: 00000000eaf4a401 R11: ffffea007fabd280 R12: 0000000000000000
> [ 1745.643136] R13: ffffffffa0576020 R14: ffff881fe9af8240 R15: 0000000000000000
> [ 1745.645320] FS:  0000000000000000(0000) GS:ffff881ffde00000(0000)
> knlGS:0000000000000000
> [ 1745.647572] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1745.649833] CR2: 0000000000000a98 CR3: 0000001fe76df000 CR4: 00000000003406f0
> [ 1745.652147] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1745.654476] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1745.656825] Stack:
> [ 1745.659138]  ffff881fe88e7cc8 ffff881feef5c098 ffff881feef5c000
> ffffffffa0576020
> [ 1745.661562]  ffff881fe9af8240 0000000000000000 ffff881fe88e7cf0
> ffffffff8137f9d9
> [ 1745.663990]  ffff881feef5c098 ffffffffa0576088 ffff881feef5c000
> ffff881fe88e7d10
> [ 1745.666428] Call Trace:
> [ 1745.668830]  [<ffffffff8137f9d9>] pci_device_remove+0x39/0xc0
> [ 1745.671256]  [<ffffffff8147db36>] __device_release_driver+0x96/0x130
> [ 1745.673664]  [<ffffffff8147dbf3>] device_release_driver+0x23/0x30
> [ 1745.676071]  [<ffffffff8137851c>] pci_stop_bus_device+0x8c/0xa0
> [ 1745.678485]  [<ffffffff81378612>] pci_stop_and_remove_bus_device+0x12/0x20
> [ 1745.680909]  [<ffffffff81392eea>] pciehp_unconfigure_device+0xaa/0x1b0
> [ 1745.683331]  [<ffffffff813929e2>] pciehp_disable_slot+0x52/0xd0
> [ 1745.685767]  [<ffffffff81392aed>] pciehp_power_thread+0x8d/0xb0
> [ 1745.688210]  [<ffffffff8109728f>] process_one_work+0x14f/0x400
> [ 1745.690633]  [<ffffffff81097b04>] worker_thread+0x114/0x470
> [ 1745.693080]  [<ffffffff810979f0>] ? rescuer_thread+0x310/0x310
> [ 1745.695518]  [<ffffffff8109d648>] kthread+0xd8/0xf0
> [ 1745.697936]  [<ffffffff8109d570>] ? kthread_park+0x60/0x60
> [ 1745.700359]  [<ffffffff816f854f>] ret_from_fork+0x3f/0x70
> [ 1745.702747]  [<ffffffff8109d570>] ? kthread_park+0x60/0x60

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-10-01  6:57           ` Suganath Prabu Subramani
@ 2018-10-01 20:40             ` Bjorn Helgaas
  2018-10-02 14:04               ` Bjorn Helgaas
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2018-10-01 20:40 UTC (permalink / raw)
  To: Suganath Prabu Subramani
  Cc: lukas, linux-scsi, linux-pci, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel, benh, ruscur, sbobroff, oohall

On Mon, Oct 01, 2018 at 12:27:28PM +0530, Suganath Prabu Subramani wrote:
> On Fri, Sep 28, 2018 at 12:40 AM Lukas Wunner <lukas@wunner.de> wrote:
> > On Thu, Sep 27, 2018 at 01:47:46PM -0500, Bjorn Helgaas wrote:

> > > I'm not sure how mpt3sas benefits from adding
> > > mpt3sas_base_pci_device_is_available() here, other than the fact that
> > > it avoids an MMIO read to the device in most cases.  I think it would
> > > be simpler and better to make mpt3sas_base_get_iocstate() smarter
> > > about handling ~0 values from the readl().
> >
> > Avoiding an MMIO read when it's known to run into a Completion Timeout
> > buys about 17 ms.  If the function is executed many times (I don't know
> > if that's the case here, I'm referring to the general case), or if bailing
> > out of it early avoids significant amounts of further I/O, then checking
> > for disconnectedness makes sense.

I agree that if we know the device is gone, it's helpful to avoid
further I/O to it.  The misleading pattern used in this patch is:

  R1) Check for device presence
  R2) Read from device
  R3) Consume data from device

That promotes the misconception that "the device is present, so we got
valid data from it."  But in fact the device may disappear between R1
and R2, so the following pattern is better:

  A) Read from device
  B) Check data for validity, e.g., look for ~0
  C) Consume data if valid

If we're writing to the device, we don't expect any response, and it
makes sense to do:

  W1) Check for device presence
  W2) If device present, write to device

Obviously the device can disappear between W1 and W2, so this can't
eliminate *all* writes to non-existent devices, but if we want to
reduce them and we're only doing writes to the device (with no reads),
this is the best we can do.

We can learn that the device is gone in several ways: pciehp might
notice the link is down, the driver might notice a ~0 return, etc.

The driver writer knows the structure of communication with the
device, so he/she should know the appropriate places to check for
valid data from reads and where to check to minimize the number of
writes to a non-existent device.

> This function is called only during controller reset, system shutdown
> and driver unload operations.
> For this case we can remove mpt3sas_base_pci_device_is_available() check,
> since we can make the device removal from readl in
> mpt3sas_base_get_iocstate() as you suggested.

> But we need mpt3sas_base_pci_device_is_available() in other places
> like while submitting the
> IO/TMs to the HBA firmware etc where driver is not checking for the
> IOC state (through readl() operation)
> to gracefully handle the HBA hot-unplug operation.

This is the W1/W2 case above, and what you can do is constrained by
the device model, i.e., where it requires reads from the device.  I
agree you may want to check whether the device is absent to minimize
writes after a device has been removed.

I think the names "pci_device_is_present()" and
"mpt3sas_base_pci_device_is_available()" contribute to the problem
because they make promises that can't be kept -- all we can say is
that the device *was* present, but we know whether it is *still*
present.  I think it would be better if the interfaces were something
like "pci_device_is_absent()" because that gives a result we can rely
on.  If that returns true, we know the device is definitely gone.

Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-10-01 20:40             ` Bjorn Helgaas
@ 2018-10-02 14:04               ` Bjorn Helgaas
  2018-10-08  6:44                 ` Suganath Prabu Subramani
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2018-10-02 14:04 UTC (permalink / raw)
  To: Suganath Prabu Subramani
  Cc: lukas, linux-scsi, linux-pci, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel, benh, ruscur, sbobroff, oohall

On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> I think the names "pci_device_is_present()" and
> "mpt3sas_base_pci_device_is_available()" contribute to the problem
> because they make promises that can't be kept -- all we can say is
> that the device *was* present, but we know whether it is *still*
> present.

Oops, I meant "we DON'T know whether it is still present."

> I think it would be better if the interfaces were something
> like "pci_device_is_absent()" because that gives a result we can rely
> on.  If that returns true, we know the device is definitely gone.
> 
> Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-10-02 14:04               ` Bjorn Helgaas
@ 2018-10-08  6:44                 ` Suganath Prabu Subramani
  2018-10-12  5:47                   ` Sreekanth Reddy
  2018-10-12 23:43                   ` Bjorn Helgaas
  0 siblings, 2 replies; 12+ messages in thread
From: Suganath Prabu Subramani @ 2018-10-08  6:44 UTC (permalink / raw)
  To: helgaas
  Cc: lukas, linux-scsi, linux-pci, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel, benh, ruscur, sbobroff, Oliver

On Tue, Oct 2, 2018 at 7:34 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> > I think the names "pci_device_is_present()" and
> > "mpt3sas_base_pci_device_is_available()" contribute to the problem
> > because they make promises that can't be kept -- all we can say is
> > that the device *was* present, but we know whether it is *still*
> > present.

Bjorn,

In the patch we are using '!' (i.e. not operation) of pci_device_is_present(),
which is logically same as pci_device_is absent, and it is
same for mpt3sas_base_pci_device_is_available().
My understanding is that, you want us to rename these functions for
better readability
Is that correct ?


> Oops, I meant "we DON'T know whether it is still present."
>
> > I think it would be better if the interfaces were something
> > like "pci_device_is_absent()" because that gives a result we can rely
> > on.  If that returns true, we know the device is definitely gone.
> >
> > Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/6] mpt3sas: Introdude _scsih_get_shost_and_ioc.
  2018-10-01 13:40       ` [PATCH v4 3/6] mpt3sas: Introdude _scsih_get_shost_and_ioc Bjorn Helgaas
@ 2018-10-08  6:47         ` Suganath Prabu Subramani
  0 siblings, 0 replies; 12+ messages in thread
From: Suganath Prabu Subramani @ 2018-10-08  6:47 UTC (permalink / raw)
  To: helgaas
  Cc: linux-scsi, linux-pci, lukas, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel

Hi Bjorn,
We tried to recreate the issue with latest kernel multiple
times.(Without including this patch,)
but we were not able to recreate it. On 4.4 kernel we hit this issue
time long back.
So we are not sure that this issue is really fixed with latest kernel or not.
We prefer to keep this patch as this only adds sanity check.
We are also okay, if you suggest us to completely remove this patch.

Thanks,
Suganath Prabu
On Mon, Oct 1, 2018 at 7:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc LKML]
>
> On Mon, Oct 01, 2018 at 12:57:01PM +0530, Suganath Prabu Subramani wrote:
> > On Thu, Sep 27, 2018 at 2:39 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Wed, Sep 26, 2018 at 09:52:36AM +0530, Suganath Prabu S wrote:
> > > > The code for getting shost and IOC is redundant so
> > > > moved that to function "scsih_get_shost_and_ioc".
> > > > Also checks for NULL are added to IOC and shost.
> > > >
> > > > Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
> > > > ---
> > > >  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 98 ++++++++++++++++++++++++++++++------
> > > >  1 file changed, 82 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > > index 566a550..f6e92eb 100644
> > > > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > > @@ -9809,6 +9809,35 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc)
> > > >  }
> > > >
> > > >  /**
> > > > + * _scsih_get_shost_and_ioc - get shost and ioc
> > > > + *                   and verify whether they are NULL or not
> > > > + * @pdev: PCI device struct
> > > > + * @shost: address of scsi host pointer
> > > > + * @ioc: address of HBA adapter pointer
> > > > + *
> > > > + * Return zero if *shost and *ioc are not NULL otherwise return error number.
> > > > + */
> > > > +static int
> > > > +_scsih_get_shost_and_ioc(struct pci_dev *pdev,
> > > > +     struct Scsi_Host **shost, struct MPT3SAS_ADAPTER **ioc)
> > > > +{
> > > > +     *shost = pci_get_drvdata(pdev);
> > > > +     if (*shost == NULL) {
> > > > +             dev_err(&pdev->dev, "pdev's driver data is null\n");
> > > > +             return -ENXIO;
> > > > +     }
> > > > +
> > > > +     *ioc = shost_priv(*shost);
> > > > +     if (*ioc == NULL) {
> > > > +             dev_err(&pdev->dev, "shost's private data is null\n");
> > > > +             return -ENXIO;
> > >
> > > I think it's better to omit NULL pointer checks like these because
> > > there should not be a path where we can execute this code when these
> > > pointers are NULL.
> > >
> > > If there *is* such a path, I think that's a serious bug and it's
> > > better to oops when we try to dereference the NULL pointer.  If we
> > > just return an error code, it's likely the bug will be ignored and
> > > never fixed.
> > >
> > We have added the ioc and shost checks, because of kernel panic in
> > below scenario.
> > Have 3 HBA's in system and perform below operation.
> > 1) Run “poweroff”.
> > 2) Immediate hot unplug HBA.
> > We have observed, kernel's shutdown process has removed all the 3
> > HBA devices smoothly, and also user have unplugged the HBA device
> > during this time. PCI subsystem's hot-plug thread is also trying to
> > remove the hot plugged PCI device which is already removed/cleaned
> > by the shutdown process. (Which is not expected for the already
> > removed device) Due to this kernel panic is observed. And we are not
> > sure whether it has to fixed from pciehp layer, so we added sanity
> > checks in driver.
>
> This is a great example.  scsih_remove() is the mpt3sas_driver.remove
> method.  It sounds like it's getting called twice for the same HBA.
> I think that's a PCI core defect and we should fix it there instead of
> trying to work around it in every driver.
>
> The trace below is from v4.4.55.  Can you reproduce the same problem
> with a current tree, e.g., v4.19-rc?  There have been many changes
> since v4.4 that may have fixed this problem.
>
> > [ 1745.605510] BUG: unable to handle kernel NULL pointer dereference
> > at 0000000000000a98
> > [ 1745.606554] IP: [<ffffffffa054c750>] scsih_remove+0x20/0x2d0 [mpt3sas]
> > [ 1745.607609] PGD 0
> > [ 1745.608621] Oops: 0000 [#1] SMP
> > [ 1745.622989] CPU: 0 PID: 668 Comm: kworker/0:2 Tainted: G
> > O    4.4.55-1.el7.elrepo.x86_64 #1
> > [ 1745.624800] Hardware name: PRO-MNU65930231
> > PRO-NME69559126/BRD-PRO55212588, BIOS 0.51e 05/08/2017
> > [ 1745.626673] Workqueue: pciehp-3 pciehp_power_thread
> > [ 1745.628566] task: ffff881fe50dd880 ti: ffff881fe88e4000 task.ti:
> > ffff881fe88e4000
> > [ 1745.630530] RIP: 0010:[<ffffffffa054c750>]  [<ffffffffa054c750>]
> > scsih_remove+0x20/0x2d0 [mpt3sas]
> > [ 1745.632577] RSP: 0018:ffff881fe88e7c98  EFLAGS: 00010292
> > [ 1745.634639] RAX: 0000000000000001 RBX: ffff881feef5c000 RCX: 0000000000000000
> > [ 1745.636718] RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffff881feef5c000
> > [ 1745.638832] RBP: ffff881fe88e7cc8 R08: 0000000000000000 R09: 0000000180220002
> > [ 1745.640972] R10: 00000000eaf4a401 R11: ffffea007fabd280 R12: 0000000000000000
> > [ 1745.643136] R13: ffffffffa0576020 R14: ffff881fe9af8240 R15: 0000000000000000
> > [ 1745.645320] FS:  0000000000000000(0000) GS:ffff881ffde00000(0000)
> > knlGS:0000000000000000
> > [ 1745.647572] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 1745.649833] CR2: 0000000000000a98 CR3: 0000001fe76df000 CR4: 00000000003406f0
> > [ 1745.652147] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 1745.654476] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 1745.656825] Stack:
> > [ 1745.659138]  ffff881fe88e7cc8 ffff881feef5c098 ffff881feef5c000
> > ffffffffa0576020
> > [ 1745.661562]  ffff881fe9af8240 0000000000000000 ffff881fe88e7cf0
> > ffffffff8137f9d9
> > [ 1745.663990]  ffff881feef5c098 ffffffffa0576088 ffff881feef5c000
> > ffff881fe88e7d10
> > [ 1745.666428] Call Trace:
> > [ 1745.668830]  [<ffffffff8137f9d9>] pci_device_remove+0x39/0xc0
> > [ 1745.671256]  [<ffffffff8147db36>] __device_release_driver+0x96/0x130
> > [ 1745.673664]  [<ffffffff8147dbf3>] device_release_driver+0x23/0x30
> > [ 1745.676071]  [<ffffffff8137851c>] pci_stop_bus_device+0x8c/0xa0
> > [ 1745.678485]  [<ffffffff81378612>] pci_stop_and_remove_bus_device+0x12/0x20
> > [ 1745.680909]  [<ffffffff81392eea>] pciehp_unconfigure_device+0xaa/0x1b0
> > [ 1745.683331]  [<ffffffff813929e2>] pciehp_disable_slot+0x52/0xd0
> > [ 1745.685767]  [<ffffffff81392aed>] pciehp_power_thread+0x8d/0xb0
> > [ 1745.688210]  [<ffffffff8109728f>] process_one_work+0x14f/0x400
> > [ 1745.690633]  [<ffffffff81097b04>] worker_thread+0x114/0x470
> > [ 1745.693080]  [<ffffffff810979f0>] ? rescuer_thread+0x310/0x310
> > [ 1745.695518]  [<ffffffff8109d648>] kthread+0xd8/0xf0
> > [ 1745.697936]  [<ffffffff8109d570>] ? kthread_park+0x60/0x60
> > [ 1745.700359]  [<ffffffff816f854f>] ret_from_fork+0x3f/0x70
> > [ 1745.702747]  [<ffffffff8109d570>] ? kthread_park+0x60/0x60

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-10-08  6:44                 ` Suganath Prabu Subramani
@ 2018-10-12  5:47                   ` Sreekanth Reddy
  2018-10-12 23:43                   ` Bjorn Helgaas
  1 sibling, 0 replies; 12+ messages in thread
From: Sreekanth Reddy @ 2018-10-12  5:47 UTC (permalink / raw)
  To: Suganath Prabu Subramani, helgaas
  Cc: lukas, linux-scsi, linux-pci, Andy Shevchenko,
	Sathya Prakash Veerichetty, linux-kernel, benh, ruscur, sbobroff,
	Oliver

HI Bjorn,

Please provide your valuable suggestion/reply here.

Thank you,
Sreekanth

-----Original Message-----
From: Suganath Prabu Subramani
[mailto:suganath-prabu.subramani@broadcom.com]
Sent: Monday, October 8, 2018 12:15 PM
To: helgaas@kernel.org
Cc: lukas@wunner.de; linux-scsi@vger.kernel.org; linux-pci@vger.kernel.org;
Andy Shevchenko; Sathya Prakash; Sreekanth Reddy;
linux-kernel@vger.kernel.org; benh@kernel.crashing.org; ruscur@russell.cc;
sbobroff@linux.ibm.com; Oliver
Subject: Re: [PATCH v4 1/6] mpt3sas: Introduce
mpt3sas_base_pci_device_is_available

On Tue, Oct 2, 2018 at 7:34 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> > I think the names "pci_device_is_present()" and
> > "mpt3sas_base_pci_device_is_available()" contribute to the problem
> > because they make promises that can't be kept -- all we can say is
> > that the device *was* present, but we know whether it is *still*
> > present.

Bjorn,

In the patch we are using '!' (i.e. not operation) of
pci_device_is_present(),
which is logically same as pci_device_is absent, and it is
same for mpt3sas_base_pci_device_is_available().
My understanding is that, you want us to rename these functions for
better readability
Is that correct ?


> Oops, I meant "we DON'T know whether it is still present."
>
> > I think it would be better if the interfaces were something
> > like "pci_device_is_absent()" because that gives a result we can rely
> > on.  If that returns true, we know the device is definitely gone.
> >
> > Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available
  2018-10-08  6:44                 ` Suganath Prabu Subramani
  2018-10-12  5:47                   ` Sreekanth Reddy
@ 2018-10-12 23:43                   ` Bjorn Helgaas
  1 sibling, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2018-10-12 23:43 UTC (permalink / raw)
  To: Suganath Prabu Subramani
  Cc: lukas, linux-scsi, linux-pci, Andy Shevchenko, Sathya Prakash,
	Sreekanth Reddy, linux-kernel, benh, ruscur, sbobroff, Oliver

On Mon, Oct 08, 2018 at 12:14:40PM +0530, Suganath Prabu Subramani wrote:
> On Tue, Oct 2, 2018 at 7:34 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> > > I think the names "pci_device_is_present()" and
> > > "mpt3sas_base_pci_device_is_available()" contribute to the
> > > problem because they make promises that can't be kept -- all we
> > > can say is that the device *was* present, but we don't know
> > > whether it is *still* present.
> 
> In the patch we are using '!' (i.e. not operation) of
> pci_device_is_present(), which is logically same as pci_device_is
> absent, and it is same for mpt3sas_base_pci_device_is_available().
>
> My understanding is that, you want us to rename these functions for
> better readability.  Is that correct ?

I think "pci_device_is_present()" is a poor name, but I'm not
suggesting you fix it in this patch.  Changing that would be a PCI
core change that would also touch the tg3, igb, nvme, and now mpt3sas
drivers that use it.

My personal opinion is that you should do something like the patch
below.  The main point is that you should for the device being
unreachable *at the places where you might learn that*.

If you WRITE to a device that has been removed, the write will mostly
likely get dropped and you won't learn anything.  But when you READ
from a device that has been removed, you'll most likely get ~0 data
back, and you can check for that.

Of course, it's also possible that pci_device_is_present() can tell
you something.  So you *could* sprinkle those all over, as in your
patch.  But you end up with code like this:

  if (!pci_device_is_present())
    return;

  writel();
  readl();

Here, the device might be removed right after pci_device_is_present()
returns true.  You do the writel() and the readl() anyway, so you
really haven't gained anything.  And if the readl() returned ~0, you
assume it's valid data when it's not.

It's better to do this:

  writel();
  val = readl();
  if (val == ~0) {
    /* device is unreachable, clean things up */
    ...
  }

(Obviously it's possible that ~0 is a valid value for whatever you
read from the device.  That depends on the device and you have to use
your knowledge of the hardware.  If you read ~0 from a register where
that might be a valid value, you can read from a different register
for with ~0 is *not* a valid value.)

I don't claim the patch below is complete because I don't know
anything about your hardware and what sort of registers or ring
buffers it uses.  You still have to arrange to flush or complete
whatever is outstanding when you learn the device is gone.

Bjorn


diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 59d7844ee022..37b09362b31a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -6145,6 +6145,11 @@ _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
 		drsprintk(ioc, pr_info(MPT3SAS_FMT
 			"wrote magic sequence: count(%d), host_diagnostic(0x%08x)\n",
 		    ioc->name, count, host_diagnostic));
+		if (host_diagnostic == ~0) {
+			ioc->remove_host = 1;
+			pr_err(MPT3SAS_FMT "HBA unreachable\n", ioc->name);
+			goto out;
+		}
 
 	} while ((host_diagnostic & MPI2_DIAG_DIAG_WRITE_ENABLE) == 0);
 
@@ -6237,6 +6242,11 @@ _base_make_ioc_ready(struct MPT3SAS_ADAPTER *ioc, enum reset_type type)
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
 	dhsprintk(ioc, pr_info(MPT3SAS_FMT "%s: ioc_state(0x%08x)\n",
 	    ioc->name, __func__, ioc_state));
+	if (ioc_state == ~0) {
+		pr_err(MPT3SAS_FMT "%s: HBA unreachable (ioc_state=0x%x)\n",
+			ioc->name, __func__, ioc_state);
+		return -EFAULT;
+	}
 
 	/* if in RESET state, it should move to READY state shortly */
 	count = 0;
@@ -6251,6 +6261,11 @@ _base_make_ioc_ready(struct MPT3SAS_ADAPTER *ioc, enum reset_type type)
 			}
 			ssleep(1);
 			ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+			if (ioc_state == ~0) {
+				pr_err(MPT3SAS_FMT "%s: HBA unreachable (ioc_state=0x%x)\n",
+					ioc->name, __func__, ioc_state);
+				return -EFAULT;
+			}
 		}
 	}
 
@@ -6854,6 +6869,11 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
 	ioc->pending_io_count = 0;
 
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+	if (ioc_state == ~0) {
+		pr_err(MPT3SAS_FMT "%s: HBA unreachable\n",
+		       ioc->name, __func__);
+		return;
+	}
 	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
 		return;
 
@@ -6909,6 +6929,14 @@ mpt3sas_base_hard_reset_handler(struct MPT3SAS_ADAPTER *ioc,
 	    MPT3_DIAG_BUFFER_IS_RELEASED))) {
 		is_trigger = 1;
 		ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+		if (ioc_state == ~0) {
+			pr_err(MPT3SAS_FMT "%s: HBA unreachable\n",
+			       ioc->name, __func__);
+			ioc->schedule_dead_ioc_flush_running_cmds(ioc);
+			r = 0;
+			mutex_unlock(&ioc->reset_in_progress_mutex);
+			goto out_unlocked;
+		}
 		if ((ioc_state & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_FAULT)
 			is_fault = 1;
 	}
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 53133cfd420f..d0d4c8d94a10 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2653,6 +2653,11 @@ mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, u64 lun,
 	}
 
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+	if (ioc_state == ~0) {
+		pr_info(MPT3SAS_FMT "%s: HBA unreachable\n",
+                    __func__, ioc->name);
+		return FAILED;
+	}
 	if (ioc_state & MPI2_DOORBELL_USED) {
 		dhsprintk(ioc, pr_info(MPT3SAS_FMT
 			"unexpected doorbell active!\n", ioc->name));

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-10-12 23:43 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1537935759-14754-1-git-send-email-suganath-prabu.subramani@broadcom.com>
     [not found] ` <1537935759-14754-2-git-send-email-suganath-prabu.subramani@broadcom.com>
2018-09-26 21:32   ` [PATCH v4 1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available Bjorn Helgaas
2018-09-27  7:03     ` Lukas Wunner
2018-09-27 18:47       ` Bjorn Helgaas
2018-09-27 19:09         ` Lukas Wunner
2018-10-01  6:57           ` Suganath Prabu Subramani
2018-10-01 20:40             ` Bjorn Helgaas
2018-10-02 14:04               ` Bjorn Helgaas
2018-10-08  6:44                 ` Suganath Prabu Subramani
2018-10-12  5:47                   ` Sreekanth Reddy
2018-10-12 23:43                   ` Bjorn Helgaas
     [not found] ` <1537935759-14754-4-git-send-email-suganath-prabu.subramani@broadcom.com>
     [not found]   ` <20180926210918.GH28024@bhelgaas-glaptop.roam.corp.google.com>
     [not found]     ` <CA+RiK66toWULgXRa+nZVOEom70ntQW8PTO7N_8qFn2AvXB_8QA@mail.gmail.com>
2018-10-01 13:40       ` [PATCH v4 3/6] mpt3sas: Introdude _scsih_get_shost_and_ioc Bjorn Helgaas
2018-10-08  6:47         ` Suganath Prabu Subramani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).