linux-kernel.vger.kernel.org archive mirror
From: <Alex_Gagniuc@Dellteam.com>
To: <oohall@gmail.com>, <gregkh@linuxfoundation.org>
Cc: <keith.busch@intel.com>, <helgaas@kernel.org>,
	<mr.nuke.me@gmail.com>, <linux-pci@vger.kernel.org>,
	<Austin.Bolen@dell.com>, <Shyam.Iyer@dell.com>,
	<linux-kernel@vger.kernel.org>, <jonathan.derrick@intel.com>,
	<lukas@wunner.de>, <ruscur@russell.cc>, <sbobroff@linux.ibm.com>,
	<linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected
Date: Mon, 12 Nov 2018 20:05:41 +0000
Message-ID: <df85813c9860463d85f6c302dfe07b12@ausx13mps321.AMER.DELL.COM>
In-Reply-To: 5da8d8aa9f3818af649b1ac547bc4e6062626ddf.camel@gmail.com

On 11/11/2018 11:50 PM, Oliver O'Halloran wrote:
> On Thu, 2018-11-08 at 23:06 +0000, Alex_Gagniuc@Dellteam.com wrote:
>> On 11/08/2018 04:51 PM, Greg KH wrote:
>>> On Thu, Nov 08, 2018 at 10:49:08PM +0000, Alex_Gagniuc@Dellteam.com wrote:
>>>> In the case that we're trying to fix, this code executing is a result of
>>>> the device being gone, so we can guarantee race-free operation. I agree
>>>> that there is a race, in the general case. As far as checking the result
>>>> for all F's, that's not an option when firmware crashes the system as a
>>>> result of the mmio read/write. It's never pretty when firmware gets
>>>> involved.
>>>
>>> If you have firmware that crashes the system when you try to read from a
>>> PCI device that was hot-removed, that is broken firmware and needs to be
>>> fixed.  The kernel can not work around that as again, you will never win
>>> that race.
>>
>> But it's not the firmware that crashes. It's Linux, as a result of a
>> fatal error message from the firmware. And we can't fix that because FFS
>> handling requires that the system reboots [1].
> 
> Do we know the exact circumstances that result in firmware requesting a
> reboot? If it happens on any PCIe error I don't see what we can do to
> prevent that beyond masking UEs entirely (are we even allowed to do
> that on FFS systems?).

Pull a drive out at an angle, push two drives in at the same time, pull
out a drive really slowly. Whether an error is even reported to the OS
depends on PD state and on proprietary mechanisms and logic in the HW
and FW. The OS is not supposed to mask errors (touch AER bits) on FFS.

Sadly, with FFS, behavior can and does change from BIOS version to BIOS 
version. On one product, for example, we eliminated a lot of crashes by 
simply not reporting some classes of PCIe errors to the OS.

Alex

>> If we're going to say that we don't want to support FFS because it's a
>> separate code path and a different flow, that's fine. I myself am not a
>> fan of FFS. But if we're going to continue supporting it, I think we'll
>> continue to have to resolve these sorts of unintended consequences.
>>
>> Alex
>>
>> [1] ACPI 6.2, 18.1 - Hardware Errors and Error Sources
> 
> 


