From: Bjorn Helgaas <helgaas@kernel.org>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Russell Currey <ruscur@russell.cc>,
	Oliver O'Halloran <oohall@gmail.com>,
	Mika Westerberg <mika.westerberg@linux.intel.com>,
	Lalithambika Krishnakumar <lalithambika.krishnakumar@intel.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Joerg Roedel <jroedel@suse.de>,
	Alex Williamson <alex.williamson@redhat.com>,
	"open list:PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC" 
	<linuxppc-dev@lists.ozlabs.org>,
	"open list:PCI SUBSYSTEM" <linux-pci@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] PCI/AER: Disable AER interrupt during suspend
Date: Thu, 22 Jul 2021 17:23:51 -0500
Message-ID: <20210722222351.GA354095@bjorn-Precision-5520>
In-Reply-To: <CAAd53p6VN0ejKHcTRgj8mZ_iApR=KogpVZ-HkvdoZbJ=Yue98g@mail.gmail.com>

On Fri, Feb 05, 2021 at 11:17:32PM +0800, Kai-Heng Feng wrote:
> On Fri, Feb 5, 2021 at 7:28 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > [+cc Alex]
> >
> > On Thu, Jan 28, 2021 at 12:09:37PM +0800, Kai-Heng Feng wrote:
> > > On Thu, Jan 28, 2021 at 4:51 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Thu, Jan 28, 2021 at 01:31:00AM +0800, Kai-Heng Feng wrote:
> > > > > Commit 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in
> > > > > hint") enables ACS, and some platforms lose their NVMe after resume from
> > > > > firmware:
> > > > > [   50.947816] pcieport 0000:00:1b.0: DPC: containment event, status:0x1f01 source:0x0000
> > > > > [   50.947817] pcieport 0000:00:1b.0: DPC: unmasked uncorrectable error detected
> > > > > [   50.947829] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> > > > > [   50.947830] pcieport 0000:00:1b.0:   device [8086:06ac] error status/mask=00200000/00010000
> > > > > [   50.947831] pcieport 0000:00:1b.0:    [21] ACSViol                (First)
> > > > > [   50.947841] pcieport 0000:00:1b.0: AER: broadcast error_detected message
> > > > > [   50.947843] nvme nvme0: frozen state error detected, reset controller
> > > > >
> > > > > It happens right after ACS gets enabled during resume.
> > > > >
> > > > > To prevent that from happening, disable AER interrupt and enable it on
> > > > > system suspend and resume, respectively.
> > > >
> > > > Lots of questions here.  Maybe this is what we'll end up doing, but I
> > > > am curious about why the error is reported in the first place.
> > > >
> > > > Is this a consequence of the link going down and back up?
> > >
> > > Could be. From the observations, it only happens when firmware suspend
> > > (S3) is used.
> > > Maybe it happens when it gets powered up, but I don't have equipment
> > > to debug at hardware level.
> > >
> > > If we use non-firmware suspend method, enabling ACS after resume won't
> > > trip AER and DPC.
> > >
> > > > Is it consequence of the device doing a DMA when it shouldn't?
> > >
> > > If it's doing DMA while suspending, the same error should also happen
> > > after NVMe is suspended and before PCIe port suspending.
> > > Furthermore, if non-firmware suspend method is used, there's no such
> > > issue, so less likely to be any DMA operation.
> > >
> > > > Are we doing something in the wrong order during suspend?  Or maybe
> > > > resume, since I assume the error is reported during resume?
> > >
> > > Yes the error is reported during resume. The suspend/resume order
> > > seems fine as non-firmware suspend doesn't have this issue.
> >
> > I really feel like we need a better understanding of what's going on
> > here.  Disabling the AER interrupt is like closing our eyes and
> > pretending that because we don't see it, it didn't happen.
> >
> > An ACS error is triggered by a DMA, right?  I'm assuming an MMIO
> > access from the CPU wouldn't trigger this error.  And it sounds like
> > the error is triggered before we even start running the driver after
> > resume.
> >
> > If we're powering up an NVMe device from D3cold and it DMAs before the
> > driver touches it, something would be seriously broken.  I doubt
> > that's what's happening.  Maybe a device could resume some previously
> > programmed DMA after powering up from D3hot.
> 
> I am not that familiar with PCIe ACS/AER/DPC, so I can't really answer
> questions you raised.
> That the PCIe spec doesn't specify the suspend/resume order is also not
> helping here.
> 
> However, I really think it's a system firmware issue.
> I've seen some suspend-to-idle platforms with NVMe can reach D3cold,
> those are unaffected.

Marking both of these as "not applicable" for now because I don't
think we really understand what's going on.

Apparently a DMA occurs during suspend or resume and triggers an ACS
violation.  I don't think such a DMA should occur in the first
place.

Or maybe, since you say the problem happens right after ACS is enabled
during resume, we're doing the ACS enable incorrectly?  Although I
would think we should not be doing DMA at the same time we're enabling
ACS, either.

If this really is a system firmware issue, both HP and Dell should
have the knowledge and equipment to figure out what's going on.

> > Or maybe the error occurred on suspend, like if the device wasn't
> > quiesced or something, but we didn't notice it until resume?  The
> > AER error status bits are RW1CS, which means they can be preserved
> > across hot/warm/cold resets.
> >
> > Can you instrument the code to see whether the AER error status bit is
> > set before enabling ACS?  I'm not sure that merely enabling ACS (I
> > assume you mean pci_std_enable_acs(), where we write PCI_ACS_CTRL)
> > should cause an interrupt for a previously-logged error.  I suspect
> > that could happen when enabling *AER*, but I wouldn't think it would
> > happen when enabling *ACS*.
> 
> Diff to print AER status:
> https://bugzilla.kernel.org/show_bug.cgi?id=209149#c11
> 
> And dmesg:
> https://bugzilla.kernel.org/show_bug.cgi?id=209149#c12
> 
> Looks like the read before suspend and after resume are both fine.
> 
> >
> > Does this error happen on multiple machines from different vendors?
> > Wondering if it could be a BIOS issue, e.g., BIOS not cleaning up
> > after it did something to cause an error.
> 
> AFAIK, systems from both HP and Dell are affected.
> I was told that the reference platform from Intel is using
> suspend-to-idle, but vendors changed the sleep method to S3 to have
> lower power consumption to pass regulation.
> 
> Kai-Heng
> 
> >
> > > > If we *do* take the error, why doesn't DPC recovery work?
> > >
> > > It works for the root port, but not for the NVMe drive:
> > > > [   50.947816] pcieport 0000:00:1b.0: DPC: containment event, status:0x1f01 source:0x0000
> > > > [   50.947817] pcieport 0000:00:1b.0: DPC: unmasked uncorrectable error detected
> > > > [   50.947829] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> > > > [   50.947830] pcieport 0000:00:1b.0:   device [8086:06ac] error status/mask=00200000/00010000
> > > [   50.947831] pcieport 0000:00:1b.0:    [21] ACSViol                (First)
> > > [   50.947841] pcieport 0000:00:1b.0: AER: broadcast error_detected message
> > > [   50.947843] nvme nvme0: frozen state error detected, reset controller
> > > [   50.948400] ACPI: EC: event unblocked
> > > [   50.948432] xhci_hcd 0000:00:14.0: PME# disabled
> > > [   50.948444] xhci_hcd 0000:00:14.0: enabling bus mastering
> > > [   50.949056] pcieport 0000:00:1b.0: PME# disabled
> > > [   50.949068] pcieport 0000:00:1c.0: PME# disabled
> > > [   50.949416] e1000e 0000:00:1f.6: PME# disabled
> > > [   50.949463] e1000e 0000:00:1f.6: enabling bus mastering
> > > [   50.951606] sd 0:0:0:0: [sda] Starting disk
> > > > [   50.951610] nvme 0000:01:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> > > [   50.951730] nvme nvme0: Removing after probe failure status: -19
> > > [   50.952360] nvme nvme0: failed to set APST feature (-19)
> > > [   50.971136] snd_hda_intel 0000:00:1f.3: PME# disabled
> > > [   51.089330] pcieport 0000:00:1b.0: AER: broadcast resume message
> > > [   51.089345] pcieport 0000:00:1b.0: AER: device recovery successful
> > >
> > > But I think why recovery doesn't work for NVMe is for another discussion...
> > >
> > > Kai-Heng
> > >
> > > >
> > > > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209149
> > > > > Fixes: 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint")
> > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > ---
> > > > >  drivers/pci/pcie/aer.c | 18 ++++++++++++++++++
> > > > >  1 file changed, 18 insertions(+)
> > > > >
> > > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > > > index 77b0f2c45bc0..0e9a85530ae6 100644
> > > > > --- a/drivers/pci/pcie/aer.c
> > > > > +++ b/drivers/pci/pcie/aer.c
> > > > > @@ -1365,6 +1365,22 @@ static int aer_probe(struct pcie_device *dev)
> > > > >       return 0;
> > > > >  }
> > > > >
> > > > > +static int aer_suspend(struct pcie_device *dev)
> > > > > +{
> > > > > +     struct aer_rpc *rpc = get_service_data(dev);
> > > > > +
> > > > > +     aer_disable_rootport(rpc);
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +static int aer_resume(struct pcie_device *dev)
> > > > > +{
> > > > > +     struct aer_rpc *rpc = get_service_data(dev);
> > > > > +
> > > > > +     aer_enable_rootport(rpc);
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> > > > >   * @dev: pointer to Root Port, RCEC, or RCiEP
> > > > > @@ -1437,6 +1453,8 @@ static struct pcie_port_service_driver aerdriver = {
> > > > >       .service        = PCIE_PORT_SERVICE_AER,
> > > > >
> > > > >       .probe          = aer_probe,
> > > > > +     .suspend        = aer_suspend,
> > > > > +     .resume         = aer_resume,
> > > > >       .remove         = aer_remove,
> > > > >  };
> > > > >
> > > > > --
> > > > > 2.29.2
> > > > >

Thread overview: 9+ messages
2021-01-27 17:31 [PATCH 1/2] PCI/AER: Disable AER interrupt during suspend Kai-Heng Feng
2021-01-27 17:31 ` [PATCH 2/2] PCI/DPC: Disable DPC " Kai-Heng Feng
2021-01-27 20:50 ` [PATCH 1/2] PCI/AER: Disable AER " Bjorn Helgaas
2021-01-28  4:09   ` Kai-Heng Feng
2021-02-04 23:27     ` Bjorn Helgaas
2021-02-05 15:17       ` Kai-Heng Feng
2021-07-22 22:23         ` Bjorn Helgaas [this message]
2021-07-23  5:24           ` Christoph Hellwig
2021-07-23  7:05             ` Kai-Heng Feng
