From: Keith Busch <kbusch@kernel.org>
To: Samuel Thibault <samuel.thibault@ens-lyon.org>,
Vidya Sagar <vidyas@nvidia.com>,
linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org
Subject: Re: Are AER corrected errors worrying?
Date: Wed, 6 Jan 2021 13:48:08 -0800
Message-ID: <20210106214808.GA1280721@dhcp-10-100-145-180.wdc.com>
In-Reply-To: <20210106202823.ehdkno3inlzszqtb@function>
On Wed, Jan 06, 2021 at 09:28:23PM +0100, Samuel Thibault wrote:
> On Mon, 04 Jan 2021 22:36:48 +0100, Samuel Thibault wrote:
> > On Mon, 04 Jan 2021 21:12:47 +0100, Samuel Thibault wrote:
> > > Vidya Sagar wrote:
> > > > Since this is a laptop, I'm suspecting that ASPM states might have
> > > > been enabled which could be causing these errors.
> > >
> > > On Mon, 04 Jan 2021 10:44:35 -0800, Keith Busch wrote:
> > > > Sometimes these types of errors occur from low power settings, so you
> > > > can try disabling the automatic management of these (assuming the
> > > > hardware supports it). To disable nvme specific power state transitions,
> > > > the kernel parameter is "nvme_core.default_ps_max_latency_us=0".
> > >
> > > I have tried to add it,
> > >
> > > I'll watch in the coming
> > > hours/days to see if that avoided the issue.
> >
> > I did get one
> >
> > Jan 4 22:34:53 begin kernel: [ 7165.207562] pcieport 0000:00:1d.0: AER: Corrected error received: 0000:02:00.0
> > Jan 4 22:34:53 begin kernel: [ 7165.213891] nvme 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
> > Jan 4 22:34:53 begin kernel: [ 7165.216949] nvme 0000:02:00.0: device [15b7:5006] error status/mask=00000001/0000e000
> > Jan 4 22:34:53 begin kernel: [ 7165.219995] nvme 0000:02:00.0: [ 0] RxErr
> >
> > > > PCI also has automatic link power savings that you can disable with
> > > > parameter "pcie_aspm=off".
> > >
> > > I'll try that if I still see errors with the nvme_core parameter.
> >
> > I'm on it.
>
> I tried to make the machine only run apt-get update every 10m for 24h.
>
> With pcie_aspm=off, I didn't get any corrected errors.
> Without it, I got 39 corrected errors.
>
> So that seems very relevant :)
>
> Is there more I can provide to investigate if that can somehow be fixed
> in the driver? I guess I can safely use the system with pcie_aspm=off?
> (the energy saving seems negligible)
I don't think there's more to do on the kernel or driver side beyond
disabling the problematic feature. I think a proper fix would
have to come from the hardware vendor.
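
For reading the "error status/mask" line in the quoted log, here is a
minimal Python sketch that decodes the two hex fields into the short
names the AER driver prints. The bit table follows the PCIe Correctable
Error Status register layout; the helper names (decode_bits,
decode_status_mask) are made up for illustration and are not part of any
kernel tool.

```python
# Bit positions of the PCIe AER Correctable Error Status register, keyed to
# the short names the Linux AER driver prints (e.g. "RxErr").
CORRECTABLE_BITS = {
    0: "RxErr",         # Receiver Error
    6: "BadTLP",        # Bad TLP
    7: "BadDLLP",       # Bad DLLP
    8: "Rollover",      # REPLAY_NUM Rollover
    12: "Timeout",      # Replay Timer Timeout
    13: "NonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",   # Corrected Internal Error
    15: "HeaderOF",     # Header Log Overflow
}


def decode_bits(value):
    """Return the names of all bits set in a 32-bit register value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Hypothetical fallback label for bits not in the table above.
            names.append(CORRECTABLE_BITS.get(bit, f"bit{bit}"))
    return names


def decode_status_mask(field):
    """Split a 'status/mask' hex pair as printed in the log and decode both."""
    status_hex, mask_hex = field.split("/")
    return decode_bits(int(status_hex, 16)), decode_bits(int(mask_hex, 16))


if __name__ == "__main__":
    # Value copied from the log line quoted above.
    status, mask = decode_status_mask("00000001/0000e000")
    print("reported:", status)
    print("masked:  ", mask)
```

For the value in the log, only bit 0 (RxErr) is set in the status, which
matches the "[ 0] RxErr" line; the mask covers bits 13 to 15.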