linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Cc: linux-pci@vger.kernel.org, Keith Busch <kbusch@kernel.org>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Subject: Re: [PATCH] Documentation: PCI: add vmd documentation
Date: Fri, 26 Apr 2024 16:36:31 -0500	[thread overview]
Message-ID: <20240426213631.GA594688@bhelgaas> (raw)
In-Reply-To: <51c5f46e-5649-4ab4-9828-5711dd06ee5d@intel.com>

On Thu, Apr 25, 2024 at 04:32:21PM -0700, Paul M Stillwell Jr wrote:
> On 4/25/2024 3:32 PM, Bjorn Helgaas wrote:
> > On Thu, Apr 25, 2024 at 02:43:07PM -0700, Paul M Stillwell Jr wrote:
> > > On 4/25/2024 10:24 AM, Bjorn Helgaas wrote:
> > > > On Wed, Apr 24, 2024 at 02:29:16PM -0700, Paul M Stillwell Jr wrote:
> > > > > On 4/23/2024 5:47 PM, Bjorn Helgaas wrote:
> > > ...
> > 
> > > > > > _OSC is the only mechanism for negotiating ownership of these
> > > > > > features, and PCI Firmware r3.3, sec 4.5.1, is pretty clear that _OSC
> > > > > > only applies to the hierarchy originated by the PNP0A03/PNP0A08 host
> > > > > > bridge that contains the _OSC method.  AFAICT, there's no
> > > > > > global/device-specific thing here.
> > > > > > 
> > > > > > The BIOS may have a single user-visible setting, and it may apply that
> > > > > > setting to all host bridge _OSC methods, but that's just part of the
> > > > > > BIOS UI, not part of the firmware/OS interface.
> > > > > 
> > > > > Fair, but we are still left with the question of how to set the _OSC bits
> > > > > for the VMD bridge. This would normally happen using ACPI AFAICT and we
> > > > > don't have that for the devices behind VMD.
> > > > 
> > > > In the absence of a mechanism for negotiating ownership, e.g., an ACPI
> > > > host bridge device for the hierarchy, the OS owns all the PCIe
> > > > features.
> > > 
> > > I'm new to this space so I don't know what it means for the OS to
> > > own the features. In other words, how would the vmd driver figure
> > > out what features are supported?
> > 
> > There are three things that go into this:
> > 
> >    - Does the OS support the feature, e.g., is CONFIG_PCIEAER enabled?
> > 
> >    - Has the platform granted permission to the OS to use the feature,
> >      either explicitly via _OSC or implicitly because there's no
> >      mechanism to negotiate ownership?
> > 
> >    - Does the device advertise the feature, e.g., does it have an AER
> >      Capability?
> > 
> > If all three are true, Linux enables the feature.
> > 
> > I think vmd has implicit ownership of all features because there is no
> > ACPI host bridge device for the VMD domain, and (IMO) that means there
> > is no way to negotiate ownership in that domain.
> > 
> > So the VMD domain starts with all the native_* bits set, meaning Linux
> > owns the features.  If the vmd driver doesn't want some feature to be
> > used, it could clear the native_* bit for it.
> > 
> > I don't think vmd should unilaterally claim ownership of features by
> > *setting* native_* bits because that will lead to conflicts with
> > firmware.
> 
> This is the crux of the problem IMO. I'm happy to set the native_* bits
> using some knowledge about what the firmware wants, but we don't have a
> mechanism to do it AFAICT. I think that's what commit 04b12ef163d1 ("PCI:
> vmd: Honor ACPI _OSC on PCIe features") was trying to do: use a domain that
> ACPI had run on and negotiated features and apply them to the vmd domain.

Yes, this is the problem.  We have no information about what firmware
wants for the VMD domain because there is no ACPI host bridge device.

We have information about what firmware wants for *other* domains.
04b12ef163d1 assumes that also applies to the VMD domain, but I don't
think that's a good idea.

> Using the 3 criteria you described above, could we do this for the hotplug
> feature for VMD:
> 
> 1. Use IS_ENABLED(CONFIG_<whatever hotplug setting we need>) to check to see
> if the hotplug feature is enabled

That's already there.

> 2. We know that for VMD we want hotplug enabled so that is the implicit
> permission

The VMD domain starts with all native_* bits set.  All you have to do
is avoid *clearing* them.

The problem (IMO) is that 04b12ef163d1 clears bits based on the _OSC
for some other domain.

> 3. Look at the root ports below VMD and see if hotplug capability is set

This is already there, too.

> If 1 & 3 are true, then we set the native_* bits for hotplug (we can look
> for surprise hotplug as well in the capability to set the
> native_shpc_hotplug bit corrrectly) to 1. This feels like it would solve the
> problem of "what if there is a hotplug issue on the platform" because the
> user would have disabled hotplug for VMD and the root ports below it would
> have the capability turned off.
> 
> In theory we could do this same thing for all the features, but we don't
> know what the firmware wants for features other than hotplug (because we
> implicitly know that vmd wants hotplug). I feel like 04b12ef163d1 is a good
> compromise for the other features, but I hear your issues with it.
> 
> I'm happy to "do the right thing" for the other features, I just can't
> figure out what that thing is :)

04b12ef163d1 was motivated by a flood of Correctable Errors.

Kai-Heng says the errors occur even when booting with
"pcie_ports=native" and VMD turned off, i.e., when the VMD RCiEP is
disabled and the NVMe devices appear under plain Root Ports in domain
0000.  That suggests that they aren't related to VMD at all.

I think there's a significant chance that those errors are caused by a
software defect, e.g., ASPM configuration.  There are many similar
reports of Correctable Errors where "pcie_aspm=off" is a workaround.

If we can nail down the root cause of those Correctable Errors, we may
be able to fix it and just revert 04b12ef163d1.  That would leave all
the PCIe features owned and enabled by Linux in the VMD domain.  AER
would be enabled and not a problem, hotplug would be enabled as you
need, etc.

There are a zillion reports of these errors and I added comments to
some to see if anybody can help us get to a root cause.

Bjorn

  reply	other threads:[~2024-04-26 21:36 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-17 20:15 [PATCH] Documentation: PCI: add vmd documentation Paul M Stillwell Jr
2024-04-17 23:51 ` Keith Busch
2024-04-18 15:07   ` Paul M Stillwell Jr
2024-04-18 23:34     ` Keith Busch
2024-04-18 18:26 ` Bjorn Helgaas
2024-04-18 21:51   ` Paul M Stillwell Jr
2024-04-19 21:14     ` Bjorn Helgaas
2024-04-19 22:18       ` Paul M Stillwell Jr
2024-04-22 20:27         ` Bjorn Helgaas
2024-04-22 21:39           ` Paul M Stillwell Jr
2024-04-22 22:52             ` Bjorn Helgaas
2024-04-22 23:39               ` Paul M Stillwell Jr
2024-04-23 21:26                 ` Bjorn Helgaas
2024-04-23 23:10                   ` Paul M Stillwell Jr
2024-04-24  0:47                     ` Bjorn Helgaas
2024-04-24 21:29                       ` Paul M Stillwell Jr
2024-04-25 17:24                         ` Bjorn Helgaas
2024-04-25 21:43                           ` Paul M Stillwell Jr
2024-04-25 22:32                             ` Bjorn Helgaas
2024-04-25 23:32                               ` Paul M Stillwell Jr
2024-04-26 21:36                                 ` Bjorn Helgaas [this message]
2024-04-26 21:46                                   ` Paul M Stillwell Jr

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240426213631.GA594688@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=kbusch@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=paul.m.stillwell.jr@intel.com \
    --cc=rafael.j.wysocki@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).