linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <linux-pci@vger.kernel.org>, Keith Busch <kbusch@kernel.org>,
	"Kai-Heng Feng" <kai.heng.feng@canonical.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Subject: Re: [PATCH] Documentation: PCI: add vmd documentation
Date: Fri, 26 Apr 2024 14:46:02 -0700	[thread overview]
Message-ID: <9e2feef5-b088-49a6-959e-64c6d02faced@intel.com> (raw)
In-Reply-To: <20240426213631.GA594688@bhelgaas>

On 4/26/2024 2:36 PM, Bjorn Helgaas wrote:
> On Thu, Apr 25, 2024 at 04:32:21PM -0700, Paul M Stillwell Jr wrote:
>> On 4/25/2024 3:32 PM, Bjorn Helgaas wrote:
>>> On Thu, Apr 25, 2024 at 02:43:07PM -0700, Paul M Stillwell Jr wrote:
>>>> On 4/25/2024 10:24 AM, Bjorn Helgaas wrote:
>>>>> On Wed, Apr 24, 2024 at 02:29:16PM -0700, Paul M Stillwell Jr wrote:
>>>>>> On 4/23/2024 5:47 PM, Bjorn Helgaas wrote:
>>>> ...
>>>
>>>>>>> _OSC is the only mechanism for negotiating ownership of these
>>>>>>> features, and PCI Firmware r3.3, sec 4.5.1, is pretty clear that _OSC
>>>>>>> only applies to the hierarchy originated by the PNP0A03/PNP0A08 host
>>>>>>> bridge that contains the _OSC method.  AFAICT, there's no
>>>>>>> global/device-specific thing here.
>>>>>>>
>>>>>>> The BIOS may have a single user-visible setting, and it may apply that
>>>>>>> setting to all host bridge _OSC methods, but that's just part of the
>>>>>>> BIOS UI, not part of the firmware/OS interface.
>>>>>>
>>>>>> Fair, but we are still left with the question of how to set the _OSC bits
>>>>>> for the VMD bridge. This would normally happen using ACPI AFAICT and we
>>>>>> don't have that for the devices behind VMD.
>>>>>
>>>>> In the absence of a mechanism for negotiating ownership, e.g., an ACPI
>>>>> host bridge device for the hierarchy, the OS owns all the PCIe
>>>>> features.
>>>>
>>>> I'm new to this space so I don't know what it means for the OS to
>>>> own the features. In other words, how would the vmd driver figure
>>>> out what features are supported?
>>>
>>> There are three things that go into this:
>>>
>>>     - Does the OS support the feature, e.g., is CONFIG_PCIEAER enabled?
>>>
>>>     - Has the platform granted permission to the OS to use the feature,
>>>       either explicitly via _OSC or implicitly because there's no
>>>       mechanism to negotiate ownership?
>>>
>>>     - Does the device advertise the feature, e.g., does it have an AER
>>>       Capability?
>>>
>>> If all three are true, Linux enables the feature.
>>>
>>> I think vmd has implicit ownership of all features because there is no
>>> ACPI host bridge device for the VMD domain, and (IMO) that means there
>>> is no way to negotiate ownership in that domain.
>>>
>>> So the VMD domain starts with all the native_* bits set, meaning Linux
>>> owns the features.  If the vmd driver doesn't want some feature to be
>>> used, it could clear the native_* bit for it.
>>>
>>> I don't think vmd should unilaterally claim ownership of features by
>>> *setting* native_* bits because that will lead to conflicts with
>>> firmware.
>>
>> This is the crux of the problem IMO. I'm happy to set the native_* bits
>> using some knowledge about what the firmware wants, but we don't have a
>> mechanism to do it AFAICT. I think that's what commit 04b12ef163d1 ("PCI:
>> vmd: Honor ACPI _OSC on PCIe features") was trying to do: use a domain that
>> ACPI had run on and negotiated features and apply them to the vmd domain.
> 
> Yes, this is the problem.  We have no information about what firmware
> wants for the VMD domain because there is no ACPI host bridge device.
> 
> We have information about what firmware wants for *other* domains.
> 04b12ef163d1 assumes that also applies to the VMD domain, but I don't
> think that's a good idea.
> 
>> Using the 3 criteria you described above, could we do this for the hotplug
>> feature for VMD:
>>
>> 1. Use IS_ENABLED(CONFIG_<whatever hotplug setting we need>) to check to see
>> if the hotplug feature is enabled
> 
> That's already there.
> 
>> 2. We know that for VMD we want hotplug enabled so that is the implicit
>> permission
> 
> The VMD domain starts with all native_* bits set.  All you have to do
> is avoid *clearing* them.
> 
> The problem (IMO) is that 04b12ef163d1 clears bits based on the _OSC
> for some other domain.
> 
>> 3. Look at the root ports below VMD and see if hotplug capability is set
> 
> This is already there, too.
> 
>> If 1 & 3 are true, then we set the native_* bits for hotplug (we can look
>> for surprise hotplug as well in the capability to set the
>> native_shpc_hotplug bit corrrectly) to 1. This feels like it would solve the
>> problem of "what if there is a hotplug issue on the platform" because the
>> user would have disabled hotplug for VMD and the root ports below it would
>> have the capability turned off.
>>
>> In theory we could do this same thing for all the features, but we don't
>> know what the firmware wants for features other than hotplug (because we
>> implicitly know that vmd wants hotplug). I feel like 04b12ef163d1 is a good
>> compromise for the other features, but I hear your issues with it.
>>
>> I'm happy to "do the right thing" for the other features, I just can't
>> figure out what that thing is :)
> 
> 04b12ef163d1 was motivated by a flood of Correctable Errors.
> 
> Kai-Heng says the errors occur even when booting with
> "pcie_ports=native" and VMD turned off, i.e., when the VMD RCiEP is
> disabled and the NVMe devices appear under plain Root Ports in domain
> 0000.  That suggests that they aren't related to VMD at all.
> 
> I think there's a significant chance that those errors are caused by a
> software defect, e.g., ASPM configuration.  There are many similar
> reports of Correctable Errors where "pcie_aspm=off" is a workaround.
> 
> If we can nail down the root cause of those Correctable Errors, we may
> be able to fix it and just revert 04b12ef163d1.  That would leave all
> the PCIe features owned and enabled by Linux in the VMD domain.  AER
> would be enabled and not a problem, hotplug would be enabled as you
> need, etc.
> 
> There are a zillion reports of these errors and I added comments to
> some to see if anybody can help us get to a root cause.
> 

OK, sounds like the plan for me is to sit tight for now WRT a patch to 
fix hotplug. I'll submit a v2 for the documentation.

Paul


      reply	other threads:[~2024-04-26 21:46 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-17 20:15 [PATCH] Documentation: PCI: add vmd documentation Paul M Stillwell Jr
2024-04-17 23:51 ` Keith Busch
2024-04-18 15:07   ` Paul M Stillwell Jr
2024-04-18 23:34     ` Keith Busch
2024-04-18 18:26 ` Bjorn Helgaas
2024-04-18 21:51   ` Paul M Stillwell Jr
2024-04-19 21:14     ` Bjorn Helgaas
2024-04-19 22:18       ` Paul M Stillwell Jr
2024-04-22 20:27         ` Bjorn Helgaas
2024-04-22 21:39           ` Paul M Stillwell Jr
2024-04-22 22:52             ` Bjorn Helgaas
2024-04-22 23:39               ` Paul M Stillwell Jr
2024-04-23 21:26                 ` Bjorn Helgaas
2024-04-23 23:10                   ` Paul M Stillwell Jr
2024-04-24  0:47                     ` Bjorn Helgaas
2024-04-24 21:29                       ` Paul M Stillwell Jr
2024-04-25 17:24                         ` Bjorn Helgaas
2024-04-25 21:43                           ` Paul M Stillwell Jr
2024-04-25 22:32                             ` Bjorn Helgaas
2024-04-25 23:32                               ` Paul M Stillwell Jr
2024-04-26 21:36                                 ` Bjorn Helgaas
2024-04-26 21:46                                   ` Paul M Stillwell Jr [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9e2feef5-b088-49a6-959e-64c6d02faced@intel.com \
    --to=paul.m.stillwell.jr@intel.com \
    --cc=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=kbusch@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=rafael.j.wysocki@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).