All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Nirmal Patel <nirmal.patel@linux.intel.com>
Cc: linux-pci@vger.kernel.org, samruddh.dhope@intel.com,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>
Subject: Re: [PATCH v2] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
Date: Fri, 12 Jan 2024 16:55:04 -0600	[thread overview]
Message-ID: <20240112225504.GA1129179@bhelgaas> (raw)
In-Reply-To: <24a4e9c73f6447f5bb67906c57520514dfa77389.camel@linux.intel.com>

[+cc Rafael, Kai-Heng]

On Thu, Dec 14, 2023 at 03:22:21PM -0700, Nirmal Patel wrote:
> On Thu, 2023-12-14 at 13:23 -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 13, 2023 at 06:07:02PM -0700, Nirmal Patel wrote:
> > > On Tue, 2023-12-12 at 15:13 -0600, Bjorn Helgaas wrote:
> > > > On Mon, Dec 11, 2023 at 04:05:25PM -0700, Nirmal Patel wrote:
> > > > > On Tue, 2023-12-05 at 18:27 -0600, Bjorn Helgaas wrote:
> > > > > > On Tue, Dec 05, 2023 at 03:20:27PM -0700, Nirmal Patel wrote:
> > > > > > ...
> > > > > > > Currently Hotplug is controlled by _OSC in both host and
> > > > > > > Guest.  In case of guest, it seems guest BIOS hasn't
> > > > > > > implemented _OSC since all the flags are set to 0 which
> > > > > > > is not the case in host.
> > > > > > 
> > > > > > I think you want pciehp to work on the VMD Root Ports in
> > > > > > the Guest, no matter what, on the assumption that any _OSC
> > > > > > that applies to host bridge A does not apply to the host
> > > > > > bridge created by the vmd driver.
> > > > > > 
> > > > > > If so, we should just revert 04b12ef163d1 ("PCI: vmd:
> > > > > > Honor ACPI _OSC on PCIe features").  Since host bridge B
> > > > > > was not enumerated via ACPI, the OS owns all those
> > > > > > features under B by default, so reverting 04b12ef163d1
> > > > > > would leave that state.
> > > > > > 
> > > > > > Obviously we'd have to first figure out another solution
> > > > > > for the AER message flood that 04b12ef163d1 resolved.
> > > > > 
> > > > > If we revert 04b12ef163d1, then VMD driver will still enable
> > > > > AER blindly which is a bug. So we need to find a way to make
> > > > > VMD driver aware of AER platform setting and use that
> > > > > information to enable or disable AER for its child devices.
> > > > > 
> > > > > There is a setting in BIOS that allows us to enable OS
> > > > > native AER support on the platform. This setting is located
> > > > > in EDK Menu -> Platform configuration -> system event log ->
> > > > > IIO error enabling -> OS native AER support.
> > > > > 
> > > > > This setting is assigned to VMD bridge by
> > > > > vmd_copy_host_bridge_flags in patch 04b12ef163d1. Without
> > > > > the patch 04b12ef163d1, VMD driver will enable AER even if
> > > > > platform has disabled OS native AER support as mentioned
> > > > > earlier. This will result in an AER flood mentioned in
> > > > > 04b12ef163d1 since there is no AER handlers. 
> > 
> > I missed this before.  What does "there are no AER handlers" mean?
> > Do you mean there are no *firmware* AER handlers?  
>
> Sorry for confusing wordings. Your understanding is correct. The patch
> 04b12ef163d1 is used to disable AER on vmd and avoid the AER flooding
> by applying _OSC settings since vmd driver doesn't give a choice to
> toggle AER, DPC, LTR, etc.
> > 
> > The dmesg log at https://bugzilla.kernel.org/attachment.cgi?id=299571
> > (from https://bugzilla.kernel.org/show_bug.cgi?id=215027, the bug
> > fixed by 04b12ef163d1), shows this:
> > 
> >   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM
> > Segments MSI HPX-Type3]
> >   acpi PNP0A08:00: _OSC: platform does not support [AER]
> >   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME
> > PCIeCapability LTR]
> > 
> > so the firmware did not grant AER control to the OS (I think
> > "platform does not support" is a confusing description).
> > 
> > Prior to 04b12ef163d1, Linux applied _OSC above the VMD bridge but
> > not below it, so Linux had native control below VMD and it did
> > handle AER interrupts from the 10000:e0:06.0 Root Port below VMD:
> > 
> >   vmd 0000:00:0e.0: PCI host bridge to bus 10000:e0
> >   pcieport 10000:e0:06.0: AER: Corrected error received:
> > 10000:e1:00.0
> > 
> > After 04b12ef163d1, Linux applied _OSC below the VMD bridge as well,
> > so it did not enable or handle any AER interrupts.  I suspect the
> > platform didn't handle AER interrupts below VMD either, so those
> > errors were probably just ignored.
> > 
> > > > > If VMD is aware of OS native AER support setting, then we
> > > > > will not see AER flooding issue.
> > > > > 
> > > > > Do you have any suggestion on how to make VMD driver aware
> > > > > of AER setting and keep it in sync with platform setting.
> > > > 
> > > > Well, this is the whole problem.  IIUC, you're saying we
> > > > should use _OSC to negotiate for AER control below the VMD
> > > > bridge, but ignore _OSC for hotplug control.
> > > 
> > > Because VMD has its own hotplug BIOS setting which allows vmd to
> > > enable or disable hotplug on its domain. However we don't have
> > > VMD specific AER, DPC, LTR settings. 
> > 
> > I don't quite follow.  The OS knows nothing about whether BIOS
> > settings exist, so they can't be used to determine where _OSC
> > applies.
> > 
> > > Is it okay if we include an additional check for hotplug? i.e.
> > > Hotplug capable bit in SltCap register which reflects VMD BIOS
> > > hotplug setting.
> > 
> > I don't know what check you have in mind, but the OS can
> > definitely use SltCap to decide what features to enable.
>
> Yes I agree. That is what I am also suggesting in this patch.

I should have said "the OS can use SltCap to *help* decide what
features to enable."  For ACPI host bridges, _OSC determines whether
hotplug should be operated by the platform or the OS.

I think we're converging on the idea that since VMD is effectively
*not* an ACPI host bridge and doesn't have its own _OSC, the _OSC that
applies to the VMD endpoint does *not* apply to the hierarchy below
the VMD.  In that case, the default is that the OS owns all the
features (hotplug, AER, etc) below the VMD.

> > You suggest above that you want vmd to be aware of AER ownership per
> > _OSC, but it sounds more like you just want AER disabled below VMD
> > regardless.  Or do you have platforms that can actually handle AER
> > interrupts from Root Ports below VMD?
>
> No I dont want AER disabled below VMD all the time. I am suggesting
> vmd should be in sync with all the _OSC flags since vmd doesn't give
> a choice to toggle.

This is what I don't understand.  If we decide that _OSC doesn't apply
below VMD because the VMD host bridge isn't described in ACPI, the
idea of being in sync with _OSC doesn't make sense.

> However, the issue arises in guest where _OSC setting for hotplug is
> always 0 even though hotplug is 1 in host and hotplug enable bit in
> SltCap register is 1 in both host and guest. So we can use that to
> enable hotplug in guest. Like you suggested in your above comment. 

All this got paged out over the holidays, so I need to refresh my
understanding here.  Maybe it will help if we can make the situation
more concrete.  I'm basing this on the logs at
https://bugzilla.kernel.org/show_bug.cgi?id=215027.  I assume the
10000:e0:06.0 Root Port and the 10000:e1:00.0 NVMe device are both
passed through to the guest.  I'm sure I got lots wrong here, so
please correct me :)

  Host OS sees:

    PNP0A08 host bridge to 0000 [bus 00-ff]
      _OSC applies to domain 0000
      OS owns [PCIeHotplug SHPCHotplug PME PCIeCapability LTR] in domain 0000
    vmd 0000:00:0e.0: PCI host bridge to domain 10000 [bus e0-ff]
      no _OSC applies in domain 10000;
      host OS owns all PCIe features in domain 10000
    pci 10000:e0:06.0: [8086:464d]             # VMD root port
    pci 10000:e0:06.0: PCI bridge to [bus e1]
    pci 10000:e0:06.0: SltCap: HotPlug+        # Hotplug Capable
    pci 10000:e1:00.0: [144d:a80a]             # nvme

  Guest OS sees:

    PNP0A03 host bridge to 0000 [bus 00-ff]
      _OSC applies to domain 0000
      platform owns [PCIeHotplug ...]          # _OSC doesn't grant to OS
    pci 0000:e0:06.0: [8086:464d]              # VMD root port
    pci 0000:e0:06.0: PCI bridge to [bus e1]
    pci 0000:e0:06.0: SltCap: HotPlug+         # Hotplug Capable
    pci 0000:e1:00.0: [144d:a80a]             # nvme

So the guest OS sees that the VMD Root Port supports hotplug, but it
can't use it because it was not granted ownership of the feature?

Bjorn

  parent reply	other threads:[~2024-01-12 22:55 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-27 21:17 [PATCH v2] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports Nirmal Patel
2023-12-02  0:07 ` Bjorn Helgaas
2023-12-05 22:20   ` Nirmal Patel
2023-12-06  0:27     ` Bjorn Helgaas
2023-12-11 23:05       ` Nirmal Patel
2023-12-12 21:13         ` Bjorn Helgaas
2023-12-14  1:07           ` Nirmal Patel
2023-12-14 19:23             ` Bjorn Helgaas
2023-12-14 22:22               ` Nirmal Patel
2024-01-12  0:02                 ` Nirmal Patel
2024-01-12 22:55                 ` Bjorn Helgaas [this message]
2024-01-16 20:37                   ` Nirmal Patel
2024-01-17  0:49                     ` Bjorn Helgaas
2024-02-01 21:16                       ` Bjorn Helgaas
2024-02-01 18:38                     ` Nirmal Patel
2024-02-01 23:00                       ` Bjorn Helgaas
2024-02-07  0:27                         ` Nirmal Patel
2024-02-07 18:55                           ` Bjorn Helgaas
2024-02-13 17:47                             ` Nirmal Patel
2024-03-06 22:27                               ` Nirmal Patel
2024-03-07  6:44                                 ` Kai-Heng Feng
2024-03-08  0:09                                   ` Nirmal Patel
2024-03-15  1:29                                     ` Kai-Heng Feng
2024-03-22 20:57                                       ` Nirmal Patel
2024-03-22 21:37                                         ` Bjorn Helgaas
2024-03-22 22:43                                           ` Paul M Stillwell Jr
2024-03-22 23:36                                             ` Bjorn Helgaas
2024-03-25 15:10                                               ` Paul M Stillwell Jr
2024-03-26  0:17                                               ` Nirmal Patel
2024-03-26  1:59                                                 ` Kai-Heng Feng
2024-03-26 15:51                                                   ` Paul M Stillwell Jr
2024-03-26 16:03                                                     ` Paul M Stillwell Jr
2024-03-26 21:08                                                     ` Bjorn Helgaas
2024-04-02 16:10                                                       ` Paul M Stillwell Jr
2024-02-01 22:22                     ` Bjorn Helgaas
2024-01-16 23:54     ` Bjorn Helgaas
2024-02-14 13:43     ` Lorenzo Pieralisi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240112225504.GA1129179@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=nirmal.patel@linux.intel.com \
    --cc=rjw@rjwysocki.net \
    --cc=samruddh.dhope@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.