linux-pci.vger.kernel.org archive mirror
From: Blazej Kucman <blazej.kucman@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	linux-pci@vger.kernel.org,
	Blazej Kucman <blazej.kucman@intel.com>,
	Hans de Goede <hdegoede@redhat.com>,
	Lukas Wunner <lukas@wunner.de>,
	Naveen Naidu <naveennaidu479@gmail.com>,
	Keith Busch <kbusch@kernel.org>,
	Nirmal Patel <nirmal.patel@linux.intel.com>,
	Jonathan Derrick <jonathan.derrick@linux.dev>
Subject: Re: [Bug 215525] New: HotPlug does not work on upstream kernel 5.17.0-rc1
Date: Wed, 2 Feb 2022 16:48:01 +0100	[thread overview]
Message-ID: <20220202164801.00007228@linux.intel.com> (raw)
In-Reply-To: <20220128140328.GA206121@bhelgaas>

On Fri, 28 Jan 2022 08:03:28 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Fri, Jan 28, 2022 at 09:49:34PM +0800, Kai-Heng Feng wrote:
> > On Fri, Jan 28, 2022 at 9:08 PM Bjorn Helgaas <helgaas@kernel.org>
> > wrote:  
> > > On Fri, Jan 28, 2022 at 09:29:31AM +0100, Mariusz Tkaczyk wrote:  
> > > > On Thu, 27 Jan 2022 20:52:12 -0600
> > > > Bjorn Helgaas <helgaas@kernel.org> wrote:  
> > > > > On Thu, Jan 27, 2022 at 03:46:15PM +0100, Mariusz Tkaczyk
> > > > > wrote:  
> > > > > > ...
> > > > > > Thanks for your suggestions. Blazej ran some tests and
> > > > > > the results were inconclusive. He tested on two identical
> > > > > > platforms. On the first one it didn't work, even after he
> > > > > > reverted all the suggested patches. On the second one
> > > > > > hotplug always worked.
> > > > > >
> > > > > > He noticed that on the first platform, where the issue was
> > > > > > found initially, the boot parameter "pci=nommconf" was set.
> > > > > > After adding this parameter on the second platform, hotplug
> > > > > > stopped working too.
> > > > > >
> > > > > > He tested on tag pci-v5.17-changes, with
> > > > > > CONFIG_HOTPLUG_PCI_PCIE and CONFIG_DYNAMIC_DEBUG enabled in
> > > > > > the config. He also attached two dmesg logs to bugzilla,
> > > > > > captured with the boot parameter 'dyndbg="file pciehp* +p"'
> > > > > > as requested: one with "pci=nommconf" and one without.
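(As an aside, the dynamic-debug setting above is normally given on the kernel command line, but it can also be toggled at runtime; a sketch, assuming CONFIG_DYNAMIC_DEBUG is enabled and debugfs is mounted at the usual path:)

```shell
# Boot-time form used for the attached logs (kernel command line):
#   dyndbg="file pciehp* +p"

# Runtime equivalent, no reboot needed (requires root):
echo 'file pciehp* +p' > /sys/kernel/debug/dynamic_debug/control
```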
> > > > > >
> > > > > > The issue seems to be related to "pci=nommconf" and is
> > > > > > probably caused by a change outside pciehp.  
> > > > >
> > > > > Maybe I'm missing something.  If I understand correctly, the
> > > > > problem has nothing to do with the kernel version (correct me
> > > > > if I'm wrong!)  
> > > >
> > > > The problem occurred after the merge commit. It is some kind of
> > > > regression.  
> > >
> > > The bug report doesn't yet contain the evidence showing this.  It
> > > only contains dmesg logs with "pci=nommconf" where pciehp doesn't
> > > work (which is the expected behavior) and a log without
> > > "pci=nommconf" where pciehp does work (which is again the
> > > expected behavior). 
> > > > > PCIe native hotplug doesn't work when booted with
> > > > > "pci=nommconf". When using "pci=nommconf", obviously we can't
> > > > > access the extended PCI config space (offset 0x100-0xfff), so
> > > > > none of the extended capabilities are available.
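(One way to see this from userspace: sysfs exposes a 4096-byte "config" file per PCI device when extended config space is reachable, and only 256 bytes when it is not. A sketch of classifying that size; the sysfs path in the comment is an example device address, not taken from this thread:)

```shell
# Classify a PCI device's config-space visibility by the size of its
# sysfs "config" file, obtained on a real system with e.g.:
#   stat -c %s /sys/bus/pci/devices/0000:00:00.0/config
cfg_space() {
  case "$1" in
    256)  echo "legacy only: extended capabilities (0x100-0xfff) unreadable" ;;
    4096) echo "extended config space reachable" ;;
    *)    echo "unexpected size: $1" ;;
  esac
}

cfg_space 256
cfg_space 4096
```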
> > > > >
> > > > > In that case, we don't even ask the platform for control of
> > > > > PCIe hotplug via _OSC.  From the dmesg diff from normal
> > > > > (working) to "pci=nommconf" (not working):
> > > > >
> > > > >   -Command line: BOOT_IMAGE=/boot/vmlinuz-smp ...
> > > > >   +Command line: BOOT_IMAGE=/boot/vmlinuz-smp pci=nommconf ...
> > > > >   ...
> > > > >   -acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
> > > > >   -acpi PNP0A08:00: _OSC: platform does not support [AER LTR]
> > > > >   -acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
> > > > >   +acpi PNP0A08:00: _OSC: OS supports [ASPM ClockPM Segments MSI HPX-Type3]
> > > > >   +acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
> > > > >   +acpi PNP0A08:00: MMCONFIG is disabled, can't access extended PCI configuration space under this bridge.  
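(For what it's worth, whether _OSC granted the OS native hotplug control can be checked mechanically from a saved dmesg log; a small sketch built around the lines quoted above:)

```shell
# Report whether an _OSC dmesg line shows the OS being granted native
# PCIe hotplug control ("OS now controls [... PCIeHotplug ...]").
check_osc() {
  if printf '%s\n' "$1" | grep -q 'OS now controls \[[^]]*PCIeHotplug'; then
    echo granted
  else
    echo not-granted
  fi
}

# Sample lines from the diff above:
check_osc 'acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]'
check_osc 'acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]'
```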
> > > >
> > > > So it shouldn't have worked for years, yet it broke only
> > > > recently; that is the only objection I have. Could you explain
> > > > why it was working before? According to what you say, it
> > > > shouldn't have been. We are using the VMD driver; does that
> > > > matter?  
> > >
> > > 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features") looks
> > > like it could be related.  Try reverting that commit and see
> > > whether it makes a difference.  
> > 
> > The affected NVMe is indeed behind VMD domain, so I think the commit
> > can make a difference.
> > 
> > Does VMD behave differently on laptops and servers?
> > Anyway, I agree that the issue really lies in "pci=nommconf".  
> 
> Oh, I have a guess:
> 
>   - With "pci=nommconf", prior to v5.17-rc1, pciehp did not work in
>     general, but *did* work for NVMe behind a VMD.  As of v5.17-rc1,
>     pciehp no longer works for NVMe behind VMD.
> 
>   - Without "pci=nommconf", pciehp works as expected for all devices
>     including NVMe behind VMD, both before and after v5.17-rc1.
> 
> Is that what you're observing?
> 
> If so, I doubt there's anything to fix other than getting rid of
> "pci=nommconf".
> 
> Bjorn

I hadn't tested with VMD disabled before. I have now verified it, and
my observations are as follows:

OS: RHEL 8.4
NO - hotplug not working
YES - hotplug working

pci=nommconf added:
+--------------+-------------------+---------------------+--------------+
|              | pci-v5.17-changes | revert-04b12ef163d1 | inbox kernel |
+--------------+-------------------+---------------------+--------------+
| VMD enabled  | NO                | YES                 | YES          |
+--------------+-------------------+---------------------+--------------+
| VMD disabled | NO                | NO                  | NO           |
+--------------+-------------------+---------------------+--------------+

without pci=nommconf:
+--------------+-------------------+---------------------+--------------+
|              | pci-v5.17-changes | revert-04b12ef163d1 | inbox kernel |
+--------------+-------------------+---------------------+--------------+
| VMD enabled  | YES               | YES                 | YES          |
+--------------+-------------------+---------------------+--------------+
| VMD disabled | YES               | YES                 | YES          |
+--------------+-------------------+---------------------+--------------+

So, the results confirm your assumptions, but I also confirmed that
reverting 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features")
makes hotplug behave as it does on the inbox kernel.

We will drop the legacy parameter from our tests. According to my
results there is a regression in VMD caused by commit 04b12ef163d1,
even though hotplug for NVMe is not supposed to work with
"pci=nommconf" anyway. Should it be fixed?

Thanks,
Blazej

Thread overview: 20+ messages
     [not found] <bug-215525-41252@https.bugzilla.kernel.org/>
2022-01-24 21:46 ` [Bug 215525] New: HotPlug does not work on upstream kernel 5.17.0-rc1 Bjorn Helgaas
2022-01-25  8:58   ` Hans de Goede
2022-01-25 15:33   ` Lukas Wunner
2022-01-26  7:31   ` Thorsten Leemhuis
2022-01-27 14:46   ` Mariusz Tkaczyk
2022-01-27 20:47     ` Jonathan Derrick
2022-01-27 22:31     ` Jonathan Derrick
2022-01-28  2:52     ` Bjorn Helgaas
2022-01-28  8:29       ` Mariusz Tkaczyk
2022-01-28 13:08         ` Bjorn Helgaas
2022-01-28 13:49           ` Kai-Heng Feng
2022-01-28 14:03             ` Bjorn Helgaas
2022-02-02 15:48               ` Blazej Kucman [this message]
2022-02-02 16:43                 ` Bjorn Helgaas
2022-02-03  9:13                   ` Thorsten Leemhuis
2022-02-03 10:47                     ` Blazej Kucman
2022-02-03 15:58                       ` Bjorn Helgaas
2022-02-09 13:41                         ` Blazej Kucman
2022-02-09 21:02                           ` Bjorn Helgaas
2022-02-10 11:14                             ` Blazej Kucman
