From: Blazej Kucman <blazej.kucman@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	linux-pci@vger.kernel.org,
	Blazej Kucman <blazej.kucman@intel.com>,
	Hans de Goede <hdegoede@redhat.com>,
	Lukas Wunner <lukas@wunner.de>,
	Naveen Naidu <naveennaidu479@gmail.com>,
	Keith Busch <kbusch@kernel.org>,
	Nirmal Patel <nirmal.patel@linux.intel.com>,
	Jonathan Derrick <jonathan.derrick@linux.dev>
Subject: Re: [Bug 215525] New: HotPlug does not work on upstream kernel 5.17.0-rc1
Date: Wed, 2 Feb 2022 16:48:01 +0100
Message-ID: <20220202164801.00007228@linux.intel.com>
In-Reply-To: <20220128140328.GA206121@bhelgaas>

On Fri, 28 Jan 2022 08:03:28 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Fri, Jan 28, 2022 at 09:49:34PM +0800, Kai-Heng Feng wrote:
> > On Fri, Jan 28, 2022 at 9:08 PM Bjorn Helgaas <helgaas@kernel.org>
> > wrote:  
> > > On Fri, Jan 28, 2022 at 09:29:31AM +0100, Mariusz Tkaczyk wrote:  
> > > > On Thu, 27 Jan 2022 20:52:12 -0600
> > > > Bjorn Helgaas <helgaas@kernel.org> wrote:  
> > > > > On Thu, Jan 27, 2022 at 03:46:15PM +0100, Mariusz Tkaczyk
> > > > > wrote:  
> > > > > > ...
> > > > > > Thanks for your suggestions. Blazej did some tests and the
> > > > > > results were inconclusive. He tested it on two identical
> > > > > > platforms. On the first one it didn't work, even after he
> > > > > > reverted all the suggested patches. On the second one hotplug
> > > > > > always worked.
> > > > > >
> > > > > > He noticed that the first platform, where the issue was
> > > > > > initially found, had the boot parameter "pci=nommconf". After
> > > > > > adding this parameter on the second platform, hotplug stopped
> > > > > > working there too.
> > > > > >
> > > > > > Tested on tag pci-v5.17-changes. He has
> > > > > > CONFIG_HOTPLUG_PCI_PCIE and CONFIG_DYNAMIC_DEBUG enabled in
> > > > > > the config. He also attached two dmesg logs to bugzilla with
> > > > > > the boot parameter 'dyndbg="file pciehp* +p"' as requested,
> > > > > > one with "pci=nommconf" and one without.
> > > > > >
> > > > > > The issue seems to be related to "pci=nommconf" and is
> > > > > > probably caused by a change outside pciehp.
> > > > >
> > > > > Maybe I'm missing something.  If I understand correctly, the
> > > > > problem has nothing to do with the kernel version (correct me
> > > > > if I'm wrong!)  
> > > >
> > > > The problem occurred after the merge commit. It is some kind of
> > > > regression.  
> > >
> > > The bug report doesn't yet contain the evidence showing this.  It
> > > only contains dmesg logs with "pci=nommconf" where pciehp doesn't
> > > work (which is the expected behavior) and a log without
> > > "pci=nommconf" where pciehp does work (which is again the
> > > expected behavior). 
> > > > > PCIe native hotplug doesn't work when booted with
> > > > > "pci=nommconf". When using "pci=nommconf", obviously we can't
> > > > > access the extended PCI config space (offset 0x100-0xfff), so
> > > > > none of the extended capabilities are available.
> > > > >
> > > > > In that case, we don't even ask the platform for control of
> > > > > PCIe hotplug via _OSC.  From the dmesg diff from normal
> > > > > (working) to "pci=nommconf" (not working):
> > > > >
> > > > >   -Command line: BOOT_IMAGE=/boot/vmlinuz-smp ...
> > > > >   +Command line: BOOT_IMAGE=/boot/vmlinuz-smp pci=nommconf ...
> > > > >   ...
> > > > >   -acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
> > > > >   -acpi PNP0A08:00: _OSC: platform does not support [AER LTR]
> > > > >   -acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability]
> > > > >   +acpi PNP0A08:00: _OSC: OS supports [ASPM ClockPM Segments MSI HPX-Type3]
> > > > >   +acpi PNP0A08:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
> > > > >   +acpi PNP0A08:00: MMCONFIG is disabled, can't access extended PCI configuration space under this bridge.
> > > >
> > > > So it shouldn't have worked for years, yet it only broke
> > > > recently; that is my only objection. Could you tell why it was
> > > > working before? According to your explanation, it shouldn't have
> > > > been. We are using the VMD driver; does that matter?
> > >
> > > 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features") looks
> > > like it could be related.  Try reverting that commit and see
> > > whether it makes a difference.
> > 
> > The affected NVMe is indeed behind a VMD domain, so I think the
> > commit can make a difference.
> > 
> > Does VMD behave differently on laptops and servers?
> > Anyway, I agree that the issue really lies in "pci=nommconf".  
> 
> Oh, I have a guess:
> 
>   - With "pci=nommconf", prior to v5.17-rc1, pciehp did not work in
>     general, but *did* work for NVMe behind a VMD.  As of v5.17-rc1,
>     pciehp no longer works for NVMe behind VMD.
> 
>   - Without "pci=nommconf", pciehp works as expected for all devices
>     including NVMe behind VMD, both before and after v5.17-rc1.
> 
> Is that what you're observing?
> 
> If so, I doubt there's anything to fix other than getting rid of
> "pci=nommconf".
> 
> Bjorn
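
(A side note on the ExtendedConfig point quoted above: one quick way to
check whether extended config space is reachable on a given system is
the size of the sysfs "config" file, which is 256 bytes when only
legacy config space is accessible and 4096 bytes when the 0x100-0xfff
range is too. The sketch below is a generic user-space check, not code
from the kernel tree, and the device address in it is just an example.)

  #include <stdio.h>
  #include <sys/stat.h>

  int main(void)
  {
          /* Example BDF; substitute a real device from lspci. */
          const char *path = "/sys/bus/pci/devices/0000:00:1c.0/config";
          struct stat st;

          if (stat(path, &st) != 0) {
                  perror("stat");
                  return 1;
          }
          printf("%s: %lld bytes (%s)\n", path, (long long)st.st_size,
                 st.st_size > 256 ? "extended config space accessible"
                                  : "legacy config space only");
          return 0;
  }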

I hadn't tested with VMD disabled earlier. I have now verified it, and
my observations are as follows:

OS: RHEL 8.4
NO - hotplug not working
YES - hotplug working

pci=nommconf added:
+--------------+-------------------+---------------------+--------------+
|              | pci-v5.17-changes | revert-04b12ef163d1 | inbox kernel |
+--------------+-------------------+---------------------+--------------+
| VMD enabled  | NO                | YES                 | YES          |
+--------------+-------------------+---------------------+--------------+
| VMD disabled | NO                | NO                  | NO           |
+--------------+-------------------+---------------------+--------------+

without pci=nommconf:
+--------------+-------------------+---------------------+--------------+
|              | pci-v5.17-changes | revert-04b12ef163d1 | inbox kernel |
+--------------+-------------------+---------------------+--------------+
| VMD enabled  | YES               | YES                 | YES          |
+--------------+-------------------+---------------------+--------------+
| VMD disabled | YES               | YES                 | YES          |
+--------------+-------------------+---------------------+--------------+

So the results confirm your assumptions, but I also confirmed that
reverting 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features")
makes hotplug work the same way it does with the inbox kernel.

We will drop the legacy parameter in our tests. Still, according to my
results there is a regression behind VMD caused by commit 04b12ef163d1,
even though hotplug with "pci=nommconf" does not work outside VMD
anyway. Should it be fixed?
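
For context, my rough understanding of what that commit changed, as a
sketch rather than the actual patch (the helper name is made up here,
but the native_* bits are real fields of struct pci_host_bridge):

  #include <linux/pci.h>

  /* Hypothetical paraphrase of 04b12ef163d1: the VMD-created host
   * bridge inherits the _OSC-negotiated feature ownership from the
   * ACPI root bridge instead of assuming native control.  With
   * "pci=nommconf" the OS never gets hotplug control via _OSC, so
   * the VMD domain loses native hotplug too and pciehp stays idle. */
  static void vmd_inherit_osc_flags(struct pci_host_bridge *vmd_bridge,
                                    struct pci_host_bridge *root_bridge)
  {
          vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug;
          vmd_bridge->native_aer          = root_bridge->native_aer;
          vmd_bridge->native_pme          = root_bridge->native_pme;
          vmd_bridge->native_ltr          = root_bridge->native_ltr;
  }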

Thanks,
Blazej


Thread overview: 21+ messages
     [not found] <bug-215525-41252@https.bugzilla.kernel.org/>
2022-01-24 21:46 ` [Bug 215525] New: HotPlug does not work on upstream kernel 5.17.0-rc1 Bjorn Helgaas
2022-01-25  8:58   ` Hans de Goede
2022-01-25 15:33   ` Lukas Wunner
2022-01-26  7:31   ` Thorsten Leemhuis
2022-02-02 19:22     ` Lukas Wunner
2022-01-27 14:46   ` Mariusz Tkaczyk
2022-01-27 20:47     ` Jonathan Derrick
2022-01-27 22:31     ` Jonathan Derrick
2022-01-28  2:52     ` Bjorn Helgaas
2022-01-28  8:29       ` Mariusz Tkaczyk
2022-01-28 13:08         ` Bjorn Helgaas
2022-01-28 13:49           ` Kai-Heng Feng
2022-01-28 14:03             ` Bjorn Helgaas
2022-02-02 15:48               ` Blazej Kucman [this message]
2022-02-02 16:43                 ` Bjorn Helgaas
2022-02-03  9:13                   ` Thorsten Leemhuis
2022-02-03 10:47                     ` Blazej Kucman
2022-02-03 15:58                       ` Bjorn Helgaas
2022-02-09 13:41                         ` Blazej Kucman
2022-02-09 21:02                           ` Bjorn Helgaas
2022-02-10 11:14                             ` Blazej Kucman
