From: Jonathan Derrick <jonathan.derrick@linux.dev>
To: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci@vger.kernel.org,
	Blazej Kucman <blazej.kucman@intel.com>,
	Hans de Goede <hdegoede@redhat.com>,
	Lukas Wunner <lukas@wunner.de>,
	Naveen Naidu <naveennaidu479@gmail.com>,
	Keith Busch <kbusch@kernel.org>,
	Nirmal Patel <nirmal.patel@linux.intel.com>
Subject: Re: [Bug 215525] New: HotPlug does not work on upstream kernel 5.17.0-rc1
Date: Thu, 27 Jan 2022 13:47:08 -0700
Message-ID: <154fcaf2-18cd-9ea9-eee2-bc8b8ee3468d@linux.dev>
In-Reply-To: <20220127154615.00003df8@linux.intel.com>



On 1/27/2022 7:46 AM, Mariusz Tkaczyk wrote:
> On Mon, 24 Jan 2022 15:46:35 -0600
> Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
>> [+cc linux-pci, Hans, Lukas, Naveen, Keith, Nirmal, Jonathan]
>>
>> On Mon, Jan 24, 2022 at 11:46:14AM +0000,
>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=215525
>>>
>>>              Bug ID: 215525
>>>             Summary: HotPlug does not work on upstream kernel 5.17.0-rc1
>>>             Product: Drivers
>>>             Version: 2.5
>>>      Kernel Version: 5.17.0-rc1 upstream
>>>            Hardware: x86-64
>>>                  OS: Linux
>>>                Tree: Mainline
>>>              Status: NEW
>>>            Severity: normal
>>>            Priority: P1
>>>           Component: PCI
>>>            Assignee: drivers_pci@kernel-bugs.osdl.org
>>>            Reporter: blazej.kucman@intel.com
>>>          Regression: No
>>>
>>> Created attachment 300308
>>>    -->
>>> https://bugzilla.kernel.org/attachment.cgi?id=300308&action=edit
>>> dmesg
>>>
>>> While testing on the latest upstream kernel
>>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/)
>>> we noticed that with the merge commit
>>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0a231f01e5b25bacd23e6edc7c979a18a517b2b)
>>> hotplug and hotunplug of nvme drives stopped working.
>>>
>>> Rescan PCI does not help.
>>> echo "1" > /sys/bus/pci/rescan
>>>
>>> The issue does not reproduce on a kernel built from the preceding
>>> commit (88db8458086b1dcf20b56682504bdb34d2bca0e2).
>>>
>>>
>>> During hot-remove the device does not disappear. However, when we
>>> try to do I/O on the disk, there is an I/O error and the device
>>> disappears.
>>>
>>> Before the I/O, no log entries about the disk appeared in dmesg;
>>> only after the I/O did entries like the ones below appear:
>>> [  177.943703] nvme nvme5: controller is down; will reset:
>>> CSTS=0xffffffff, PCI_STATUS=0xffff
>>> [  177.971661] nvme 10000:0b:00.0: can't change power state from
>>> D3cold to D0 (config space inaccessible)
>>> [  177.981121] pcieport 10000:00:02.0: can't derive routing for PCI INT A
>>> [  177.987749] nvme 10000:0b:00.0: PCI INT A: no GSI
>>> [  177.992633] nvme nvme5: Removing after probe failure status: -19
>>> [  178.004633] nvme5n1: detected capacity change from 83984375 to 0
>>> [  178.004677] I/O error, dev nvme5n1, sector 0 op 0x0:(READ) flags
>>> 0x0 phys_seg 1 prio class 0
>>>
>>>
>>> OS: RHEL 8.4 GA
>>> Platform: Intel Purley
>>>
>>> The logs were collected on an older upstream kernel, but the issue
>>> also occurs on the newest upstream kernel
>>> (dd81e1c7d5fb126e5fbc5c9e334d7b3ec29a16a0)
>>
>> Apparently worked immediately before merging the PCI changes for
>> v5.17 and failed immediately after:
>>
>>    good: 88db8458086b ("Merge tag 'exfat-for-5.17-rc1' of
>>          git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat")
>>    bad:  d0a231f01e5b ("Merge tag 'pci-v5.17-changes' of
>>          git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci")
>>
>> Only three commits touch pciehp:
>>
>>    085a9f43433f ("PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors")
>>    23584c1ed3e1 ("PCI: pciehp: Fix infinite loop in IRQ handler upon power fault")
>>    a3b0f10db148 ("PCI: pciehp: Use PCI_POSSIBLE_ERROR() to check config reads")
>>
>> None seems obviously related to me.  Blazej, could you try setting
>> CONFIG_DYNAMIC_DEBUG=y and booting with 'dyndbg="file pciehp* +p"' to
>> enable more debug messages?
>>
> 
> Hi Bjorn,
> 
> Thanks for your suggestions. Blazej did some tests and the results were
> inconclusive. He tested on two identical platforms. On the first one
> hotplug didn't work, even after he reverted all suggested patches. On
> the second one hotplug always worked.
> 
> He noticed that on the first platform, where the issue was initially
> found, the boot parameter "pci=nommconf" was set. After adding this
> parameter on the second platform, hotplug stopped working there too.
> 
> Tested on tag pci-v5.17-changes. He has CONFIG_HOTPLUG_PCI_PCIE
> and CONFIG_DYNAMIC_DEBUG enabled in the config. He also attached two
> dmesg logs to bugzilla, taken with the boot parameter
> 'dyndbg="file pciehp* +p"' as requested: one with "pci=nommconf" and
> one without.
> 
> The issue seems to be related to "pci=nommconf" and is probably caused
> by a change outside pciehp.

Could it be related to this?

int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		 int reg, int len, u32 *val)
{
	if (domain == 0 && reg < 256 && raw_pci_ops)
		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
	if (raw_pci_ext_ops)
		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;
}

It looks like raw_pci_ext_ops won't be set with pci=nommconf, and the VMD
subdevice domain will be > 0. In that case neither branch is taken and
the read falls through to -EINVAL.
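
To make the concern concrete, here is a minimal user-space sketch of that
dispatch (not kernel code). It assumes raw_pci_ext_ops is NULL under
pci=nommconf, which is the hypothesis above and not something verified
here. fake_conf1_read() and vmd_domain_read_status() are hypothetical
helpers invented for the illustration; the domain/bus values are taken
from the "nvme 10000:0b:00.0" line in the report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Simplified stand-in for the kernel's struct pci_raw_ops (read side only). */
struct pci_raw_ops {
	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, u32 *val);
};

/* Hypothetical legacy accessor: always succeeds and returns a dummy value. */
static int fake_conf1_read(unsigned int domain, unsigned int bus,
			   unsigned int devfn, int reg, int len, u32 *val)
{
	(void)domain; (void)bus; (void)devfn; (void)reg; (void)len;
	assert(val != NULL);
	*val = 0x12345678;
	return 0;
}

static const struct pci_raw_ops fake_conf1 = { .read = fake_conf1_read };

/* ASSUMPTION: with pci=nommconf only the legacy accessor is registered. */
static const struct pci_raw_ops *raw_pci_ops = &fake_conf1;
static const struct pci_raw_ops *raw_pci_ext_ops = NULL;

/* Same dispatch logic as the function quoted above. */
int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		 int reg, int len, u32 *val)
{
	if (domain == 0 && reg < 256 && raw_pci_ops)
		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
	if (raw_pci_ext_ops)
		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;
}

/* Status of a read aimed at a device in a non-zero (VMD-style) domain. */
int vmd_domain_read_status(void)
{
	u32 val = 0;

	return raw_pci_read(0x10000, 0x0b, 0, 0x00, 4, &val);
}
```

In this model a domain 0 read below offset 256 succeeds through the
legacy accessor, while any read in domain 0x10000 (or any extended
config read at offset >= 256) returns -EINVAL because no extended
accessor is available. Whether the real hotplug path actually goes
through raw_pci_read() for VMD child devices is a separate question
that this sketch does not answer.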


> 
> He is currently working on email client setup to answer himself.
> 
> Thanks,
> Mariusz
> 
> 
