linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-kernel@vger.kernel.org, terraluna977@gmail.com,
	bhelgaas@google.com, linux-pci@vger.kernel.org, mst@redhat.com,
	rafael@kernel.org, linux-acpi@vger.kernel.org
Subject: Re: [PATCH 1/1] PCI: acpiphp:: use pci_assign_unassigned_bridge_resources() only if bus->self not NULL
Date: Mon, 7 Aug 2023 15:07:46 +0200	[thread overview]
Message-ID: <20230807150746.71ca1df3@imammedo.users.ipa.redhat.com> (raw)
In-Reply-To: <20230804232709.GA174043@bhelgaas>

On Fri, 4 Aug 2023 18:27:09 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Wed, Jul 26, 2023 at 02:35:18PM +0200, Igor Mammedov wrote:
> > Commit [1] switched acpiphp hotplug to use
> >    pci_assign_unassigned_bridge_resources()
> > which depends on bridge being available, however in some cases
> > when acpiphp is in use, enable_slot() can get a slot without
> > bridge associated.  
> 
> acpiphp is *always* in use if we get to enable_slot(), so that doesn't
> really add information here.
> 
> >   1. legitimate case of hotplug on root bus
> >       (likely not exiting on real hw, but widely used in virt world)
> >   2. broken firmware, that sends 'Bus check' events to non
> >      existing root ports (Dell Inspiron 7352/0W6WV0), which somehow
> >      endup at acpiphp:enable_slot(..., bridge = 0) and with bus
> >      without bridge assigned to it.  

how about following commit message (incorporating all feed back in this thread):

-- cut --
Commit [1] switched acpiphp hotplug to use
   pci_assign_unassigned_bridge_resources()
which depends on bridge being available, however in case
when acpiphp is enabled [2], enable_slot() can be called without
bridge associated.
  1. legitimate case of hotplug on root bus
      (widely used in virt world)
  2. a (misbehaving) firmware, that sends 'Bus check' events
     to non existing root ports (Dell Inspiron 7352/0W6WV0),
     which endup at acpiphp:enable_slot(..., bridge = 0)
     where bus has not bridge assigned to it.
     acpihp doesn't know that it's a bridge, and bus specific
     'PCI subsystem' can't argument ACPI context with bridge
     information since the PCI device to get this data from
     is/was not available.

Issue is easy to reproduce with QEMU's 'pc' machine, which supports
PCI hotplug on hostbridge slots. To reproduce boot kernel at
commit [1] in VM started with following CLI (assuming guest root fs
is installed on sda1 partition):

   # qemu-system-x86_64 -M pc -m 1G -enable-kvm -cpu host \
         -monitor stdio -serial file:serial.log           \
         -kernel arch/x86/boot/bzImage                    \
         -append "root=/dev/sda1 console=ttyS0"           \
         guest_disk.img

once guest OS is fully booted at qemu prompt:

(qemu) device_add e1000

(check serial.log) it will cause NULL pointer dereference at:

    void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
    {
        struct pci_bus *parent = bridge->subordinate;

[  612.277651] BUG: kernel NULL pointer dereference, address: 0000000000000018
[...]
[  612.277798]  ? pci_assign_unassigned_bridge_resources+0x1f/0x260
[  612.277804]  ? pcibios_allocate_dev_resources+0x3c/0x2a0
[  612.277809]  enable_slot+0x21f/0x3e0
[  612.277816]  acpiphp_hotplug_notify+0x13d/0x260
[  612.277822]  ? __pfx_acpiphp_hotplug_notify+0x10/0x10
[  612.277827]  acpi_device_hotplug+0xbc/0x540
[  612.277834]  acpi_hotplug_work_fn+0x15/0x20
[  612.277839]  process_one_work+0x1f7/0x370
[  612.277845]  worker_thread+0x45/0x3b0
[  612.277850]  ? __pfx_worker_thread+0x10/0x10
[  612.277854]  kthread+0xdc/0x110
[  612.277860]  ? __pfx_kthread+0x10/0x10
[  612.277866]  ret_from_fork+0x28/0x40
[  612.277871]  ? __pfx_kthread+0x10/0x10
[  612.277876]  ret_from_fork_asm+0x1b/0x30

The issue was discovered on Dell Inspiron 7352/0W6WV0 laptop with
following sequence:
   1. suspend to RAM
   2. wake up with the same backtrace being observed:
   3. 2nd suspend to RAM attempt makes laptop freeze

Fix it by using __pci_bus_assign_resources() instead of
pci_assign_unassigned_bridge_resources() as we used to do
but only in case when bus doesn't have a bridge associated
(to cover for the case of ACPI event on hostbridge or
non existing root port).

That let us keep hotplug on root bus working like it used to be
and at the same time keeps resource reassignment usable on
root ports (and other 1st level bridges) that was fixed by [1].

1)
Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
2) CONFIG_HOTPLUG_PCI_ACPI=y

-- cut --

if commit message is looking acceptable to you, I can respin
amended patch with your suggestions taken in account.

> IIUC, the Inspiron problem happens when:
> 
>   - acpiphp_context->bridge is NULL, so hotplug_event() calls
>     enable_slot() instead of acpiphp_check_bridge(), AND
> 
>   - acpiphp_slot->bus->self is also NULL, because enable_slot() calls
>     pci_assign_unassigned_bridge_resources() with that NULL pointer,
>     which dereferences "bridge->subordinate"
> 
> But I can't figure out why acpiphp_context->bridge is NULL for RP07
> and RP08 (which don't exist), but not for RP03 (which does).
> 
> I guess all the acpiphp_contexts (RP03, RP07, RP08) must be allocated in
> acpiphp_add_context() by acpiphp_init_context().
> 
> Woody's lspci from [1] shows only one Root Port:
> 
>   00:1c.0 Wildcat Point-LP PCI Express Root Port #3
> 
> The DSDT.DSL includes:
> 
>   Device (RP01) _ADR 0x001C0000		# 1c.0
>   Device (RP02) _ADR 0x001C0001		# 1c.1
>   Device (RP03) _ADR 0x001C0002		# 1c.2
>   Device (RP04) _ADR 0x001C0003		# 1c.3
>   Device (RP05) _ADR 0x001C0004		# 1c.4
>   Device (RP06) _ADR 0x001C0005		# 1c.5
>   Device (RP07) _ADR 0x001C0006		# 1c.6
>   Device (RP08) _ADR 0x001C0007		# 1c.7
> 
> I can see why we might need a Bus Check after resume to see if
> something got added while we were suspended.  But I don't see why we
> handle RP03 differently from RP07 and RP08.
> 
> Can you help me out?  I'm lost in a maze of twisty passages, all
> alike.

I'll try to trace call path and see where it leads
(based on a guess in updated commit message: OS/nor ACPI
has info if it's bridge when the device didn't exists
during boot).

(though, I don't think it should hold this patch.
while it would be good to understand where bridge
gets added in acpi context, it's not directly relevant
to the fixing hotplug on hostbridge and buscheck events 
on non-existing root-ports)

> Bjorn
> 
> [1] https://lore.kernel.org/r/92150d8d-8a3a-d600-a996-f60a8e4c876c@gmail.com
> 



  reply	other threads:[~2023-08-07 13:08 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-26 12:35 [PATCH 0/1] PCI: acpiphp: fix regression introduced by 'Reassign resources on bridge if necessary' Igor Mammedov
2023-07-26 12:35 ` [PATCH 1/1] PCI: acpiphp:: use pci_assign_unassigned_bridge_resources() only if bus->self not NULL Igor Mammedov
2023-07-26 14:24   ` Rafael J. Wysocki
2023-07-27  6:04   ` Michael S. Tsirkin
2023-07-27 17:41   ` Bjorn Helgaas
2023-07-28  9:32     ` Igor Mammedov
2023-07-29 21:50       ` Bjorn Helgaas
2023-07-31 12:44         ` Igor Mammedov
2023-07-31 21:42           ` Bjorn Helgaas
2023-07-31 21:54             ` Michael S. Tsirkin
2023-08-01  9:57               ` Igor Mammedov
2023-08-01 12:57                 ` Michael S. Tsirkin
2023-08-04 14:11             ` Igor Mammedov
2023-07-31 12:46         ` [PATCH QEMU] acpiphp: hack to send BusCheck to missing device on root bus Igor Mammedov
2023-08-04 23:27   ` [PATCH 1/1] PCI: acpiphp:: use pci_assign_unassigned_bridge_resources() only if bus->self not NULL Bjorn Helgaas
2023-08-07 13:07     ` Igor Mammedov [this message]
2023-08-07 17:28       ` Bjorn Helgaas
2023-08-08 11:47         ` Igor Mammedov
2023-08-08  9:25   ` Michal Koutný
2023-08-08 19:35     ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230807150746.71ca1df3@imammedo.users.ipa.redhat.com \
    --to=imammedo@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=rafael@kernel.org \
    --cc=terraluna977@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).