LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Jiasen Lin <linjiasen@hygon.cn>
To: Takashi Iwai <tiwai@suse.de>, Lukas Wunner <lukas@wunner.de>
Cc: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>,
	"mika.westerberg@linux.intel.com"
	<mika.westerberg@linux.intel.com>,
	Bjorn Helgaas <helgaas@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	<alexander.deucher@amd.com>
Subject: Re: Linux v5.5 serious PCI bug
Date: Wed, 11 Dec 2019 15:33:32 +0800
Message-ID: <50eb6e32-296a-0086-f172-60103b89b5a5@hygon.cn> (raw)
In-Reply-To: <s5h1rtccpqh.wl-tiwai@suse.de>



On 2019/12/10 20:46, Takashi Iwai wrote:
> On Tue, 10 Dec 2019 13:29:41 +0100,
> Lukas Wunner wrote:
>>
>> [cc += Alex, Takashi]
>>
>> On Tue, Dec 10, 2019 at 12:00:23PM +0000, Nicholas Johnson wrote:
>>> On Tue, Dec 10, 2019 at 09:28:00AM +0200, mika.westerberg@linux.intel.com wrote:
>>>> On Mon, Dec 09, 2019 at 01:33:49PM +0000, Nicholas Johnson wrote:
>>>>> On Mon, Dec 09, 2019 at 03:12:39PM +0200, mika.westerberg@linux.intel.com wrote:
>>>>>> On Mon, Dec 09, 2019 at 12:34:04PM +0000, Nicholas Johnson wrote:
>>>>>>> I have compiled Linux v5.5-rc1 and thought all was good until I
>>>>>>> hot-removed a Gigabyte Aorus eGPU from Thunderbolt. The driver for the
>>>>>>> GPU was not loaded (blacklisted) so the crash is nothing to do with the
>>>>>>> GPU driver.
>>>>>>>
>>>>>>> We had:
>>>>>>> - kernel NULL pointer dereference
>>>>>>> - refcount_t: underflow; use-after-free.
>>>
>>> The following is the culprit responsible for the issues:
>>>
>>> commit 586bc4aab878efcf672536f0cdec3d04b6990c94
>>> Author: Alex Deucher <alexander.deucher@amd.com>
>>> Date:   Fri Nov 22 16:43:50 2019 -0500
>>>
>>>      ALSA: hda/hdmi - fix vgaswitcheroo detection for AMD
>>
>> Does the below fix the issue?
>>
>> -- >8 --
>> diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
>> index 35b4526f0d28..b856b89378ac 100644
>> --- a/sound/pci/hda/hda_intel.c
>> +++ b/sound/pci/hda/hda_intel.c
>> @@ -1419,7 +1419,6 @@ static bool atpx_present(void)
>>   				return true;
>>   			}
>>   		}
>> -		pci_dev_put(pdev);
>>   	}
>>   	return false;
>>   }
> 
> Oh this looks really like a bug, even if this isn't the root cause.
> 
> Care to submit a proper patch?
> 
Hi Nicholas
I have disassembled the vmlinux by objdump -Sdl, but is it too big to
send as an attachment. Just take a look at the key information:
The file dmesg-5 show trace:
[   71.693210] general protection fault: 0000 [#1] SMP PTI
[   71.693215] CPU: 1 PID: 184 Comm: irq/128-pciehp Not tainted 5.5.0-rc1 #2
[   71.693217] Hardware name: Dell Inc. XPS 13 9370/09DKKT, BIOS 1.11.1 
07/11/2019
[   71.693223] RIP: 0010:pci_remove_bus_device+0x9a/0x100
[   71.693250] Call Trace:
[   71.693256]  pci_remove_bus_device+0x31/0x100
[   71.693259]  pci_remove_bus_device+0x31/0x100
[   71.693261]  pci_stop_and_remove_bus_device+0x1b/0x20

.....

[   71.709515] Call Trace:
[   71.709527]  kobject_put+0xfc/0x1c0
[   71.709536]  __device_link_free_srcu+0x51/0x60
[   71.709538]  srcu_invoke_callbacks+0xd2/0x170
[   71.709547]  process_one_work+0x1ec/0x3a0
[   71.709550]  worker_thread+0x4d/0x400
[   71.709554]  kthread+0x104/0x140
[   71.709557]  ? process_one_work+0x3a0/0x3a0
[   71.709559]  ? kthread_park+0x90/0x90
[   71.709563]  ret_from_fork+0x35/0x40

the disassembling file
ffffffff814eea30 <pci_remove_bus_device>:
pci_remove_bus_device():
/home/higon/linux/drivers/pci/remove.c:86

static void pci_remove_bus_device(struct pci_dev *dev)
{
ffffffff814eea30:	e8 bb 2e 51 00       	callq  ffffffff81a018f0 <__fentry__>
ffffffff814eea35:	41 55                	push   %r13
ffffffff814eea37:	41 54                	push   %r12
ffffffff814eea39:	55                   	push   %rbp
ffffffff814eea3a:	53                   	push   %rbx
ffffffff814eea3b:	48 89 fd             	mov    %rdi,%rbp
/home/higon/linux/drivers/pci/remove.c:87
	struct pci_bus *bus = dev->subordinate;
ffffffff814eea3e:	4c 8b 6f 18          	mov    0x18(%rdi),%r13
/home/higon/linux/drivers/pci/remove.c:90
	struct pci_dev *child, *tmp;

........

static inline void list_del(struct list_head *entry)
{
	__list_del_entry(entry);
	entry->next = LIST_POISON1;
ffffffff814eeac0:	48 b8 00 01 00 00 00 	movabs $0xdead000000000100,%rax
ffffffff814eeac7:	00 ad de
ffffffff814eeaca:	48 89 45 00          	mov    %rax,0x0(%rbp)
/home/higon/linux/./include/linux/list.h:141
	entry->prev = LIST_POISON2;
ffffffff814eeace:	48 b8 22 01 00 00 00 	movabs $0xdead000000000122,%rax
ffffffff814eead5:	00 ad de
ffffffff814eead8:	48 89 45 08          	mov    %rax,0x8(%rbp)
pci_destroy_dev():
/home/higon/linux/drivers/pci/remove.c:39
ffffffff814eeadc:	e8 2f 23 bf ff       	callq  ffffffff810e0e10 <up_write>
/home/higon/linux/drivers/pci/remove.c:41

In the file source/sound/pci/hda/hda_intel.c, atpx_present can not call
pci_dev_put(pdev) because the reference count for pdev is always
decremented if it is not NULL after call pci_get_class.
Resource of pdev will be released when the reference count for it
decremented to zero. Hotplug service dereferences NULL pointer when 
remove the device from the device lists.

static bool atpx_present(void)
{
	struct pci_dev *pdev = NULL;
	acpi_handle dhandle, atpx_handle;
	acpi_status status;

	while ((pdev = pci_get_class(PCI_BASE_CLASS_DISPLAY << 16, pdev)) != 
NULL) {
		dhandle = ACPI_HANDLE(&pdev->dev);
		if (dhandle) {
			status = acpi_get_handle(dhandle, "ATPX", &atpx_handle);
			if (!ACPI_FAILURE(status)) {
				pci_dev_put(pdev);
				return true;
			}
		}
		pci_dev_put(pdev);
	}
	return false;
}
struct pci_dev *pci_get_class(unsigned int class, struct pci_dev *from)
{
	struct pci_device_id id = {
		.vendor = PCI_ANY_ID,
		.device = PCI_ANY_ID,
		.subvendor = PCI_ANY_ID,
		.subdevice = PCI_ANY_ID,
		.class_mask = PCI_ANY_ID,
		.class = class,
	};

	return pci_get_dev_by_id(&id, from);
}
static struct pci_dev *pci_get_dev_by_id(const struct pci_device_id *id,
					 struct pci_dev *from)
{
	struct device *dev;
	struct device *dev_start = NULL;
	struct pci_dev *pdev = NULL;

	WARN_ON(in_interrupt());
	if (from)
		dev_start = &from->dev;
	dev = bus_find_device(&pci_bus_type, dev_start, (void *)id,
			      match_pci_dev_by_id);
	if (dev)
		pdev = to_pci_dev(dev);
	pci_dev_put(from);
	return pdev;
}

Thanks,
Jiasen Lin

> 
> thanks,
> 
> Takashi
> 

  reply index

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-09 12:34 Nicholas Johnson
2019-12-09 12:37 ` Pavel Machek
2019-12-09 13:07   ` Nicholas Johnson
2019-12-09 13:12 ` mika.westerberg
2019-12-09 13:29   ` Nicholas Johnson
2019-12-09 13:33   ` Nicholas Johnson
2019-12-10  7:28     ` mika.westerberg
2019-12-10 12:00       ` Nicholas Johnson
2019-12-10 12:29         ` Lukas Wunner
2019-12-10 12:46           ` Takashi Iwai
2019-12-11  7:33             ` Jiasen Lin [this message]
2019-12-10 12:52           ` Nicholas Johnson
2019-12-10 12:34         ` mika.westerberg
2019-12-10 13:39 ` [PATCH] ALSA: hda/hdmi - Fix duplicate unref of pci_dev Lukas Wunner
2019-12-10 13:41   ` Takashi Iwai
2019-12-10 13:47   ` Nicholas Johnson
2019-12-10 13:50     ` Takashi Iwai
2019-12-10 15:34   ` Deucher, Alexander
2019-12-10 15:46     ` Lukas Wunner
2019-12-10 15:53       ` Deucher, Alexander
2019-12-10 16:10         ` Takashi Iwai
2019-12-10 16:51           ` Deucher, Alexander
2019-12-10 16:13         ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50eb6e32-296a-0086-f172-60103b89b5a5@hygon.cn \
    --to=linjiasen@hygon.cn \
    --cc=alexander.deucher@amd.com \
    --cc=helgaas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=mika.westerberg@linux.intel.com \
    --cc=nicholas.johnson-opensource@outlook.com.au \
    --cc=tiwai@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git