From: Bjorn Helgaas <helgaas@kernel.org>
To: "Robert Święcki" <robert@swiecki.net>,
"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
Cc: linux-i2c@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
linux-pci@vger.kernel.org,
"Rafael J. Wysocki" <rafael@kernel.org>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Crashes in 5.15-git in i2c code
Date: Mon, 8 Nov 2021 12:58:23 -0600 [thread overview]
Message-ID: <20211108185823.GA1101310@bhelgaas> (raw)
In-Reply-To: <CAP145phj7jEy6tkdFMdW-rzPprMTUckaaSrtrVysE-u+S+=Lcg@mail.gmail.com>
[+cc Uwe, Rafael, linux-pm, linux-pci, linux-kernel, beginning of
thread: https://lore.kernel.org/linux-i2c/CAP145pgdrdiMAT7=-iB1DMgA7t_bMqTcJL4N0=6u8kNY3EU0dw@mail.gmail.com/T/#t]
On Mon, Nov 08, 2021 at 05:34:14PM +0100, Robert Święcki wrote:
> > I'm daily-driving the linux from Linus' git (recompiling every day or
> > two), and yesterday it stopped booting. Below is the dmesg from
> > pstore.
> > ...
>
> This introduced the bug: 0c5c62ddf88c34bc83b66e4ac9beb2bb0e1887d4
> https://github.com/torvalds/linux/commit/0c5c62ddf88c34bc83b66e4ac9beb2bb0e1887d4
Thank you very much for the debugging and this report! This report is
for i2c, but the problem will affect many drivers.
> > <1>[ 1.431369][ T447] BUG: kernel NULL pointer dereference,
> > address: 0000000000000540
> > <1>[ 1.431371][ T447] #PF: supervisor read access in kernel mode
> > <1>[ 1.431375][ T447] #PF: error_code(0x0000) - not-present page
> > <6>[ 1.431378][ T447] PGD 0 P4D 0
> > <4>[ 1.431384][ T447] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > <4>[ 1.431388][ T447] CPU: 12 PID: 447 Comm: systemd-udevd
> > Tainted: G E 5.15.0+ #91
> > <4>[ 1.431391][ T447] Hardware name: ASUS System Product Name/ROG
> > CROSSHAIR VIII FORMULA, BIOS 3801 07/30/2021
> > <4>[ 1.431392][ T447] RIP: 0010:i2c_dw_pci_resume+0x8/0x40
> > [i2c_designware_pci]
> > <4>[ 1.431399][ T447] Code: 00 00 00 00 66 66 2e 0f 1f 84 00 00 00
> > 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 53 48
> > 8b 5f 78 48 89 df <ff> 93 40 05 00 00 c6 83 c0 05 00 00 00 5b c3 66 66
> > 2e 0f 1f 84 00
> > <4>[ 1.431401][ T447] RSP: 0018:ffffb3e740a13ba8 EFLAGS: 00010286
> > <4>[ 1.431403][ T447] RAX: 0000000000000000 RBX: 0000000000000000
> > RCX: 0000000000000000
$ ./scripts/decodecode < oops
22: 53 push %rbx
23: 48 8b 5f 78 mov 0x78(%rdi),%rbx
27: 48 89 df mov %rbx,%rdi
2a:* ff 93 40 05 00 00 callq *0x540(%rbx) <-- trapping instruction
30: c6 83 c0 05 00 00 00 movb $0x0,0x5c0(%rbx)
37: 5b pop %rbx
38: c3 retq
static int i2c_dw_pci_resume(struct device *dev)
{
struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
int ret;
ret = i_dev->init(i_dev);
i_dev->suspended = false;
return ret;
So I think we're trying to call i_dev->init(), which is a NULL
pointer.
> > <4>[ 1.431422][ T447] pci_pm_runtime_resume+0xaa/0x100
> > <4>[ 1.431434][ T447] __rpm_callback+0x3c/0x100
> > <4>[ 1.431442][ T447] rpm_callback+0x54/0x80
> > <4>[ 1.431445][ T447] rpm_resume+0x410/0x700
> > <4>[ 1.431455][ T447] __pm_runtime_resume+0x45/0x80
> > <4>[ 1.431457][ T447] pci_device_probe+0xa2/0x140
> > <4>[ 1.431459][ T447] really_probe+0x1e4/0x400
> > <4>[ 1.431464][ T447] __driver_probe_device+0xf9/0x180
> > <4>[ 1.431466][ T447] driver_probe_device+0x19/0xc0
I think the problem here is that:
- really_probe() sets dev->driver
- local_pci_probe() calls pm_runtime_get_sync(), which leads to:
- pci_pm_runtime_resume(), which previously skipped the driver's
.runtime_resume() method when "pci_dev->driver" as NULL
- after 2a4d9408c9e8 ("PCI: Use to_pci_driver() instead of
pci_dev->driver") [1], it checks "dev->driver" instead of
"pci_dev->driver"
- dev->driver is non-NULL (set by really_probe() above), but at this
point pci_dev->driver used to be NULL because local_pci_probe()
didn't set it until after after calling pm_runtime_get_sync() (see
b5f9c644eb1b ("PCI: Remove struct pci_dev->driver") [2])
- because dev->driver is non-NULL, we call i2c_dw_pci_resume()
before i2c_dw_pci_probe(), so the driver init hasn't been done
Here's the call tree:
really_probe
dev->driver = drv; # <--
call_driver_probe
dev->bus->probe
pci_device_probe
__pci_device_probe
pci_call_probe
local_pci_probe
pm_runtime_get_sync
...
pci_pm_runtime_resume
- if (!pci_dev->driver) # 2a4d9408c9e8 ("PCI: Use to_pci_driver() instead of pci_dev->driver")
+ if (!to_pci_driver(dev->driver))
return 0
pm->runtime_resume
i2c_dw_pci_resume
i_dev->init() # <-- NULL ptr deref
- pci_dev->driver = pci_drv # b5f9c644eb1b ("PCI: Remove struct pci_dev->driver")
pci_drv->probe
i2c_dw_pci_probe
Note that we used to call pm_runtime_get_sync() (and the driver's
runtime_resume() method) in the window where dev->driver had been set
but pci_dev->driver had not. Since we got rid of pci_dev->driver
altogether, that window no longer exists, so we call the driver's
runtime_resume() when it isn't prepared for it.
[1] https://git.kernel.org/linus/2a4d9408c9e8
[2] https://git.kernel.org/linus/b5f9c644eb1b
next prev parent reply other threads:[~2021-11-08 18:58 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CAP145pgwt7svtDwcD=AStKTt_GSN-ZqPL2u74Y63TAY5ghAagQ@mail.gmail.com>
[not found] ` <CAP145pgrL-tOHrxsKwk_yzQihyk4TMFrgBb6zhNgC1i2wUTCeQ@mail.gmail.com>
2021-11-08 15:37 ` Fwd: Crashes in 5.15-git in i2c code Robert Święcki
2021-11-08 16:34 ` Robert Święcki
2021-11-08 18:58 ` Bjorn Helgaas [this message]
2021-11-08 19:09 ` Bjorn Helgaas
2021-11-08 21:22 ` [PATCH] pci: Don't call resume callback for nearly bound devices Uwe Kleine-König
2021-11-08 21:36 ` Robert Święcki
2021-11-09 0:00 ` Krzysztof Wilczyński
2021-11-09 2:56 ` Bjorn Helgaas
2021-11-09 6:42 ` Greg Kroah-Hartman
2021-11-09 6:59 ` Uwe Kleine-König
2021-11-09 12:42 ` Robert Święcki
2021-11-10 21:26 ` Bjorn Helgaas
2021-11-10 22:01 ` Robert Święcki
2021-11-09 17:18 ` Rafael J. Wysocki
2021-11-09 18:12 ` Bjorn Helgaas
2021-11-09 18:52 ` Rafael J. Wysocki
2021-11-09 18:58 ` Rafael J. Wysocki
2021-11-09 20:05 ` Bjorn Helgaas
2021-11-09 20:43 ` Uwe Kleine-König
2021-11-10 14:14 ` Bjorn Helgaas
2021-11-10 16:33 ` Robert Święcki
2021-11-10 16:48 ` Rafael J. Wysocki
2021-11-10 17:59 ` Bjorn Helgaas
2021-11-10 21:19 ` Bjorn Helgaas
2021-11-11 17:01 ` Bjorn Helgaas
2021-11-11 17:32 ` Robert Święcki
2021-11-11 18:09 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211108185823.GA1101310@bhelgaas \
--to=helgaas@kernel.org \
--cc=bhelgaas@google.com \
--cc=linux-i2c@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=robert@swiecki.net \
--cc=u.kleine-koenig@pengutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.