From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
To: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] driver core: platform: don't oops on unbound devices
Date: Sat, 12 Dec 2020 23:49:26 +0300
Message-ID: <CAA8EJpqwJKwYS=9o5Vtqwmi5qGd33woK_q4NO5h6mh-f3G+NtA@mail.gmail.com>
In-Reply-To: <20201212153929.yn47oz4fm37ysrry@pengutronix.de>
Hello,
On Sat, 12 Dec 2020 at 18:39, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello,
>
> On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > > Platform code stopped checking if the device is bound to the actual
> > > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > > _dev->driver is not NULL before calling remove/shutdown callbacks.
> > >
> > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > > ---
> > > drivers/base/platform.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > > index 0358dc3ea3ad..93f44e69b472 100644
> > > --- a/drivers/base/platform.c
> > > +++ b/drivers/base/platform.c
> > > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> > > struct platform_device *dev = to_platform_device(_dev);
> > > int ret = 0;
> > >
> > > - if (drv->remove)
> > > + if (_dev->driver && drv->remove)
> > > ret = drv->remove(dev);
> > > dev_pm_domain_detach(_dev, true);
> >
> > I don't object to this, but it always feels odd to be doing pointer math
> > on a NULL value, wait until the static-checkers get ahold of this and
> > you get crazy emails saying you are crashing the kernel (hint, they are
> > broken).
>
> I think you refer to the line
>
> struct platform_driver *drv = to_platform_driver(_dev->driver);
>
> which when _dev->driver is NULL results in drv being something really
> big?!
Yes. To avoid pointer math on a NULL value, I can move the check for
_dev->driver before the calculation of drv.
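A minimal userspace sketch of that ordering, not the actual v2 patch. The
struct definitions are simplified stand-ins for the kernel types, and
platform_shutdown_sketch(), demo_shutdown() and shutdown_calls are
hypothetical names used only for illustration:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct device_driver { const char *name; };
struct device { struct device_driver *driver; };
struct platform_device { struct device dev; };
struct platform_driver {
	int (*probe)(struct platform_device *);
	int (*remove)(struct platform_device *);
	void (*shutdown)(struct platform_device *);
	struct device_driver driver;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define to_platform_device(d) container_of(d, struct platform_device, dev)
#define to_platform_driver(d) container_of(d, struct platform_driver, driver)

/* Hypothetical counter so a caller can observe callback invocations. */
int shutdown_calls;

/* Hypothetical driver callback used for the demonstration. */
void demo_shutdown(struct platform_device *pdev)
{
	(void)pdev;
	shutdown_calls++;
}

/*
 * Check _dev->driver first, and only then derive drv from it, so that
 * container_of() is never applied to a NULL pointer.
 * Returns 1 if a shutdown callback ran, 0 otherwise.
 */
int platform_shutdown_sketch(struct device *_dev)
{
	struct platform_driver *drv;

	if (!_dev->driver)
		return 0;	/* unbound device: nothing to call */

	drv = to_platform_driver(_dev->driver);
	if (!drv->shutdown)
		return 0;

	drv->shutdown(to_platform_device(_dev));
	return 1;
}
```

Calling platform_shutdown_sketch() on a device whose driver pointer is NULL
returns early without touching drv; on a bound device it invokes the
driver's shutdown callback once.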
>
> According to my understanding platform_remove() shouldn't be called if
> the device isn't bound to a driver.
>
> > But, I don't see why this check is needed? If a driver is not bound to
> > a device, shouldn't this whole function just not be called? Or error
> > out at the top?
> >
> > Uwe, I'd really like your review/ack of this before taking it.
>
> So I agree and have the same question. So I wonder: @Dmitry, did you see
> a crash? When did it happen?
The crash happens in platform_shutdown(), which has been called for
unbound devices since commit 9c30921fe799 ("driver core: platform: use
bus_type functions").
I can include the crash trace in v2.
I added the check to platform_remove() as a safety measure. All current
calls to dev->bus->remove() in dd.c seem to happen only when
dev->driver is set, but I thought it would be a good check to have. I can
drop it if you'd like.
>
> For one of the bus types I changed recently
> (arch/powerpc/platforms/ps3/system-bus.c) the bus's shutdown function
> does:
>
> if (drv->shutdown)
> drv->shutdown(dev);
> else if (drv->remove) {
> dev_dbg(&dev->core, ...
> drv->remove(dev);
> } ...
>
> but for the platform bus I'm not aware that remove is used in absence of
> a shutdown callback.
>
> Relevant callers of bus->remove are all in drivers/base/dd.c, and for
> all of them dev->driver should be set.
>
> I look forward to an explanation about why this patch was created.
Here is the explanation: the 3d6a000.gmu device is not bound to a
driver, which causes a crash in platform_shutdown() during reboot.
[ 57.832972] platform 3d6a000.gmu: shutdown
[ 57.837778] Unable to handle kernel paging request at virtual address ffffffffffffffe8
[ 57.846391] Mem abort info:
[ 57.849704] ESR = 0x96000004
[ 57.853286] EC = 0x25: DABT (current EL), IL = 32 bits
[ 57.859177] SET = 0, FnV = 0
[ 57.862751] EA = 0, S1PTW = 0
[ 57.866415] Data abort info:
[ 57.869801] ISV = 0, ISS = 0x00000004
[ 57.874171] CM = 0, WnR = 0
[ 57.877634] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a1646000
[ 57.884937] [ffffffffffffffe8] pgd=0000000000000000, p4d=0000000000000000
[ 57.892323] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 57.898471] Modules linked in:
[ 57.902022] CPU: 7 PID: 387 Comm: reboot Tainted: G W 5.10.0-rc7-next-20201211-13328-gb9e15b9c1940-dirty #1270
[ 57.914043] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[ 57.921340] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 57.927930] pc : platform_shutdown+0x8/0x34
[ 57.932661] lr : device_shutdown+0x158/0x32c
[ 57.937483] sp : ffff800010773c70
[ 57.941319] x29: ffff800010773c70 x28: ffff14f80c41c600
[ 57.947208] x27: 0000000000000000 x26: ffff14f80129c490
[ 57.953100] x25: ffffaa6264ece398 x24: 0000000000000008
[ 57.958990] x23: ffffaa62655be030 x22: ffffaa6265671600
[ 57.964875] x21: ffff14f80122b010 x20: ffff14f80129c410
[ 57.970765] x19: ffff14f80129c418 x18: 0000000000000030
[ 57.976665] x17: 0000000000000000 x16: 0000000000000001
[ 57.982590] x15: 0000000000000004 x14: 000000000000019f
[ 57.988478] x13: 0000000000000000 x12: 0000000000000000
[ 57.994394] x11: 0000000000000000 x10: 00000000000009b0
[ 58.000297] x9 : ffff800010773920 x8 : ffff14f80c41d010
[ 58.006205] x7 : ffff14f976ff59c0 x6 : 0000000000000192
[ 58.012112] x5 : 0000000000000000 x4 : ffff14f976feb920
[ 58.018023] x3 : ffff14f976ff2878 x2 : 0000000000000000
[ 58.023940] x1 : 0000000000000000 x0 : ffff14f80129c410
[ 58.029871] Call trace:
[ 58.032849] platform_shutdown+0x8/0x34
[ 58.037256] kernel_restart+0x40/0xa0
[ 58.041485] __do_sys_reboot+0x228/0x250
[ 58.045975] __arm64_sys_reboot+0x28/0x34
[ 58.050571] el0_svc_common+0x7c/0x1a0
[ 58.054886] do_el0_svc+0x28/0x94
[ 58.058754] el0_svc+0x14/0x20
[ 58.062371] el0_sync_handler+0x1a4/0x1b0
[ 58.066951] el0_sync+0x174/0x180
[ 58.070822] Code: d503201f d503201f d503245f f9403401 (f85e8021)
[ 58.077532] ---[ end trace 26b521c0dca4c8d0 ]---
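The fault address in the trace is consistent with to_platform_driver()
being applied to a NULL _dev->driver: ffffffffffffffe8 is -24, and x1 is 0
in the register dump. A rough userspace sketch of the arithmetic follows. It
assumes an LP64 target and the field order of struct platform_driver as I
understand it around v5.10 (probe, remove, shutdown, suspend, resume,
driver); platform_driver_layout, device_driver_stub and
shutdown_slot_address() are hypothetical names for illustration only:

```c
#include <stddef.h>
#include <stdint.h>

struct platform_device;

/* Placeholder for struct device_driver; only its position matters here. */
struct device_driver_stub { const char *name; };

/*
 * Assumed field order of struct platform_driver (around v5.10).
 * pm_message_t is simplified to int; this does not change the offsets
 * of the members, which are all pointers up to 'driver'.
 */
struct platform_driver_layout {
	int (*probe)(struct platform_device *);
	int (*remove)(struct platform_device *);
	void (*shutdown)(struct platform_device *);
	int (*suspend)(struct platform_device *, int state);
	int (*resume)(struct platform_device *);
	struct device_driver_stub driver;
};

/*
 * Address from which drv->shutdown would be loaded when drv is derived
 * via container_of() from a driver pointer with value driver_addr.
 * Done in unsigned integer arithmetic to avoid real pointer math on NULL.
 */
uintptr_t shutdown_slot_address(uintptr_t driver_addr)
{
	return driver_addr
		- offsetof(struct platform_driver_layout, driver)
		+ offsetof(struct platform_driver_layout, shutdown);
}
```

With 8-byte pointers, driver sits at offset 40 and shutdown at offset 16,
so shutdown_slot_address(0) wraps around to 0xffffffffffffffe8, the address
reported in the oops.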
--
With best wishes
Dmitry