* [PATCH] driver core: platform: don't oops on unbound devices
From: Dmitry Baryshkov @ 2020-12-12  1:14 UTC
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Uwe Kleine-König; +Cc: linux-kernel

Platform code stopped checking if the device is bound to the actual
platform driver, thus calling non-existing drv->shutdown(). Verify that
_dev->driver is not NULL before calling remove/shutdown callbacks.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
---
 drivers/base/platform.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 0358dc3ea3ad..93f44e69b472 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
 	struct platform_device *dev = to_platform_device(_dev);
 	int ret = 0;
 
-	if (drv->remove)
+	if (_dev->driver && drv->remove)
 		ret = drv->remove(dev);
 	dev_pm_domain_detach(_dev, true);
 
@@ -1354,7 +1354,7 @@ static void platform_shutdown(struct device *_dev)
 	struct platform_driver *drv = to_platform_driver(_dev->driver);
 	struct platform_device *dev = to_platform_device(_dev);
 
-	if (drv->shutdown)
+	if (_dev->driver && drv->shutdown)
 		drv->shutdown(dev);
 }
 
-- 
2.29.2



* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Greg Kroah-Hartman @ 2020-12-12 11:41 UTC
  To: Dmitry Baryshkov; +Cc: Rafael J. Wysocki, Uwe Kleine-König, linux-kernel

On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> Platform code stopped checking if the device is bound to the actual
> platform driver, thus calling non-existing drv->shutdown(). Verify that
> _dev->driver is not NULL before calling remove/shutdown callbacks.
> 
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> ---
>  drivers/base/platform.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 0358dc3ea3ad..93f44e69b472 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
>  	struct platform_device *dev = to_platform_device(_dev);
>  	int ret = 0;
>  
> -	if (drv->remove)
> +	if (_dev->driver && drv->remove)
>  		ret = drv->remove(dev);
>  	dev_pm_domain_detach(_dev, true);

I don't object to this, but it always feels odd to be doing pointer math
on a NULL value, wait until the static-checkers get ahold of this and
you get crazy emails saying you are crashing the kernel (hint, they are
broken).

But, I don't see why this check is needed?  If a driver is not bound to
a device, shouldn't this whole function just not be called?  Or error
out at the top?  

Uwe, I'd really like your review/ack of this before taking it.

thanks,

greg k-h


* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Uwe Kleine-König @ 2020-12-12 15:39 UTC
  To: Dmitry Baryshkov, Greg Kroah-Hartman; +Cc: Rafael J. Wysocki, linux-kernel

Hello,

On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > Platform code stopped checking if the device is bound to the actual
> > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > _dev->driver is not NULL before calling remove/shutdown callbacks.
> > 
> > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > ---
> >  drivers/base/platform.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 0358dc3ea3ad..93f44e69b472 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> >  	struct platform_device *dev = to_platform_device(_dev);
> >  	int ret = 0;
> >  
> > -	if (drv->remove)
> > +	if (_dev->driver && drv->remove)
> >  		ret = drv->remove(dev);
> >  	dev_pm_domain_detach(_dev, true);
> 
> I don't object to this, but it always feels odd to be doing pointer math
> on a NULL value, wait until the static-checkers get ahold of this and
> you get crazy emails saying you are crashing the kernel (hint, they are
> broken).

I think you refer to the line

	struct platform_driver *drv = to_platform_driver(_dev->driver);

which when _dev->driver is NULL results in drv being something really
big?!

According to my understanding platform_remove() shouldn't be called if
the device isn't bound to a driver.

> But, I don't see why this check is needed?  If a driver is not bound to
> a device, shouldn't this whole function just not be called?  Or error
> out at the top?  
> 
> Uwe, I'd really like your review/ack of this before taking it.

So I agree and have the same question. So I wonder: @Dmitry, did you see
a crash? When did it happen?

For one of the bus types I changed recently
(arch/powerpc/platforms/ps3/system-bus.c) the bus's shutdown function
does:

	if (drv->shutdown)
		drv->shutdown(dev);
	else if (drv->remove) {
		dev_dbg(&dev->core, ...
		drv->remove(dev);
	} ...

but for the platform bus I'm not aware that remove is used in absence of
a shutdown callback.

Relevant callers of bus->remove are all in drivers/base/dd.c, and for
all of them dev->driver should be set.

I look forward to an explanation of why this patch was created.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |


* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Dmitry Baryshkov @ 2020-12-12 20:49 UTC
  To: Uwe Kleine-König; +Cc: Greg Kroah-Hartman, Rafael J. Wysocki, open list

Hello,

On Sat, 12 Dec 2020 at 18:39, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello,
>
> On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > > Platform code stopped checking if the device is bound to the actual
> > > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > > _dev->driver is not NULL before calling remove/shutdown callbacks.
> > >
> > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > > ---
> > >  drivers/base/platform.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > > index 0358dc3ea3ad..93f44e69b472 100644
> > > --- a/drivers/base/platform.c
> > > +++ b/drivers/base/platform.c
> > > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> > >     struct platform_device *dev = to_platform_device(_dev);
> > >     int ret = 0;
> > >
> > > -   if (drv->remove)
> > > +   if (_dev->driver && drv->remove)
> > >             ret = drv->remove(dev);
> > >     dev_pm_domain_detach(_dev, true);
> >
> > I don't object to this, but it always feels odd to be doing pointer math
> > on a NULL value, wait until the static-checkers get ahold of this and
> > you get crazy emails saying you are crashing the kernel (hint, they are
> > broken).
>
> I think you refer to the line
>
>         struct platform_driver *drv = to_platform_driver(_dev->driver);
>
> which when _dev->driver is NULL results in drv being something really
> big?!

Yes. To avoid pointer math on a NULL value, I can move the check for
_dev->driver before the calculation of drv.
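
A minimal user-space sketch of that reordering, for illustration only.
The sk_* types and the sk_to_platform_driver() macro are hypothetical
stand-ins for the kernel structures, and the callback signature is
simplified. The only point is that the container conversion happens
after the NULL check, not before it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver / platform_driver. */
struct sk_driver { const char *name; };

struct sk_platform_driver {
	void (*shutdown)(int *counter);
	struct sk_driver driver;	/* embedded member */
};

struct sk_device {
	struct sk_driver *driver;	/* NULL while the device is unbound */
};

/* Same idea as container_of(): step back from the member to its container. */
#define sk_to_platform_driver(d) \
	((struct sk_platform_driver *)((char *)(d) - \
		offsetof(struct sk_platform_driver, driver)))

/* Returns 1 if a driver callback ran, 0 otherwise. */
static int sk_platform_shutdown(struct sk_device *dev, int *counter)
{
	struct sk_platform_driver *drv;

	assert(dev != NULL);
	if (!dev->driver)
		return 0;	/* unbound: never compute drv from NULL */

	drv = sk_to_platform_driver(dev->driver);
	if (!drv->shutdown)
		return 0;
	drv->shutdown(counter);
	return 1;
}

/* Example callback that just counts invocations. */
static void sk_count_shutdown(int *counter)
{
	(*counter)++;
}
```

With the early return, the conversion macro never sees a NULL pointer,
which should also leave static checkers nothing to flag.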

>
> According to my understanding platform_remove() shouldn't be called if
> the device isn't bound to a driver.
>
> > But, I don't see why this check is needed?  If a driver is not bound to
> > a device, shouldn't this whole function just not be called?  Or error
> > out at the top?
> >
> > Uwe, I'd really like your review/ack of this before taking it.
>
> So I agree and have the same question. So I wonder: @Dmitry, did you see
> a crash? When did it happen?

The crash happens in the platform_shutdown() function, which gets
called for unbound devices after commit 9c30921fe ("driver core:
platform: use bus_type functions").
I can include the crash trace in v2.
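
A simplified user-space model of why a bus-level shutdown callback can
be reached for an unbound device. It assumes that the driver core
prefers the bus's shutdown callback whenever one exists and checks
dev->driver only on the driver-level fallback path; that is an
assumption about device_shutdown(), which this thread does not quote.
All mock_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct mock_device;

struct mock_bus {
	void (*shutdown)(struct mock_device *dev);
};

struct mock_driver {
	void (*shutdown)(struct mock_device *dev);
};

struct mock_device {
	struct mock_bus *bus;
	struct mock_driver *driver;	/* NULL while unbound */
	int bus_calls;			/* times the bus callback ran */
	int bus_calls_unbound;		/* ... of which with driver == NULL */
};

/* Bus-level callback: only records how it was called. */
static void mock_bus_shutdown(struct mock_device *dev)
{
	dev->bus_calls++;
	if (!dev->driver)
		dev->bus_calls_unbound++;
}

/* Assumed dispatch order: bus callback first, driver callback as fallback. */
static void mock_device_shutdown(struct mock_device *dev)
{
	assert(dev != NULL);
	if (dev->bus && dev->bus->shutdown)
		dev->bus->shutdown(dev);
	else if (dev->driver && dev->driver->shutdown)
		dev->driver->shutdown(dev);
}
```

In this model the dev->driver test sits only on the fallback branch, so
once a bus provides its own shutdown callback, that callback has to
cope with driver == NULL itself.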

I added the check to platform_remove() as a safety measure. All current
calls to dev->bus->remove() in dd.c seem to happen only when
dev->driver is set, but I thought it might still be a good check. I can
drop it if you'd like.


>
> For one of the bus types I changed recently
> (arch/powerpc/platforms/ps3/system-bus.c) the bus's shutdown function
> does:
>
>         if (drv->shutdown)
>                 drv->shutdown(dev);
>         else if (drv->remove) {
>                 dev_dbg(&dev->core, ...
>                 drv->remove(dev);
>         } ...
>
> but for the platform bus I'm not aware that remove is used in absence of
> a shutdown callback.
>
> Relevant callers of bus->remove are all in drivers/base/dd.c, and for
> all of them dev->driver should be set.
>
> I look forward to an explanation of why this patch was created.

Here is an explanation: the 3d6a000.gmu device is not bound to a
driver, which causes a crash during reboot.

[   57.832972] platform 3d6a000.gmu: shutdown
[   57.837778] Unable to handle kernel paging request at virtual address ffffffffffffffe8
[   57.846391] Mem abort info:
[   57.849704]   ESR = 0x96000004
[   57.853286]   EC = 0x25: DABT (current EL), IL = 32 bits
[   57.859177]   SET = 0, FnV = 0
[   57.862751]   EA = 0, S1PTW = 0
[   57.866415] Data abort info:
[   57.869801]   ISV = 0, ISS = 0x00000004
[   57.874171]   CM = 0, WnR = 0
[   57.877634] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a1646000
[   57.884937] [ffffffffffffffe8] pgd=0000000000000000, p4d=0000000000000000
[   57.892323] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   57.898471] Modules linked in:
[   57.902022] CPU: 7 PID: 387 Comm: reboot Tainted: G        W 5.10.0-rc7-next-20201211-13328-gb9e15b9c1940-dirty #1270
[   57.914043] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[   57.921340] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[   57.927930] pc : platform_shutdown+0x8/0x34
[   57.932661] lr : device_shutdown+0x158/0x32c
[   57.937483] sp : ffff800010773c70
[   57.941319] x29: ffff800010773c70 x28: ffff14f80c41c600
[   57.947208] x27: 0000000000000000 x26: ffff14f80129c490
[   57.953100] x25: ffffaa6264ece398 x24: 0000000000000008
[   57.958990] x23: ffffaa62655be030 x22: ffffaa6265671600
[   57.964875] x21: ffff14f80122b010 x20: ffff14f80129c410
[   57.970765] x19: ffff14f80129c418 x18: 0000000000000030
[   57.976665] x17: 0000000000000000 x16: 0000000000000001
[   57.982590] x15: 0000000000000004 x14: 000000000000019f
[   57.988478] x13: 0000000000000000 x12: 0000000000000000
[   57.994394] x11: 0000000000000000 x10: 00000000000009b0
[   58.000297] x9 : ffff800010773920 x8 : ffff14f80c41d010
[   58.006205] x7 : ffff14f976ff59c0 x6 : 0000000000000192
[   58.012112] x5 : 0000000000000000 x4 : ffff14f976feb920
[   58.018023] x3 : ffff14f976ff2878 x2 : 0000000000000000
[   58.023940] x1 : 0000000000000000 x0 : ffff14f80129c410
[   58.029871] Call trace:
[   58.032849]  platform_shutdown+0x8/0x34
[   58.037256]  kernel_restart+0x40/0xa0
[   58.041485]  __do_sys_reboot+0x228/0x250
[   58.045975]  __arm64_sys_reboot+0x28/0x34
[   58.050571]  el0_svc_common+0x7c/0x1a0
[   58.054886]  do_el0_svc+0x28/0x94
[   58.058754]  el0_svc+0x14/0x20
[   58.062371]  el0_sync_handler+0x1a4/0x1b0
[   58.066951]  el0_sync+0x174/0x180
[   58.070822] Code: d503201f d503201f d503245f f9403401 (f85e8021)
[   58.077532] ---[ end trace 26b521c0dca4c8d0 ]---
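
The faulting address is consistent with a small negative offset from a
NULL pointer (x1 is zero in the register dump). The sketch below
reproduces the arithmetic under an assumed layout: five pointer-sized
callbacks ahead of the embedded driver member on a 64-bit target. That
layout is an assumption about the tree under test, not something stated
in this thread, and the struct name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout; each callback is modelled as one 64-bit slot. */
struct assumed_platform_driver {
	uint64_t probe;
	uint64_t remove;
	uint64_t shutdown;
	uint64_t suspend;
	uint64_t resume;
	uint64_t driver;	/* first word of the embedded driver struct */
};

/* Sanity check of the offsets the arithmetic below relies on. */
static void check_assumed_offsets(void)
{
	assert(offsetof(struct assumed_platform_driver, shutdown) == 16);
	assert(offsetof(struct assumed_platform_driver, driver) == 40);
}

/*
 * Address that would be read for drv->shutdown, given the value of
 * _dev->driver. Uses 64-bit wraparound arithmetic instead of real
 * pointer math.
 */
static uint64_t shutdown_slot_address(uint64_t driver_ptr)
{
	uint64_t drv = driver_ptr -
		offsetof(struct assumed_platform_driver, driver);

	return drv + offsetof(struct assumed_platform_driver, shutdown);
}
```

With driver_ptr equal to 0 this gives 0 - 40 + 16 = -24, which as an
unsigned 64-bit value is ffffffffffffffe8, the address in the report.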


--
With best wishes
Dmitry


* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Uwe Kleine-König @ 2020-12-12 21:09 UTC
  To: Dmitry Baryshkov; +Cc: Greg Kroah-Hartman, Rafael J. Wysocki, open list

Hello Dmitry,

On Sat, Dec 12, 2020 at 11:49:26PM +0300, Dmitry Baryshkov wrote:
> On Sat, 12 Dec 2020 at 18:39, Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> > > On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > > > Platform code stopped checking if the device is bound to the actual
> > > > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > > > _dev->driver is not NULL before calling remove/shutdown callbacks.
> > > >
> > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > > > ---
> > > >  drivers/base/platform.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > > > index 0358dc3ea3ad..93f44e69b472 100644
> > > > --- a/drivers/base/platform.c
> > > > +++ b/drivers/base/platform.c
> > > > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> > > >     struct platform_device *dev = to_platform_device(_dev);
> > > >     int ret = 0;
> > > >
> > > > -   if (drv->remove)
> > > > +   if (_dev->driver && drv->remove)
> > > >             ret = drv->remove(dev);
> > > >     dev_pm_domain_detach(_dev, true);
> > >
> > > I don't object to this, but it always feels odd to be doing pointer math
> > > on a NULL value, wait until the static-checkers get ahold of this and
> > > you get crazy emails saying you are crashing the kernel (hint, they are
> > > broken).
> >
> > I think you refer to the line
> >
> >         struct platform_driver *drv = to_platform_driver(_dev->driver);
> >
> > which when _dev->driver is NULL results in drv being something really
> > big?!
> 
> Yes. To remove pointer math on NULL value I can move the check for
> _dev->driver before calculating drv.

Yeah, that would be good.

> > According to my understanding platform_remove() shouldn't be called if
> > the device isn't bound to a driver.
> >
> > > But, I don't see why this check is needed?  If a driver is not bound to
> > > a device, shouldn't this whole function just not be called?  Or error
> > > out at the top?
> > >
> > > Uwe, I'd really like your review/ack of this before taking it.
> >
> > So I agree and have the same question. So I wonder: @Dmitry, did you see
> > a crash? When did it happen?
> 
> The crash happens in the platform_shutdown() function, which gets
> called for unbound devices after commit 9c30921fe ("driver core:
> platform: use bus_type functions").
> I can include crash trace into v2.

Ah, now I understand. I didn't look too closely at your patch, only at
what Greg quoted. So you added a check to platform_remove (which should
be unnecessary) and to platform_shutdown (where I agree the check is
necessary).

> I added a check to platform_remove() as a safety measure. All current
> calls for dev->bus->remove() in dd.c seem to happen only when
> dev->driver is set, but I thought that it might be a good check. I can
> drop it if you'd like.

Yes, I'd like you to drop this. .remove isn't called for devices without
drivers.

Best regards and thanks for cleaning up after me,
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
