* [PATCH] driver core: platform: don't oops on unbound devices
From: Dmitry Baryshkov @ 2020-12-12  1:14 UTC
To: Greg Kroah-Hartman, Rafael J. Wysocki, Uwe Kleine-König; +Cc: linux-kernel

Platform code stopped checking if the device is bound to the actual
platform driver, thus calling non-existing drv->shutdown(). Verify that
_dev->driver is not NULL before calling remove/shutdown callbacks.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
---
 drivers/base/platform.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 0358dc3ea3ad..93f44e69b472 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
 	struct platform_device *dev = to_platform_device(_dev);
 	int ret = 0;

-	if (drv->remove)
+	if (_dev->driver && drv->remove)
 		ret = drv->remove(dev);
 	dev_pm_domain_detach(_dev, true);

@@ -1354,7 +1354,7 @@ static void platform_shutdown(struct device *_dev)
 	struct platform_driver *drv = to_platform_driver(_dev->driver);
 	struct platform_device *dev = to_platform_device(_dev);

-	if (drv->shutdown)
+	if (_dev->driver && drv->shutdown)
 		drv->shutdown(dev);
 }

--
2.29.2

* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Greg Kroah-Hartman @ 2020-12-12 11:41 UTC
To: Dmitry Baryshkov; +Cc: Rafael J. Wysocki, Uwe Kleine-König, linux-kernel

On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> Platform code stopped checking if the device is bound to the actual
> platform driver, thus calling non-existing drv->shutdown(). Verify that
> _dev->driver is not NULL before calling remove/shutdown callbacks.
>
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> ---
>  drivers/base/platform.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 0358dc3ea3ad..93f44e69b472 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
>  	struct platform_device *dev = to_platform_device(_dev);
>  	int ret = 0;
>
> -	if (drv->remove)
> +	if (_dev->driver && drv->remove)
>  		ret = drv->remove(dev);
>  	dev_pm_domain_detach(_dev, true);

I don't object to this, but it always feels odd to be doing pointer math
on a NULL value, wait until the static-checkers get ahold of this and
you get crazy emails saying you are crashing the kernel (hint, they are
broken).

But, I don't see why this check is needed?  If a driver is not bound to
a device, shouldn't this whole function just not be called?  Or error
out at the top?

Uwe, I'd really like your review/ack of this before taking it.

thanks,

greg k-h

* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Uwe Kleine-König @ 2020-12-12 15:39 UTC
To: Dmitry Baryshkov, Greg Kroah-Hartman; +Cc: Rafael J. Wysocki, linux-kernel

Hello,

On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > Platform code stopped checking if the device is bound to the actual
> > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > _dev->driver is not NULL before calling remove/shutdown callbacks.
> >
> > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > ---
> >  drivers/base/platform.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 0358dc3ea3ad..93f44e69b472 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> >  	struct platform_device *dev = to_platform_device(_dev);
> >  	int ret = 0;
> >
> > -	if (drv->remove)
> > +	if (_dev->driver && drv->remove)
> >  		ret = drv->remove(dev);
> >  	dev_pm_domain_detach(_dev, true);
>
> I don't object to this, but it always feels odd to be doing pointer math
> on a NULL value, wait until the static-checkers get ahold of this and
> you get crazy emails saying you are crashing the kernel (hint, they are
> broken).

I think you refer to the line

	struct platform_driver *drv = to_platform_driver(_dev->driver);

which when _dev->driver is NULL results in drv being something really
big?!

According to my understanding platform_remove() shouldn't be called if
the device isn't bound to a driver.

> But, I don't see why this check is needed?  If a driver is not bound to
> a device, shouldn't this whole function just not be called?  Or error
> out at the top?
>
> Uwe, I'd really like your review/ack of this before taking it.

So I agree and have the same question. So I wonder: @Dmitry, did you see
a crash? When did it happen?

For one of the bus types I changed recently
(arch/powerpc/platforms/ps3/system-bus.c) the bus's shutdown function
does:

	if (drv->shutdown)
		drv->shutdown(dev);
	else if (drv->remove) {
		dev_dbg(&dev->core, ...
		drv->remove(dev);
	} ...

but for the platform bus I'm not aware that remove is used in absence of
a shutdown callback.

Relevant callers of bus->remove are all in drivers/base/dd.c, and for
all of them dev->driver should be set.

I look forward to an explanation about why this patch was created.

Best regards
Uwe

--
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Dmitry Baryshkov @ 2020-12-12 20:49 UTC
To: Uwe Kleine-König; +Cc: Greg Kroah-Hartman, Rafael J. Wysocki, open list

Hello,

On Sat, 12 Dec 2020 at 18:39, Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello,
>
> On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > > Platform code stopped checking if the device is bound to the actual
> > > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > > _dev->driver is not NULL before calling remove/shutdown callbacks.
> > >
> > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > > ---
> > >  drivers/base/platform.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > > index 0358dc3ea3ad..93f44e69b472 100644
> > > --- a/drivers/base/platform.c
> > > +++ b/drivers/base/platform.c
> > > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> > >  	struct platform_device *dev = to_platform_device(_dev);
> > >  	int ret = 0;
> > >
> > > -	if (drv->remove)
> > > +	if (_dev->driver && drv->remove)
> > >  		ret = drv->remove(dev);
> > >  	dev_pm_domain_detach(_dev, true);
> >
> > I don't object to this, but it always feels odd to be doing pointer math
> > on a NULL value, wait until the static-checkers get ahold of this and
> > you get crazy emails saying you are crashing the kernel (hint, they are
> > broken).
>
> I think you refer to the line
>
>         struct platform_driver *drv = to_platform_driver(_dev->driver);
>
> which when _dev->driver is NULL results in drv being something really
> big?!

Yes. To remove pointer math on NULL value I can move the check for
_dev->driver before calculating drv.

>
> According to my understanding platform_remove() shouldn't be called if
> the device isn't bound to a driver.
>
> > But, I don't see why this check is needed?  If a driver is not bound to
> > a device, shouldn't this whole function just not be called?  Or error
> > out at the top?
> >
> > Uwe, I'd really like your review/ack of this before taking it.
>
> So I agree and have the same question. So I wonder: @Dmitry, did you see
> a crash? When did it happen?

The crash happens in the platform_shutdown() function, which gets
called for unbound devices after commit 9c30921fe ("driver core:
platform: use bus_type functions").
I can include crash trace into v2.

I added a check to platform_remove() as a safety measure. All current
calls for dev->bus->remove() in dd.c seem to happen only when
dev->driver is set, but I thought that it might be a good check. I can
drop it if you'd like.

>
> For one of the bus types I changed recently
> (arch/powerpc/platforms/ps3/system-bus.c) the bus's shutdown function
> does:
>
>         if (drv->shutdown)
>                 drv->shutdown(dev);
>         else if (drv->remove) {
>                 dev_dbg(&dev->core, ...
>                 drv->remove(dev);
>         } ...
>
> but for the platform bus I'm not aware that remove is used in absence of
> a shutdown callback.
>
> Relevant callers of bus->remove are all in drivers/base/dd.c, and for
> all of them dev->driver should be set.
>
> I look forward to an explanation about why this patch was created.

Here is an explanation: the 3d6a000.gmu device is not bound to a
driver, causing a crash during reboot.

[   57.832972] platform 3d6a000.gmu: shutdown
[   57.837778] Unable to handle kernel paging request at virtual address ffffffffffffffe8
[   57.846391] Mem abort info:
[   57.849704]   ESR = 0x96000004
[   57.853286]   EC = 0x25: DABT (current EL), IL = 32 bits
[   57.859177]   SET = 0, FnV = 0
[   57.862751]   EA = 0, S1PTW = 0
[   57.866415] Data abort info:
[   57.869801]   ISV = 0, ISS = 0x00000004
[   57.874171]   CM = 0, WnR = 0
[   57.877634] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000a1646000
[   57.884937] [ffffffffffffffe8] pgd=0000000000000000, p4d=0000000000000000
[   57.892323] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   57.898471] Modules linked in:
[   57.902022] CPU: 7 PID: 387 Comm: reboot Tainted: G W 5.10.0-rc7-next-20201211-13328-gb9e15b9c1940-dirty #1270
[   57.914043] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[   57.921340] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[   57.927930] pc : platform_shutdown+0x8/0x34
[   57.932661] lr : device_shutdown+0x158/0x32c
[   57.937483] sp : ffff800010773c70
[   57.941319] x29: ffff800010773c70 x28: ffff14f80c41c600
[   57.947208] x27: 0000000000000000 x26: ffff14f80129c490
[   57.953100] x25: ffffaa6264ece398 x24: 0000000000000008
[   57.958990] x23: ffffaa62655be030 x22: ffffaa6265671600
[   57.964875] x21: ffff14f80122b010 x20: ffff14f80129c410
[   57.970765] x19: ffff14f80129c418 x18: 0000000000000030
[   57.976665] x17: 0000000000000000 x16: 0000000000000001
[   57.982590] x15: 0000000000000004 x14: 000000000000019f
[   57.988478] x13: 0000000000000000 x12: 0000000000000000
[   57.994394] x11: 0000000000000000 x10: 00000000000009b0
[   58.000297] x9 : ffff800010773920 x8 : ffff14f80c41d010
[   58.006205] x7 : ffff14f976ff59c0 x6 : 0000000000000192
[   58.012112] x5 : 0000000000000000 x4 : ffff14f976feb920
[   58.018023] x3 : ffff14f976ff2878 x2 : 0000000000000000
[   58.023940] x1 : 0000000000000000 x0 : ffff14f80129c410
[   58.029871] Call trace:
[   58.032849]  platform_shutdown+0x8/0x34
[   58.037256]  kernel_restart+0x40/0xa0
[   58.041485]  __do_sys_reboot+0x228/0x250
[   58.045975]  __arm64_sys_reboot+0x28/0x34
[   58.050571]  el0_svc_common+0x7c/0x1a0
[   58.054886]  do_el0_svc+0x28/0x94
[   58.058754]  el0_svc+0x14/0x20
[   58.062371]  el0_sync_handler+0x1a4/0x1b0
[   58.066951]  el0_sync+0x174/0x180
[   58.070822] Code: d503201f d503201f d503245f f9403401 (f85e8021)
[   58.077532] ---[ end trace 26b521c0dca4c8d0 ]---

--
With best wishes
Dmitry
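
The faulting address in the trace above, ffffffffffffffe8, can be accounted for with the "pointer math on NULL" discussed earlier in the thread. The following is a user-space sketch only, not kernel code: the mock_* types and shutdown_load_address() are hypothetical names, and the member order of the mock struct is an assumption modeled on struct platform_driver (five callback pointers placed before the embedded driver member).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct device_driver; its contents do not matter here. */
struct mock_device_driver {
	const char *name;
};

/*
 * Hypothetical stand-in for struct platform_driver.
 * ASSUMPTION: five callback pointers precede the embedded driver member.
 */
struct mock_platform_driver {
	int (*probe)(void *pdev);
	int (*remove)(void *pdev);
	void (*shutdown)(void *pdev);
	int (*suspend)(void *pdev, int state);
	int (*resume)(void *pdev);
	struct mock_device_driver driver;
};

/*
 * Address from which drv->shutdown would be loaded for a given value of
 * _dev->driver. The container_of() step is done on integers so that the
 * NULL case can be evaluated without dereferencing anything.
 */
static uintptr_t shutdown_load_address(uintptr_t driver_ptr)
{
	uintptr_t drv;

	/* Sanity check on the mock layout: the callback sits before the member. */
	assert(offsetof(struct mock_platform_driver, shutdown) <
	       offsetof(struct mock_platform_driver, driver));

	/* container_of(): step back from the member to the enclosing struct. */
	drv = driver_ptr - offsetof(struct mock_platform_driver, driver);

	/* drv->shutdown: step forward to the callback slot. */
	return drv + offsetof(struct mock_platform_driver, shutdown);
}
```

With 8-byte pointers the driver member sits at offset 40 and shutdown at offset 16, so shutdown_load_address(0) is 0 - 40 + 16 = -24, which as a 64-bit address is 0xffffffffffffffe8. Under the stated layout assumption, that matches the address reported in the oops.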

* Re: [PATCH] driver core: platform: don't oops on unbound devices
From: Uwe Kleine-König @ 2020-12-12 21:09 UTC
To: Dmitry Baryshkov; +Cc: Greg Kroah-Hartman, Rafael J. Wysocki, open list

Hello Dmitry,

On Sat, Dec 12, 2020 at 11:49:26PM +0300, Dmitry Baryshkov wrote:
> On Sat, 12 Dec 2020 at 18:39, Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > On Sat, Dec 12, 2020 at 12:41:32PM +0100, Greg Kroah-Hartman wrote:
> > > On Sat, Dec 12, 2020 at 04:14:26AM +0300, Dmitry Baryshkov wrote:
> > > > Platform code stopped checking if the device is bound to the actual
> > > > platform driver, thus calling non-existing drv->shutdown(). Verify that
> > > > _dev->driver is not NULL before calling remove/shutdown callbacks.
> > > >
> > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > > > Fixes: 9c30921fe799 ("driver core: platform: use bus_type functions")
> > > > ---
> > > >  drivers/base/platform.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > > > index 0358dc3ea3ad..93f44e69b472 100644
> > > > --- a/drivers/base/platform.c
> > > > +++ b/drivers/base/platform.c
> > > > @@ -1342,7 +1342,7 @@ static int platform_remove(struct device *_dev)
> > > >  	struct platform_device *dev = to_platform_device(_dev);
> > > >  	int ret = 0;
> > > >
> > > > -	if (drv->remove)
> > > > +	if (_dev->driver && drv->remove)
> > > >  		ret = drv->remove(dev);
> > > >  	dev_pm_domain_detach(_dev, true);
> > >
> > > I don't object to this, but it always feels odd to be doing pointer math
> > > on a NULL value, wait until the static-checkers get ahold of this and
> > > you get crazy emails saying you are crashing the kernel (hint, they are
> > > broken).
> >
> > I think you refer to the line
> >
> >         struct platform_driver *drv = to_platform_driver(_dev->driver);
> >
> > which when _dev->driver is NULL results in drv being something really
> > big?!
>
> Yes. To remove pointer math on NULL value I can move the check for
> _dev->driver before calculating drv.

Yeah, that would be good.

> > According to my understanding platform_remove() shouldn't be called if
> > the device isn't bound to a driver.
> >
> > > But, I don't see why this check is needed?  If a driver is not bound to
> > > a device, shouldn't this whole function just not be called?  Or error
> > > out at the top?
> > >
> > > Uwe, I'd really like your review/ack of this before taking it.
> >
> > So I agree and have the same question. So I wonder: @Dmitry, did you see
> > a crash? When did it happen?
>
> The crash happens in the platform_shutdown() function, which gets
> called for unbound devices after commit 9c30921fe ("driver core:
> platform: use bus_type functions").
> I can include crash trace into v2.

Ah, now I understand. I didn't look too closely at your patch, only at
what Greg quoted. So you added a check to platform_remove (which should
be unnecessary) and to platform_shutdown (where I agree the check is
necessary).

> I added a check to platform_remove() as a safety measure. All current
> calls for dev->bus->remove() in dd.c seem to happen only when
> dev->driver is set, but I thought that it might be a good check. I can
> drop it if you'd like.

Yes, I'd like you to drop this. .remove isn't called for devices without
drivers.

Best regards and thanks for cleaning up after me,
Uwe

--
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
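
The thread converges on two points: test _dev->driver before deriving drv from it, and do so only in the shutdown path, since .remove is not called for devices without drivers. The user-space sketch below illustrates that shape and nothing more. It is not the actual v2 patch, and every mock_* name plus count_shutdown() is hypothetical; the mock types keep only the members needed for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of() macro. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct mock_device_driver {
	const char *name;
};

struct mock_device {
	struct mock_device_driver *driver;	/* NULL while unbound */
};

struct mock_platform_device {
	struct mock_device dev;
	int shutdown_calls;			/* bookkeeping for this sketch only */
};

struct mock_platform_driver {
	void (*shutdown)(struct mock_platform_device *pdev);
	struct mock_device_driver driver;
};

/* Shutdown path with the NULL test placed before any pointer arithmetic. */
static void mock_platform_shutdown(struct mock_device *_dev)
{
	struct mock_platform_device *dev;
	struct mock_platform_driver *drv;

	assert(_dev != NULL);
	dev = mock_container_of(_dev, struct mock_platform_device, dev);

	/* Unbound device: there is no driver, so there is nothing to call. */
	if (!_dev->driver)
		return;

	drv = mock_container_of(_dev->driver, struct mock_platform_driver, driver);
	if (drv->shutdown)
		drv->shutdown(dev);
}

/* Example callback that records how often it ran. */
static void count_shutdown(struct mock_platform_device *pdev)
{
	pdev->shutdown_calls++;
}
```

Calling mock_platform_shutdown() on a device whose driver pointer is NULL returns before drv is computed, which addresses both the crash and the concern about doing arithmetic on a NULL pointer. Returning early at the top is one way to "error out at the top" as suggested in the thread; the merged fix may differ in detail.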