From: "Levin, Alexander (Sasha Levin)"
To: Florian Fainelli
CC: "stable@vger.kernel.org", Mao Wenan, "David S . Miller"
Subject: Re: [PATCH for v4.9 LTS 035/111] net: phy: Fix lack of reference count on PHY driver
Date: Tue, 6 Jun 2017 01:16:41 +0000
Message-ID: <20170606011639.jzb5dqmvuagnwsrg@sasha-lappy>
References: <20170604081123.19462-1-alexander.levin@verizon.com>
 <20170604081123.19462-35-alexander.levin@verizon.com>
 <0e225472-7f01-dbe1-9093-e61444bcb01d@gmail.com>
 <20170605121509.jshvm2h3gsmdpyvt@sasha-lappy>
 <20170605195831.5fhgxqfs2fplbqbp@sasha-lappy>

On Mon, Jun 05, 2017 at 05:33:08PM -0700, Florian Fainelli wrote:
> On 06/05/2017 12:58 PM, Levin, Alexander (Sasha Levin) wrote:
> > On Mon, Jun 05, 2017 at 09:56:18AM -0700, Florian Fainelli wrote:
> >> On 06/05/2017 05:15 AM, Levin, Alexander (Sasha Levin) wrote:
> >>> On Sun, Jun 04, 2017 at 10:17:49AM -0700, Florian Fainelli wrote:
> >>>> Hi Alex,
> >>>>
> >>>> On 06/04/2017 01:12 AM, Levin, Alexander (Sasha Levin) wrote:
> >>>>> From: Mao Wenan
> >>>>>
> >>>>> [ Upstream commit cafe8df8b9bc9aa3dffa827c1a6757c6cd36f657 ]
> >>>>>
> >>>>> There is currently no reference count being held on the PHY driver,
> >>>>> which makes it possible to remove the PHY driver module while the PHY
> >>>>> state machine is running and polling the PHY. This could cause crashes
> >>>>> similar to this one to show up:
> >>>>>
> >>>>> [ 43.361162] BUG: unable to handle kernel NULL pointer dereference at 0000000000000140
> >>>>> [ 43.361162] IP: phy_state_machine+0x32/0x490
> >>>>> [ 43.361162] PGD 59dc067
> >>>>> [ 43.361162] PUD 0
> >>>>> [ 43.361162]
> >>>>> [ 43.361162] Oops: 0000 [#1] SMP
> >>>>> [ 43.361162] Modules linked in: dsa_loop [last unloaded: broadcom]
> >>>>> [ 43.361162] CPU: 0 PID: 1299 Comm: kworker/0:3 Not tainted 4.10.0-rc5+ #415
> >>>>> [ 43.361162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >>>>> BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
> >>>>> [ 43.361162] Workqueue: events_power_efficient phy_state_machine
> >>>>> [ 43.361162] task: ffff880006782b80 task.stack: ffffc90000184000
> >>>>> [ 43.361162] RIP: 0010:phy_state_machine+0x32/0x490
> >>>>> [ 43.361162] RSP: 0018:ffffc90000187e18 EFLAGS: 00000246
> >>>>> [ 43.361162] RAX: 0000000000000000 RBX: ffff8800059e53c0 RCX:
> >>>>> ffff880006a15c60
> >>>>> [ 43.361162] RDX: ffff880006782b80 RSI: 0000000000000000 RDI:
> >>>>> ffff8800059e5428
> >>>>> [ 43.361162] RBP: ffffc90000187e48 R08: ffff880006a15c40 R09:
> >>>>> 0000000000000000
> >>>>> [ 43.361162] R10: 0000000000000000 R11: 0000000000000000 R12:
> >>>>> ffff8800059e5428
> >>>>> [ 43.361162] R13: ffff8800059e5000 R14: 0000000000000000 R15:
> >>>>> ffff880006a15c40
> >>>>> [ 43.361162] FS: 0000000000000000(0000) GS:ffff880006a00000(0000)
> >>>>> knlGS:0000000000000000
> >>>>> [ 43.361162] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>> [ 43.361162] CR2: 0000000000000140 CR3: 0000000005979000 CR4:
> >>>>> 00000000000006f0
> >>>>> [ 43.361162] Call Trace:
> >>>>> [ 43.361162] process_one_work+0x1b4/0x3e0
> >>>>> [ 43.361162] worker_thread+0x43/0x4d0
> >>>>> [ 43.361162] ? __schedule+0x17f/0x4e0
> >>>>> [ 43.361162] kthread+0xf7/0x130
> >>>>> [ 43.361162] ? process_one_work+0x3e0/0x3e0
> >>>>> [ 43.361162] ? kthread_create_on_node+0x40/0x40
> >>>>> [ 43.361162] ret_from_fork+0x29/0x40
> >>>>> [ 43.361162] Code: 56 41 55 41 54 4c 8d 67 68 53 4c 8d af 40 fc ff ff
> >>>>> 48 89 fb 4c 89 e7 48 83 ec 08 e8 c9 9d 27 00 48 8b 83 60 ff ff ff 44 8b
> >>>>> 73 98 <48> 8b 90 40 01 00 00 44 89 f0 48 85 d2 74 08 4c 89 ef ff d2 8b
> >>>>>
> >>>>> Keep references on the PHY driver module right before we are going to
> >>>>> utilize it in phy_attach_direct(), and conversely when we don't use it
> >>>>> anymore in phy_detach().
> >>>>>
> >>>>> Signed-off-by: Mao Wenan
> >>>>> [florian: rebase, rework commit message]
> >>>>> Signed-off-by: Florian Fainelli
> >>>>> Signed-off-by: David S. Miller
> >>>>>
> >>>>> Signed-off-by: Sasha Levin
> >>>>
> >>>> This commit alone will cause problems, you will also need to pick this
> >>>> one on top of it:
> >>>>
> >>>> 6d9f66ac7fec2a6ccd649e5909806dfe36f1fc25 ("net: phy: Fix PHY module
> >>>> checks and NULL deref in phy_attach_direct()")
> >>>
> >>> Should I also be grabbing a7dac9f9c1
> >>> ("phy: fix error case of phy_led_triggers_(un)register")?
> >>>
> >>> It says it fixes a commit that's not in -stable, but it looks like it's
> >>> still relevant even without that commit.
> >>
> >> No, you don't have to pick this one, it does indeed fix something that
> >> was only introduced in 4.10 and newer.
> >
> > Hm, can you ack this conflict resolution of applying 6d9f66ac7fe on top
> > of 4.9.30 (in particular, the code in phy_attach_direct()):
>
> Acked-by: Florian Fainelli
>
> Sorry it took a bit of time for testing because I also exercised the
> error paths to make sure it was not blowing up on us, FWIW, attached was
> the patch that I used.
> --
> Florian

I'll take your patch below. Thanks Florian!

> From 2a33694744e3ed2d33c4a530118ab46d20fbe6fc Mon Sep 17 00:00:00 2001
> From: Florian Fainelli
> Date: Wed, 8 Feb 2017 19:05:26 -0800
> Subject: [PATCH] net: phy: Fix PHY module checks and NULL deref in
>  phy_attach_direct()
>
> The Generic PHY drivers get assigned after we checked that the current
> PHY driver is NULL, so we need to check a few things before we can
> safely dereference d->driver. This would be causing a NULL dereference to
> occur when a system binds to the Generic PHY driver. Update
> phy_attach_direct() to do the following:
>
> - grab the driver module reference after we have assigned the Generic
>   PHY drivers accordingly, and remember we came from the generic PHY
>   path
>
> - update the error path to clean up the module reference in case the
>   Generic PHY probe function fails
>
> - split the error path involving phy_detach() to avoid double free/put
>   since phy_detach() does all the clean up
>
> - finally, have phy_detach() drop the module reference count before we
>   call device_release_driver() for the Generic PHY driver case
>
> Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
> Signed-off-by: Florian Fainelli
> Signed-off-by: David S. Miller
> ---
>  drivers/net/phy/phy_device.c | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 67571f9627e5..14d57d0d1c04 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -860,6 +860,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  	struct module *ndev_owner = dev->dev.parent->driver->owner;
>  	struct mii_bus *bus = phydev->mdio.bus;
>  	struct device *d = &phydev->mdio.dev;
> +	bool using_genphy = false;
>  	int err;
>
>  	/* For Ethernet device drivers that register their own MDIO bus, we
> @@ -872,11 +873,6 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  		return -EIO;
>  	}
>
> -	if (!try_module_get(d->driver->owner)) {
> -		dev_err(&dev->dev, "failed to get the device driver module\n");
> -		return -EIO;
> -	}
> -
>  	get_device(d);
>
>  	/* Assume that if there is no driver, that it doesn't
> @@ -890,12 +886,22 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  			d->driver =
>  				&genphy_driver[GENPHY_DRV_1G].mdiodrv.driver;
>
> +		using_genphy = true;
> +	}
> +
> +	if (!try_module_get(d->driver->owner)) {
> +		dev_err(&dev->dev, "failed to get the device driver module\n");
> +		err = -EIO;
> +		goto error_put_device;
> +	}
> +
> +	if (using_genphy) {
>  		err = d->driver->probe(d);
>  		if (err >= 0)
>  			err = device_bind_driver(d);
>
>  		if (err)
> -			goto error;
> +			goto error_module_put;
>  	}
>
>  	if (phydev->attached_dev) {
> @@ -931,8 +937,14 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  	return err;
>
>  error:
> -	put_device(d);
> +	/* phy_detach() does all of the cleanup below */
> +	phy_detach(phydev);
> +	return err;
> +
> +error_module_put:
>  	module_put(d->driver->owner);
> +error_put_device:
> +	put_device(d);
>  	if (ndev_owner != bus->owner)
>  		module_put(bus->owner);
>  	return err;
> @@ -993,6 +1005,8 @@ void phy_detach(struct phy_device *phydev)
>  	phydev->attached_dev = NULL;
>  	phy_suspend(phydev);
>
> +	module_put(phydev->mdio.dev.driver->owner);
> +
>  	/* If the device had no specific driver before (i.e. - it
>  	 * was using the generic driver), we unbind the device
>  	 * from the generic driver so that there's a chance a
> @@ -1013,7 +1027,6 @@ void phy_detach(struct phy_device *phydev)
>  	bus = phydev->mdio.bus;
>
>  	put_device(&phydev->mdio.dev);
> -	module_put(phydev->mdio.dev.driver->owner);
>  	if (ndev_owner != bus->owner)
>  		module_put(bus->owner);
>  }
> --
> 2.9.3
>

--
Thanks,
Sasha
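
For anyone reading this thread later, the ordering the patch establishes in phy_attach_direct() can be modelled in plain userspace C. This is only a rough sketch and not kernel code: `struct fake_module`, `struct fake_dev`, `generic_drv`, `fake_module_get()`, `fake_module_put()` and `attach()` are all hypothetical stand-ins for the module refcount, the device refcount and the Generic PHY fallback, and the `probe_err` parameter is an assumption used to simulate a failing probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; none of this is kernel API. */
struct fake_module {
	int refs;
	bool unloading;
};

struct fake_dev {
	int refs;                   /* models get_device()/put_device() */
	struct fake_module *driver; /* NULL until a driver is bound */
};

/* Plays the role of the Generic PHY driver fallback. */
static struct fake_module generic_drv;

/* Models try_module_get(): refuses if the module is going away. */
static bool fake_module_get(struct fake_module *m)
{
	if (m->unloading)
		return false;
	m->refs++;
	return true;
}

/* Models module_put(); the assert catches an unbalanced put. */
static void fake_module_put(struct fake_module *m)
{
	assert(m->refs > 0);
	m->refs--;
}

/*
 * Models the reordered attach path: take the device reference, assign
 * the fallback driver if none is bound, and only then take the module
 * reference. probe_err simulates the result of the generic probe.
 */
static int attach(struct fake_dev *d, int probe_err)
{
	bool using_generic = false;
	int err;

	d->refs++;

	if (!d->driver) {
		d->driver = &generic_drv;
		using_generic = true;
	}

	/* d->driver is non-NULL by now, so dereferencing it is safe. */
	if (!fake_module_get(d->driver)) {
		err = -5; /* stands in for -EIO */
		goto error_put_device;
	}

	if (using_generic && probe_err) {
		err = probe_err;
		goto error_module_put;
	}

	return 0;

	/* Unwind in reverse order of acquisition. */
error_module_put:
	fake_module_put(d->driver);
error_put_device:
	d->refs--;
	return err;
}
```

In this model, calling attach() on a device with no driver and a non-zero probe_err returns that error with both counts back at their starting values, which is the property the split error labels in the patch are meant to provide.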