netdev.vger.kernel.org archive mirror
* net-next: KSZ switch driver oops in ksz_mib_read_work
From: Robert Hancock @ 2019-06-11 17:57 UTC
  To: netdev; +Cc: woojung.huh, UNGLinuxDriver

We are using an embedded platform with a KSZ9897 switch. I am getting
the oops below in ksz_mib_read_work when testing with net-next branch.
After adding in some debug output, the problem is in this code:

	for (i = 0; i < dev->mib_port_cnt; i++) {
		p = &dev->ports[i];
		mib = &p->mib;
		mutex_lock(&mib->cnt_mutex);

		/* Only read MIB counters when the port is told to do.
		 * If not, read only dropped counters when link is not up.
		 */
		if (!p->read) {
			const struct dsa_port *dp = dsa_to_port(dev->ds, i);

			if (!netif_carrier_ok(dp->slave))
				mib->cnt_ptr = dev->reg_mib_cnt;
		}

The oops is happening on port index 3 (i.e. 4th port) which is not
connected on our platform and so has no entry in the device tree. For
that port, dp->slave is NULL and so netif_carrier_ok explodes.

If I change the code to skip the port entirely in the loop if dp->slave
is NULL it seems to fix the crash, but I'm not that familiar with this
code. Can someone confirm whether that is the proper fix?

[   17.842829] Unable to handle kernel NULL pointer dereference at virtual address 0000002c
[   17.850983] pgd = (ptrval)
[   17.853711] [0000002c] *pgd=00000000
[   17.857317] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[   17.862632] Modules linked in:
[   17.865695] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 5.2.0-rc3 #1
[   17.872142] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   17.878688] Workqueue: events ksz_mib_read_work
[   17.883227] PC is at ksz_mib_read_work+0x58/0x94
[   17.887848] LR is at ksz_mib_read_work+0x38/0x94
[   17.887852] pc : [<c04843dc>]    lr : [<c04843bc>]    psr: 60070113
[   17.887857] sp : e8147f08  ip : e8148000  fp : ffffe000
[   17.887860] r10: 00000000  r9 : e8aa7040  r8 : e867cc44
[   17.887865] r7 : 00000c20  r6 : e8aa7120  r5 : 00000003  r4 : e867c958
[   17.887868] r3 : 00000000  r2 : 00000000  r1 : 00000003  r0 : e8aa7040
[   17.887879] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   17.948224] Control: 10c5387d  Table: 38d9404a  DAC: 00000051
[   17.948230] Process kworker/1:1 (pid: 21, stack limit = 0x(ptrval))
[   17.948236] Stack: (0xe8147f08 to 0xe8148000)
[   17.948245] 7f00:                   e8aa7120 e80a8080 eb7aef40 eb7b2000 00000000 e8aa7124
[   17.948254] 7f20: 00000000 c013865c 00000008 c0b03d00 e80a8080 e80a8094 eb7aef40 00000008
[   17.958073] systemd[1]: storage.mount: Unit is bound to inactive unit dev-mmcblk1p2.device. Stopping, too.
[   17.963306] 7f40: c0b03d00 eb7aef58 eb7aef40 c01393a0 ffffe000 c0b46b09 c084e464 00000000
[   17.963314] 7f60: ffffe000 e8053140 e80530c0 00000000 e8146000 e80a8080 c013935c e80a1eac
[   17.963322] 7f80: e805315c c013e78c 00000000 e80530c0 c013e648 00000000 00000000 00000000
[   17.969893] random: systemd: uninitialized urandom read (16 bytes read)
[   17.973942] 7fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[   17.973949] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   17.973958] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[   17.982246] random: systemd: uninitialized urandom read (16 bytes read)
[   17.990329] [<c04843dc>] (ksz_mib_read_work) from [<c013865c>] (process_one_work+0x17c/0x390)
[   17.990345] [<c013865c>] (process_one_work) from [<c01393a0>] (worker_thread+0x44/0x518)
[   18.009394] random: systemd: uninitialized urandom read (16 bytes read)
[   18.016344] [<c01393a0>] (worker_thread) from [<c013e78c>] (kthread+0x144/0x14c)
[   18.016358] [<c013e78c>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[   18.016362] Exception stack(0xe8147fb0 to 0xe8147ff8)
[   18.016369] 7fa0:                                     00000000 00000000 00000000 00000000
[   18.031159] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   18.031166] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[   18.031176] Code: 1a000006 e51630e0 e0833405 e5933050 (e593302c)
[   18.031279] ---[ end trace ca82392a6c2aa959 ]---

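The register dump and the "Code:" line in the oops are consistent with the NULL dp->slave diagnosis. The word in parentheses on the "Code:" line is the faulting instruction, and it can be decoded by hand. The following is a minimal userspace sketch, not kernel code: the struct and function names (a32_ldst_imm, a32_decode_ldst_imm, a32_effective_address) are made up for illustration, and only the A32 immediate-offset load/store encoding is handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: fields of an A32 LDR/STR with immediate offset. */
struct a32_ldst_imm {
	bool is_load;     /* L bit (bit 20): 1 = LDR, 0 = STR */
	bool add_offset;  /* U bit (bit 23): 1 = base + imm, 0 = base - imm */
	unsigned rn;      /* base register, bits 19:16 */
	unsigned rd;      /* transfer register, bits 15:12 */
	uint32_t imm12;   /* unsigned offset, bits 11:0 */
};

/* Decode one instruction word; returns false for any other encoding. */
static bool a32_decode_ldst_imm(uint32_t insn, struct a32_ldst_imm *out)
{
	/* Bits 27:25 == 0b010 select load/store with an immediate offset. */
	if (((insn >> 25) & 0x7u) != 0x2u)
		return false;

	out->is_load = ((insn >> 20) & 1u) != 0;
	out->add_offset = ((insn >> 23) & 1u) != 0;
	out->rn = (insn >> 16) & 0xfu;
	out->rd = (insn >> 12) & 0xfu;
	out->imm12 = insn & 0xfffu;
	return true;
}

/* Address the instruction accesses, given the value of its base register. */
static uint32_t a32_effective_address(const struct a32_ldst_imm *d,
				      uint32_t base)
{
	return d->add_offset ? base + d->imm12 : base - d->imm12;
}
```

Decoding 0xe593302c gives a load through r3 with offset 0x2c. The dump shows r3 = 00000000, so the access lands on address 0x2c, which matches the reported fault address. The preceding word, 0xe5933050, is a load into r3 from offset 0x50 of r3; presumably that is the read of dp->slave that returned NULL, although the struct layout is not shown in this thread.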

-- 
Robert Hancock
Senior Software Developer
SED Systems, a division of Calian Ltd.
Email: hancock@sedsystems.ca


* Re: net-next: KSZ switch driver oops in ksz_mib_read_work
From: Florian Fainelli @ 2019-06-11 23:27 UTC
  To: Robert Hancock, netdev; +Cc: woojung.huh, UNGLinuxDriver

On 6/11/19 10:57 AM, Robert Hancock wrote:
> We are using an embedded platform with a KSZ9897 switch. I am getting
> the oops below in ksz_mib_read_work when testing with net-next branch.
> After adding in some debug output, the problem is in this code:
> 
> 	for (i = 0; i < dev->mib_port_cnt; i++) {
> 		p = &dev->ports[i];
> 		mib = &p->mib;
> 		mutex_lock(&mib->cnt_mutex);
> 
> 		/* Only read MIB counters when the port is told to do.
> 		 * If not, read only dropped counters when link is not up.
> 		 */
> 		if (!p->read) {
> 			const struct dsa_port *dp = dsa_to_port(dev->ds, i);
> 
> 			if (!netif_carrier_ok(dp->slave))
> 				mib->cnt_ptr = dev->reg_mib_cnt;
> 		}
> 
> The oops is happening on port index 3 (i.e. 4th port) which is not
> connected on our platform and so has no entry in the device tree. For
> that port, dp->slave is NULL and so netif_carrier_ok explodes.
> 
> If I change the code to skip the port entirely in the loop if dp->slave
> is NULL it seems to fix the crash, but I'm not that familiar with this
> code. Can someone confirm whether that is the proper fix?

Yes, the following should do it. If you confirm that is the case, I can
send it later with your Tested-by.

diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 39dace8e3512..5470b28332cf 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -93,6 +93,9 @@ static void ksz_mib_read_work(struct work_struct *work)
                if (!p->read) {
                        const struct dsa_port *dp = dsa_to_port(dev->ds, i);

+                       if (dsa_is_unused_port(dp))
+                               continue;
+
                        if (!netif_carrier_ok(dp->slave))
                                mib->cnt_ptr = dev->reg_mib_cnt;
                }




-- 
Florian


* Re: net-next: KSZ switch driver oops in ksz_mib_read_work
From: Andrew Lunn @ 2019-06-12 14:09 UTC
  To: Florian Fainelli; +Cc: Robert Hancock, netdev, woojung.huh, UNGLinuxDriver

On Tue, Jun 11, 2019 at 04:27:47PM -0700, Florian Fainelli wrote:
> On 6/11/19 10:57 AM, Robert Hancock wrote:
> > We are using an embedded platform with a KSZ9897 switch. I am getting
> > the oops below in ksz_mib_read_work when testing with net-next branch.
> > After adding in some debug output, the problem is in this code:
> > 
> > 	for (i = 0; i < dev->mib_port_cnt; i++) {
> > 		p = &dev->ports[i];
> > 		mib = &p->mib;
> > 		mutex_lock(&mib->cnt_mutex);
> > 
> > 		/* Only read MIB counters when the port is told to do.
> > 		 * If not, read only dropped counters when link is not up.
> > 		 */
> > 		if (!p->read) {
> > 			const struct dsa_port *dp = dsa_to_port(dev->ds, i);
> > 
> > 			if (!netif_carrier_ok(dp->slave))
> > 				mib->cnt_ptr = dev->reg_mib_cnt;
> > 		}
> > 
> > The oops is happening on port index 3 (i.e. 4th port) which is not
> > connected on our platform and so has no entry in the device tree. For
> > that port, dp->slave is NULL and so netif_carrier_ok explodes.
> > 
> > If I change the code to skip the port entirely in the loop if dp->slave
> > is NULL it seems to fix the crash, but I'm not that familiar with this
> > code. Can someone confirm whether that is the proper fix?
> 
> Yes, the following should do it. If you confirm that is the case, I can
> send it later with your Tested-by.
> 
> diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
> index 39dace8e3512..5470b28332cf 100644
> --- a/drivers/net/dsa/microchip/ksz_common.c
> +++ b/drivers/net/dsa/microchip/ksz_common.c
> @@ -93,6 +93,9 @@ static void ksz_mib_read_work(struct work_struct *work)
>                 if (!p->read) {
>                         const struct dsa_port *dp = dsa_to_port(dev->ds, i);
> 
> +                       if (dsa_is_unused_port(dp))
> +                               continue;
> +
>                         if (!netif_carrier_ok(dp->slave))
>                                 mib->cnt_ptr = dev->reg_mib_cnt;
>                 }
> 

Hi Florian

There is a mutex held within the loop. So a continue is not going to
work here.

     Andrew
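
The locking problem raised above can be reproduced outside the kernel. The following is a small userspace model, not driver code: every name in it (model_mutex, model_port, mib_pass_leaky, mib_pass_check_first, held_count, REG_MIB_CNT) is a hypothetical stand-in. It assumes the real loop releases cnt_mutex at the end of each iteration, which the quoted excerpt does not show. One pass mirrors the shape of the proposed patch and skips the port after the lock is taken; the other checks the port before locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects discussed in the thread. */
struct model_mutex { int held; };
struct model_netdev { bool carrier; };
struct model_port {
	struct model_mutex cnt_mutex;
	struct model_netdev *slave; /* NULL: port absent from the device tree */
	bool read;
	int cnt_ptr;
};

#define REG_MIB_CNT 34 /* arbitrary illustrative value */

static void model_lock(struct model_mutex *m)
{
	assert(!m->held); /* taking it twice would deadlock a real mutex */
	m->held = 1;
}

static void model_unlock(struct model_mutex *m)
{
	assert(m->held);
	m->held = 0;
}

/* Same shape as the proposed patch: 'continue' runs with the lock held. */
static void mib_pass_leaky(struct model_port *ports, int n)
{
	for (int i = 0; i < n; i++) {
		struct model_port *p = &ports[i];

		model_lock(&p->cnt_mutex);
		if (!p->read) {
			if (!p->slave)
				continue; /* skips the unlock below */
			if (!p->slave->carrier)
				p->cnt_ptr = REG_MIB_CNT;
		}
		model_unlock(&p->cnt_mutex);
	}
}

/* One possible alternative: decide to skip before taking the lock. */
static void mib_pass_check_first(struct model_port *ports, int n)
{
	for (int i = 0; i < n; i++) {
		struct model_port *p = &ports[i];

		if (!p->slave)
			continue; /* nothing is held yet, so this is safe */

		model_lock(&p->cnt_mutex);
		if (!p->read && !p->slave->carrier)
			p->cnt_ptr = REG_MIB_CNT;
		model_unlock(&p->cnt_mutex);
	}
}

/* Number of port mutexes still held after a pass. */
static int held_count(const struct model_port *ports, int n)
{
	int held = 0;

	for (int i = 0; i < n; i++)
		held += ports[i].cnt_mutex.held;
	return held;
}
```

With one connected port and one port whose slave is NULL, mib_pass_leaky leaves one mutex held, so a second pass would block on it in a real kernel. mib_pass_check_first leaves none held. Unlocking immediately before the continue would work as well; checking first simply keeps the lock and unlock paired in one place.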
