All of lore.kernel.org
 help / color / mirror / Atom feed
* Subject: Locking mv88e6xxx_reg_lock twice leads deadlock for 88E6176 switch
@ 2022-12-13 22:58 Maksim Kiselev
  2022-12-14 11:03 ` Vladimir Oltean
  0 siblings, 1 reply; 3+ messages in thread
From: Maksim Kiselev @ 2022-12-13 22:58 UTC (permalink / raw)
  To: bigunclemax
  Cc: fido_max, mw, Andrew Lunn, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Richard Cochran, Russell King, netdev, linux-kernel

Subject: Locking mv88e6xxx_reg_lock twice leads deadlock for 88E6176 switch

Hello, friends.

I have a device with Marvell 88E6176 switch. 
After 'mv88e6xxx: fix speed setting for CPU/DSA ports (cc1049ccee20)'commit was applied to
mainline kernel I faced with a problem that switch driver stuck at 'mv88e6xxx_probe' function.

I made some investigations and found that 'mv88e6xxx_reg_lock' called twice from the same thread which leads to deadlock.

I added logs to 'mv88e6xxx_reg_lock' and 'mv88e6xxx_reg_unlock' functions to see what happened.

So, first lock called from mv88e6xxx_setup:

static int mv88e6xxx_setup(struct dsa_switch *ds)
{
    ...

	if (mv88e6xxx_has_pvt(chip))
		ds->max_num_bridges = MV88E6XXX_MAX_PVT_SWITCHES -
				      ds->dst->last_switch - 1;

	mv88e6xxx_reg_lock(chip);

And second lock called from mv88e6352_phylink_get_caps (only for port 4):
static void mv88e6352_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
				       struct phylink_config *config)
{
    ...

	config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 |
				   MAC_1000FD;

	/* Port 4 supports automedia if the serdes is associated with it. */
	if (port == 4) {
        dump_stack(); // I added this to see where we came from
		mv88e6xxx_reg_lock(chip);


Here is kernel log with my debug output and call stack right before secondary lock:

[    5.801203] mv88e6085 mdio@2d24000:0c: switch 0x1760 detected: Marvell 88E6176, revision 1
[    5.950236] >>>>> (mv88e6xxx_reg_lock|797) from mv88e6xxx_mdio_read+0x54/0xfc [mv88e6xxx] thread_id: 287
[    5.952884] >>>>> (mv88e6xxx_reg_unlock|804) from mv88e6xxx_mdio_read+0x88/0xfc [mv88e6xxx] thread_id: 287
[    5.969373] >>>>> (mv88e6xxx_reg_lock|797) from mv88e6xxx_setup+0x5c/0x550 [mv88e6xxx] thread_id: 287
[    6.023785] >>>>> (mv88e6xxx_reg_lock|797) from mv88e6xxx_g1_irq_thread_work+0x28/0x124 [mv88e6xxx] thread_id: 315

[    6.069185] Backtrace: 
[    6.069206]  dump_backtrace from show_stack+0x18/0x1c
[    6.069266]  r7:f1221aa4 r6:00000004 r5:600b0013 r4:c08d684d
[    6.069279]  show_stack from dump_stack_lvl+0x48/0x54
[    6.069315]  dump_stack_lvl from dump_stack+0x14/0x1c
[    6.069351]  r5:f1221ab4 r4:c1e40040
[    6.069363]  dump_stack from mv88e6352_phylink_get_caps+0x58/0x164 [mv88e6xxx]
[    6.069645]  mv88e6352_phylink_get_caps [mv88e6xxx] from mv88e6xxx_get_caps+0x2c/0x4c [mv88e6xxx]
[    6.070106]  r7:c1ff31c0 r6:00000004 r5:c1ff31c0 r4:f1221aa4
[    6.070118]  mv88e6xxx_get_caps [mv88e6xxx] from mv88e6xxx_setup_port+0x84/0x60c [mv88e6xxx]
[    6.070575]  r7:c1ff31c0 r6:00000004 r5:c1e40040 r4:bf16e510
[    6.070588]  mv88e6xxx_setup_port [mv88e6xxx] from mv88e6xxx_setup+0x3e0/0x550 [mv88e6xxx]
[    6.071048]  r9:c1083600 r8:c1e40e04 r7:00000004 r6:c1ff31c0 r5:c1e40040 r4:bf16e510
[    6.071062]  mv88e6xxx_setup [mv88e6xxx] from dsa_register_switch+0x738/0xba8 [dsa_core]
[    6.071451]  r8:c23395c8 r7:c1ff31c0 r6:c23395c0 r5:c1ff31c0 r4:00000000
[    6.071464]  dsa_register_switch [dsa_core] from mv88e6xxx_probe+0x640/0x6a8 [mv88e6xxx]
[    6.071835]  r10:c1164000 r9:bf171a2c r8:ef7f00f8 r7:00000000 r6:00000000 r5:c1e40040
[    6.071851]  r4:c13ea000
[    6.071862]  mv88e6xxx_probe [mv88e6xxx] from mdio_probe+0x34/0x50
[    6.072133]  r10:c1164000 r9:c1e6a5b8 r8:bf173c4c r7:00000000 r6:bf173000 r5:c13ea000
[    6.072149]  r4:bf173000
[    6.072160]  mdio_probe from really_probe+0x14c/0x2b8
[    6.072206]  r5:c13ea000 r4:00000000
[    6.072218]  really_probe from __driver_probe_device+0xcc/0xe0
[    6.072260]  r7:00000039 r6:c13ea000 r5:bf173000 r4:c13ea000
[    6.072273]  __driver_probe_device from driver_probe_device+0x40/0xbc
[    6.072312]  r5:c13ea000 r4:c0cbdbc0
[    6.072324]  driver_probe_device from __driver_attach+0xf0/0x104
[    6.072367]  r7:c0c7c790 r6:bf173000 r5:c13ea000 r4:00000000
[    6.072380]  __driver_attach from bus_for_each_dev+0x70/0xb4
[    6.072434]  r7:c0c7c790 r6:c04e87e8 r5:bf173000 r4:c13ea000
[    6.072446]  bus_for_each_dev from driver_attach+0x20/0x28
[    6.072507]  r6:00000000 r5:c1e6a580 r4:bf173000
[    6.072520]  driver_attach from bus_add_driver+0xbc/0x1cc
[    6.072574]  bus_add_driver from driver_register+0xb4/0xfc
[    6.072633]  r9:bf102000 r8:bf173c4c r7:f1221ea0 r6:c1ff8800 r5:bf173c40 r4:bf173000
[    6.072647]  driver_register from mdio_driver_register+0x34/0x68
[    6.072696]  r5:bf173c40 r4:bf173000
[    6.072708]  mdio_driver_register from mdio_module_init+0x14/0x1000 [mv88e6xxx]
[    6.072976]  r5:bf173c40 r4:c0c88000
[    6.072988]  mdio_module_init [mv88e6xxx] from do_one_initcall+0x6c/0x1d8
[    6.073239]  do_one_initcall from do_init_module+0x48/0x1c4
[    6.073292]  r10:0000000c r9:00000000 r8:bf173c4c r7:f1221ea0 r6:c1ff8800 r5:bf173c40
[    6.073308]  r4:bf173c40
[    6.073319]  do_init_module from load_module+0x1558/0x1698
[    6.073362]  r6:00000000 r5:bf173c40 r4:f1221f28
[    6.073374]  load_module from sys_finit_module+0xdc/0xec
[    6.073423]  r10:0000017b r9:c1164000 r8:c020029c r7:0000017b r6:0000000f r5:b6edbee8
[    6.073439]  r4:00000000
[    6.073450]  sys_finit_module from ret_fast_syscall+0x0/0x4c
[    6.073485] Exception stack(0xf1221fa8 to 0xf1221ff0)
[    6.073515] 1fa0:                   00020000 00000000 0000000f b6edbee8 00000000 004a3420
[    6.073544] 1fc0: 00020000 00000000 00000000 0000017b 004a96a8 beb9292c 00000000 004a3420
[    6.073567] 1fe0: beb92820 beb92810 b6ed5294 b6b98c30
[    6.073589]  r6:00000000 r5:00000000 r4:00020000

[    6.073658] >>>>> (mv88e6xxx_reg_lock|797) from mv88e6352_phylink_get_caps+0x60/0x164 [mv88e6xxx] thread_id: 287

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Subject: Locking mv88e6xxx_reg_lock twice leads deadlock for 88E6176 switch
  2022-12-13 22:58 Subject: Locking mv88e6xxx_reg_lock twice leads deadlock for 88E6176 switch Maksim Kiselev
@ 2022-12-14 11:03 ` Vladimir Oltean
  2022-12-14 11:42   ` Maxim Kiselev
  0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Oltean @ 2022-12-14 11:03 UTC (permalink / raw)
  To: Maksim Kiselev
  Cc: fido_max, mw, Andrew Lunn, Florian Fainelli, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran,
	Russell King, netdev, linux-kernel

On Wed, Dec 14, 2022 at 01:58:55AM +0300, Maksim Kiselev wrote:
> Hello, friends.

Hello.

> I have a device with Marvell 88E6176 switch. 
> After 'mv88e6xxx: fix speed setting for CPU/DSA ports (cc1049ccee20)'commit was applied to
> mainline kernel I faced with a problem that switch driver stuck at 'mv88e6xxx_probe' function.

Sorry for that.

> I made some investigations and found that 'mv88e6xxx_reg_lock' called twice from the same thread which leads to deadlock.
> 
> I added logs to 'mv88e6xxx_reg_lock' and 'mv88e6xxx_reg_unlock' functions to see what happened.

I hope you didn't spend too much time doing that. If you enable CONFIG_PROVE_LOCKING,
you should automatically get a stack trace with the two threads that
acquired the mutex leading to the deadlock.

I've sent a patch which solves that issue here:
https://patchwork.kernel.org/project/netdevbpf/patch/20221214110120.3368472-1-vladimir.oltean@nxp.com/
I've regression-tested it on 88E6390. Please confirm with a Tested-by:
tag on that patch that it does resolve the deadlock for 88E6176.

Thanks for reporting!

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Subject: Locking mv88e6xxx_reg_lock twice leads deadlock for 88E6176 switch
  2022-12-14 11:03 ` Vladimir Oltean
@ 2022-12-14 11:42   ` Maxim Kiselev
  0 siblings, 0 replies; 3+ messages in thread
From: Maxim Kiselev @ 2022-12-14 11:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: fido_max, mw, Andrew Lunn, Florian Fainelli, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran,
	Russell King, netdev, linux-kernel

Thanks for the quick fix, Vladimir!

ср, 14 дек. 2022 г. в 14:04, Vladimir Oltean <olteanv@gmail.com>:
>
> On Wed, Dec 14, 2022 at 01:58:55AM +0300, Maksim Kiselev wrote:
> > Hello, friends.
>
> Hello.
>
> > I have a device with Marvell 88E6176 switch.
> > After 'mv88e6xxx: fix speed setting for CPU/DSA ports (cc1049ccee20)'commit was applied to
> > mainline kernel I faced with a problem that switch driver stuck at 'mv88e6xxx_probe' function.
>
> Sorry for that.
>
> > I made some investigations and found that 'mv88e6xxx_reg_lock' called twice from the same thread which leads to deadlock.
> >
> > I added logs to 'mv88e6xxx_reg_lock' and 'mv88e6xxx_reg_unlock' functions to see what happened.
>
> I hope you didn't spend too much time doing that. If you enable CONFIG_PROVE_LOCKING,
> you should automatically get a stack trace with the two threads that
> acquired the mutex leading to the deadlock.
>
> I've sent a patch which solves that issue here:
> https://patchwork.kernel.org/project/netdevbpf/patch/20221214110120.3368472-1-vladimir.oltean@nxp.com/
> I've regression-tested it on 88E6390. Please confirm with a Tested-by:
> tag on that patch that it does resolve the deadlock for 88E6176.
>
> Thanks for reporting!

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-12-14 11:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-13 22:58 Subject: Locking mv88e6xxx_reg_lock twice leads deadlock for 88E6176 switch Maksim Kiselev
2022-12-14 11:03 ` Vladimir Oltean
2022-12-14 11:42   ` Maxim Kiselev

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.