linux-kernel.vger.kernel.org archive mirror
* [net v3] net: dsa: microchip: fix race condition
@ 2020-10-07  8:55 Christian Eggers
  2020-10-09 20:04 ` Jakub Kicinski
  0 siblings, 1 reply; 2+ messages in thread
From: Christian Eggers @ 2020-10-07  8:55 UTC (permalink / raw)
  To: Vladimir Oltean, Woojung Huh, Microchip Linux Driver Support
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S . Miller,
	Jakub Kicinski, netdev, linux-kernel, Christian Eggers

Between queuing the delayed work and finishing the setup of the dsa
ports, the process may sleep in request_module() (via
phy_device_create()) and the queued work may be executed prior to the
switch net devices being registered. In ksz_mib_read_work(), a NULL
dereference will happen within netif_carrier_ok(dp->slave).

Simply not queuing the delayed work in ksz_init_mib_timer() makes things
even worse, because the work will then be queued for immediate execution
(instead of after 2000 ms) in ksz_mac_link_down() via
dsa_port_link_register_of().

Call tree:
ksz9477_i2c_probe()
\--ksz9477_switch_register()
   \--ksz_switch_register()
      +--dsa_register_switch()
      |  \--dsa_switch_probe()
      |     \--dsa_tree_setup()
      |        \--dsa_tree_setup_switches()
      |           +--dsa_switch_setup()
      |           |  +--ksz9477_setup()
      |           |  |  \--ksz_init_mib_timer()
      |           |  |     |--/* Start the timer 2 seconds later. */
      |           |  |     \--schedule_delayed_work(&dev->mib_read, msecs_to_jiffies(2000));
      |           |  \--__mdiobus_register()
      |           |     \--mdiobus_scan()
      |           |        \--get_phy_device()
      |           |           +--get_phy_id()
      |           |           \--phy_device_create()
      |           |              |--/* sleeping, ksz_mib_read_work() can be called meanwhile */
      |           |              \--request_module()
      |           |
      |           \--dsa_port_setup()
      |              +--/* Called for non-CPU ports */
      |              +--dsa_slave_create()
      |              |  +--/* Too late, ksz_mib_read_work() may be called beforehand */
      |              |  \--port->slave = ...
      |             ...
      |              +--/* Called for CPU port */
      |              \--dsa_port_link_register_of()
      |                 \--ksz_mac_link_down()
      |                    +--/* mib_read must be initialized here */
      |                    +--/* work is already scheduled, so it will be executed after 2000 ms */
      |                    \--schedule_delayed_work(&dev->mib_read, 0);
      \-- /* here port->slave is set up properly, scheduling the delayed work should be safe */

Solution:
1. Do not queue (only initialize) delayed work in ksz_init_mib_timer().
2. Only queue delayed work in ksz_mac_link_down() if init is completed.
3. Queue work once in ksz_switch_register(), after dsa_register_switch()
has completed.

Fixes: 7c6ff470aa86 ("net: dsa: microchip: add MIB counter reading support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
---
v3:
---------
- Use 12 digits for commit id in "Fixes:" tag

v2:
---------
- no changes in the patch itself
- use correct subject-prefix
- changed wording of commit description
- added call tree to commit description
- added "Fixes:" tag
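
A minimal userspace sketch of the ordering in the "Solution" list above.
This is not driver code: struct model_dev and every model_* function are
hypothetical stand-ins, the interval is kept in milliseconds where the
driver stores jiffies, and the model zeroes mib_read_interval explicitly
while the patch relies on it being zero until registration completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant parts of struct ksz_device. */
struct model_dev {
	unsigned long mib_read_interval; /* 0 until registration completes */
	bool work_queued;                /* models "mib_read is queued" */
	const void *slave;               /* stands in for dp->slave */
};

/* Step 1: initialize only; nothing is queued here. */
static void model_init_mib_timer(struct model_dev *dev)
{
	dev->mib_read_interval = 0;
	dev->work_queued = false;
	dev->slave = NULL;
}

/* Step 2: the link-down path queues work only once the interval is set. */
static void model_mac_link_down(struct model_dev *dev)
{
	if (dev->mib_read_interval)
		dev->work_queued = true;
}

/*
 * Models the port-setup window inside dsa_register_switch().
 * Returns true if work was queued before the slave pointer was set,
 * which is the racy situation the patch is meant to rule out.
 */
static bool model_setup_ports(struct model_dev *dev, const void *slave)
{
	bool queued_early;

	model_init_mib_timer(dev);
	model_mac_link_down(dev);   /* link registration during setup */
	queued_early = dev->work_queued;
	dev->slave = slave;         /* models dsa_slave_create() */
	return queued_early;
}

/* Step 3: after port setup, set the interval and queue the work once. */
static void model_finish_register(struct model_dev *dev)
{
	dev->mib_read_interval = 30000;
	dev->work_queued = true;
}

/* Models ksz_mib_read_work(): it must only run once slave is set. */
static void model_mib_read_work(const struct model_dev *dev)
{
	assert(dev->work_queued);
	assert(dev->slave != NULL);
}
```

In this model, work can only become queued after the slave pointer is
assigned, so model_mib_read_work() never sees a NULL slave.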

 drivers/net/dsa/microchip/ksz_common.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 8e755b50c9c1..a94d2278b95c 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -103,14 +103,8 @@ void ksz_init_mib_timer(struct ksz_device *dev)
 
 	INIT_DELAYED_WORK(&dev->mib_read, ksz_mib_read_work);
 
-	/* Read MIB counters every 30 seconds to avoid overflow. */
-	dev->mib_read_interval = msecs_to_jiffies(30000);
-
 	for (i = 0; i < dev->mib_port_cnt; i++)
 		dev->dev_ops->port_init_cnt(dev, i);
-
-	/* Start the timer 2 seconds later. */
-	schedule_delayed_work(&dev->mib_read, msecs_to_jiffies(2000));
 }
 EXPORT_SYMBOL_GPL(ksz_init_mib_timer);
 
@@ -143,7 +137,9 @@ void ksz_mac_link_down(struct dsa_switch *ds, int port, unsigned int mode,
 
 	/* Read all MIB counters when the link is going down. */
 	p->read = true;
-	schedule_delayed_work(&dev->mib_read, 0);
+	/* timer started */
+	if (dev->mib_read_interval)
+		schedule_delayed_work(&dev->mib_read, 0);
 }
 EXPORT_SYMBOL_GPL(ksz_mac_link_down);
 
@@ -446,6 +442,12 @@ int ksz_switch_register(struct ksz_device *dev,
 		return ret;
 	}
 
+	/* Read MIB counters every 30 seconds to avoid overflow. */
+	dev->mib_read_interval = msecs_to_jiffies(30000);
+
+	/* Start the MIB timer. */
+	schedule_delayed_work(&dev->mib_read, 0);
+
 	return 0;
 }
 EXPORT_SYMBOL(ksz_switch_register);
-- 
Christian Eggers
Embedded software developer

Arnold & Richter Cine Technik GmbH & Co. Betriebs KG
Registered office: Muenchen - Register court: Amtsgericht Muenchen - Commercial register number: HRA 57918
General partner: Arnold & Richter Cine Technik GmbH
Registered office: Muenchen - Register court: Amtsgericht Muenchen - Commercial register number: HRB 54477
Managing directors: Dr. Michael Neuhaeuser; Stephan Schenk; Walter Trauninger; Markus Zeiler



* Re: [net v3] net: dsa: microchip: fix race condition
  2020-10-07  8:55 [net v3] net: dsa: microchip: fix race condition Christian Eggers
@ 2020-10-09 20:04 ` Jakub Kicinski
  0 siblings, 0 replies; 2+ messages in thread
From: Jakub Kicinski @ 2020-10-09 20:04 UTC (permalink / raw)
  To: Christian Eggers
  Cc: Vladimir Oltean, Woojung Huh, Microchip Linux Driver Support,
	Andrew Lunn, Vivien Didelot, Florian Fainelli, David S . Miller,
	netdev, linux-kernel, George McCollister

On Wed, 7 Oct 2020 10:55:23 +0200 Christian Eggers wrote:
> Between queuing the delayed work and finishing the setup of the dsa
> ports, the process may sleep in request_module() (via
> phy_device_create()) and the queued work may be executed prior to the
> switch net devices being registered. In ksz_mib_read_work(), a NULL
> dereference will happen within netif_carrier_ok(dp->slave).
> 
> Simply not queuing the delayed work in ksz_init_mib_timer() makes things
> even worse, because the work will then be queued for immediate execution
> (instead of after 2000 ms) in ksz_mac_link_down() via
> dsa_port_link_register_of().
> 
> Call tree:
> ksz9477_i2c_probe()
> \--ksz9477_switch_register()
>    \--ksz_switch_register()
>       +--dsa_register_switch()
>       |  \--dsa_switch_probe()
>       |     \--dsa_tree_setup()
>       |        \--dsa_tree_setup_switches()
>       |           +--dsa_switch_setup()
>       |           |  +--ksz9477_setup()
>       |           |  |  \--ksz_init_mib_timer()
>       |           |  |     |--/* Start the timer 2 seconds later. */
>       |           |  |     \--schedule_delayed_work(&dev->mib_read, msecs_to_jiffies(2000));
>       |           |  \--__mdiobus_register()
>       |           |     \--mdiobus_scan()
>       |           |        \--get_phy_device()
>       |           |           +--get_phy_id()
>       |           |           \--phy_device_create()
>       |           |              |--/* sleeping, ksz_mib_read_work() can be called meanwhile */
>       |           |              \--request_module()
>       |           |
>       |           \--dsa_port_setup()
>       |              +--/* Called for non-CPU ports */
>       |              +--dsa_slave_create()
>       |              |  +--/* Too late, ksz_mib_read_work() may be called beforehand */
>       |              |  \--port->slave = ...
>       |             ...
>       |              +--/* Called for CPU port */
>       |              \--dsa_port_link_register_of()
>       |                 \--ksz_mac_link_down()
>       |                    +--/* mib_read must be initialized here */
>       |                    +--/* work is already scheduled, so it will be executed after 2000 ms */
>       |                    \--schedule_delayed_work(&dev->mib_read, 0);
>       \-- /* here port->slave is set up properly, scheduling the delayed work should be safe */

Thanks for this graph, very informative!

> Solution:
> 1. Do not queue (only initialize) delayed work in ksz_init_mib_timer().
> 2. Only queue delayed work in ksz_mac_link_down() if init is completed.
> 3. Queue work once in ksz_switch_register(), after dsa_register_switch()
> has completed.
> 
> Fixes: 7c6ff470aa86 ("net: dsa: microchip: add MIB counter reading support")
> Signed-off-by: Christian Eggers <ceggers@arri.de>

You should add Florian's and Vladimir's review tags here, under your
sign-off.

> @@ -143,7 +137,9 @@ void ksz_mac_link_down(struct dsa_switch *ds, int port, unsigned int mode,
>  
>  	/* Read all MIB counters when the link is going down. */
>  	p->read = true;
> -	schedule_delayed_work(&dev->mib_read, 0);
> +	/* timer started */
> +	if (dev->mib_read_interval)
> +		schedule_delayed_work(&dev->mib_read, 0);

Your patch seems fine, but I wonder what the original author was trying
to achieve with this schedule_delayed_work(..., 0) call?

The work is supposed to be scheduled at this point, right?
In that case another call to schedule_delayed_work() is
simply ignored. 

Judging by the comment it seems like someone was under the impression
this will reschedule the work to be run immediately, which is not the
case.

In fact, this looks like a separate bug introduced in:

469b390e1ba3 ("net: dsa: microchip: use delayed_work instead of timer + work")
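
To illustrate the pending-work semantics in isolation, here is a small
userspace model, not kernel code. struct model_dwork and the model_*
functions are hypothetical stand-ins: model_schedule() follows the return
convention of schedule_delayed_work() (false when the work is already
pending), and model_mod() follows that of mod_delayed_work() (always
re-arms, true when the work was pending). mod_delayed_work() appears only
for contrast; whether it is the right fix for this driver is not settled
here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct delayed_work. */
struct model_dwork {
	bool pending;          /* true while the work is queued */
	unsigned long expires; /* tick at which the work should run */
};

/* Models schedule_delayed_work(): does nothing if already pending. */
static bool model_schedule(struct model_dwork *w, unsigned long now,
			   unsigned long delay)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* Models mod_delayed_work(): always (re)arms the timer. */
static bool model_mod(struct model_dwork *w, unsigned long now,
		      unsigned long delay)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

/* Replays the link-down case: arm for 2000 ticks, then "schedule" with 0. */
static unsigned long demo_link_down_expiry(void)
{
	struct model_dwork w = { .pending = false, .expires = 0 };
	bool queued;

	model_schedule(&w, 0, 2000);        /* initial arm */
	queued = model_schedule(&w, 100, 0); /* link-down path */
	assert(!queued);                     /* second call was ignored */
	return w.expires;                    /* original expiry is kept */
}
```

In the model, the second model_schedule() call leaves the original expiry
in place, while model_mod() with a delay of 0 would move it to the current
tick.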

>  }
>  EXPORT_SYMBOL_GPL(ksz_mac_link_down);
>  
