All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
@ 2021-09-08  7:43 Joakim Zhang
  2021-09-08 11:40 ` patchwork-bot+netdevbpf
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Joakim Zhang @ 2021-09-08  7:43 UTC (permalink / raw)
  To: peppe.cavallaro, alexandre.torgue, joabreu, davem, kuba
  Cc: linux-imx, netdev, linux-kernel

commit 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after
napi disabled"), this patch tries to fix system hang caused by eee_ctrl_timer,
unfortunately, it only can resolve it for system reboot stress test. System
hang also can be reproduced easily during system suspend/resume stess test
when mount NFS on i.MX8MP EVK board.

In stmmac driver, eee feature is combined to phylink framework. When do
system suspend, phylink_stop() would queue delayed work, it invokes
stmmac_mac_link_down(), where to deactivate eee_ctrl_timer synchronizly.
In above commit, try to fix issue by deactivating eee_ctrl_timer obviously,
but it is not enough. Looking into eee_ctrl_timer expire callback
stmmac_eee_ctrl_timer(), it could enable hareware eee mode again. What is
unexpected is that LPI interrupt (MAC_Interrupt_Enable.LPIEN bit) is always
asserted. This interrupt has chance to be issued when LPI state entry/exit
from the MAC, and at that time, clock could have been already disabled.
The result is that system hang when driver try to touch register from
interrupt handler.

The reason why above commit can fix system hang issue in stmmac_release()
is that, deactivate eee_ctrl_timer not just after napi disabled, further
after irq freed.

In conclusion, hardware would generate LPI interrupt when clock has been
disabled during suspend or resume, since hardware is in eee mode and LPI
interrupt enabled.

Interrupts from MAC, MTL and DMA level are enabled and never been disabled
when system suspend, so postpone clocks management from suspend stage to
noirq suspend stage should be more safe.

Fixes: 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after napi disabled")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
---
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 14 ------
 .../ethernet/stmicro/stmmac/stmmac_platform.c | 44 +++++++++++++++++++
 2 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index ece02b35a6ce..246f84fedbc8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7118,7 +7118,6 @@ int stmmac_suspend(struct device *dev)
 	struct net_device *ndev = dev_get_drvdata(dev);
 	struct stmmac_priv *priv = netdev_priv(ndev);
 	u32 chan;
-	int ret;
 
 	if (!ndev || !netif_running(ndev))
 		return 0;
@@ -7150,13 +7149,6 @@ int stmmac_suspend(struct device *dev)
 	} else {
 		stmmac_mac_set(priv, priv->ioaddr, false);
 		pinctrl_pm_select_sleep_state(priv->device);
-		/* Disable clock in case of PWM is off */
-		clk_disable_unprepare(priv->plat->clk_ptp_ref);
-		ret = pm_runtime_force_suspend(dev);
-		if (ret) {
-			mutex_unlock(&priv->lock);
-			return ret;
-		}
 	}
 
 	mutex_unlock(&priv->lock);
@@ -7242,12 +7234,6 @@ int stmmac_resume(struct device *dev)
 		priv->irq_wake = 0;
 	} else {
 		pinctrl_pm_select_default_state(priv->device);
-		/* enable the clk previously disabled */
-		ret = pm_runtime_force_resume(dev);
-		if (ret)
-			return ret;
-		if (priv->plat->clk_ptp_ref)
-			clk_prepare_enable(priv->plat->clk_ptp_ref);
 		/* reset the phy so that it's ready */
 		if (priv->mii)
 			stmmac_mdio_reset(priv->mii);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 5ca710844cc1..4885f9ad1b1e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -9,6 +9,7 @@
 *******************************************************************************/
 
 #include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
 #include <linux/module.h>
 #include <linux/io.h>
 #include <linux/of.h>
@@ -771,9 +772,52 @@ static int __maybe_unused stmmac_runtime_resume(struct device *dev)
 	return stmmac_bus_clks_config(priv, true);
 }
 
+static int stmmac_pltfr_noirq_suspend(struct device *dev)
+{
+	struct net_device *ndev = dev_get_drvdata(dev);
+	struct stmmac_priv *priv = netdev_priv(ndev);
+	int ret;
+
+	if (!netif_running(ndev))
+		return 0;
+
+	if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
+		/* Disable clock in case of PWM is off */
+		clk_disable_unprepare(priv->plat->clk_ptp_ref);
+
+		ret = pm_runtime_force_suspend(dev);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int stmmac_pltfr_noirq_resume(struct device *dev)
+{
+	struct net_device *ndev = dev_get_drvdata(dev);
+	struct stmmac_priv *priv = netdev_priv(ndev);
+	int ret;
+
+	if (!netif_running(ndev))
+		return 0;
+
+	if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
+		/* enable the clk previously disabled */
+		ret = pm_runtime_force_resume(dev);
+		if (ret)
+			return ret;
+
+		clk_prepare_enable(priv->plat->clk_ptp_ref);
+	}
+
+	return 0;
+}
+
 const struct dev_pm_ops stmmac_pltfr_pm_ops = {
 	SET_SYSTEM_SLEEP_PM_OPS(stmmac_pltfr_suspend, stmmac_pltfr_resume)
 	SET_RUNTIME_PM_OPS(stmmac_runtime_suspend, stmmac_runtime_resume, NULL)
+	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(stmmac_pltfr_noirq_suspend, stmmac_pltfr_noirq_resume)
 };
 EXPORT_SYMBOL_GPL(stmmac_pltfr_pm_ops);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
  2021-09-08  7:43 [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume Joakim Zhang
@ 2021-09-08 11:40 ` patchwork-bot+netdevbpf
  2021-09-09  8:46 ` kernel test robot
  2021-11-11 15:16 ` Vladimir Oltean
  2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-09-08 11:40 UTC (permalink / raw)
  To: Joakim Zhang
  Cc: peppe.cavallaro, alexandre.torgue, joabreu, davem, kuba,
	linux-imx, netdev, linux-kernel

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Wed,  8 Sep 2021 15:43:35 +0800 you wrote:
> commit 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after
> napi disabled"), this patch tries to fix system hang caused by eee_ctrl_timer,
> unfortunately, it only can resolve it for system reboot stress test. System
> hang also can be reproduced easily during system suspend/resume stess test
> when mount NFS on i.MX8MP EVK board.
> 
> In stmmac driver, eee feature is combined to phylink framework. When do
> system suspend, phylink_stop() would queue delayed work, it invokes
> stmmac_mac_link_down(), where to deactivate eee_ctrl_timer synchronizly.
> In above commit, try to fix issue by deactivating eee_ctrl_timer obviously,
> but it is not enough. Looking into eee_ctrl_timer expire callback
> stmmac_eee_ctrl_timer(), it could enable hareware eee mode again. What is
> unexpected is that LPI interrupt (MAC_Interrupt_Enable.LPIEN bit) is always
> asserted. This interrupt has chance to be issued when LPI state entry/exit
> from the MAC, and at that time, clock could have been already disabled.
> The result is that system hang when driver try to touch register from
> interrupt handler.
> 
> [...]

Here is the summary with links:
  - [net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
    https://git.kernel.org/netdev/net/c/276aae377206

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
  2021-09-08  7:43 [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume Joakim Zhang
  2021-09-08 11:40 ` patchwork-bot+netdevbpf
@ 2021-09-09  8:46 ` kernel test robot
  2021-11-11 15:16 ` Vladimir Oltean
  2 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2021-09-09  8:46 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3332 bytes --]

Hi Joakim,

I love your patch! Yet something to improve:

[auto build test ERROR on net/master]

url:    https://github.com/0day-ci/linux/commits/Joakim-Zhang/net-stmmac-fix-system-hang-caused-by-eee_ctrl_timer-during-suspend-resume/20210908-154605
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 626bf91a292e2035af5b9d9cce35c5c138dfe06d
config: nios2-allyesconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2336c3b3edee28ce2cbe0ca5f3a4e2cd40247c61
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Joakim-Zhang/net-stmmac-fix-system-hang-caused-by-eee_ctrl_timer-during-suspend-resume/20210908-154605
        git checkout 2336c3b3edee28ce2cbe0ca5f3a4e2cd40247c61
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=nios2 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function]
     796 | static int stmmac_pltfr_noirq_resume(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function]
     775 | static int stmmac_pltfr_noirq_suspend(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors


vim +/stmmac_pltfr_noirq_resume +796 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c

   774	
 > 775	static int stmmac_pltfr_noirq_suspend(struct device *dev)
   776	{
   777		struct net_device *ndev = dev_get_drvdata(dev);
   778		struct stmmac_priv *priv = netdev_priv(ndev);
   779		int ret;
   780	
   781		if (!netif_running(ndev))
   782			return 0;
   783	
   784		if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
   785			/* Disable clock in case of PWM is off */
   786			clk_disable_unprepare(priv->plat->clk_ptp_ref);
   787	
   788			ret = pm_runtime_force_suspend(dev);
   789			if (ret)
   790				return ret;
   791		}
   792	
   793		return 0;
   794	}
   795	
 > 796	static int stmmac_pltfr_noirq_resume(struct device *dev)
   797	{
   798		struct net_device *ndev = dev_get_drvdata(dev);
   799		struct stmmac_priv *priv = netdev_priv(ndev);
   800		int ret;
   801	
   802		if (!netif_running(ndev))
   803			return 0;
   804	
   805		if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
   806			/* enable the clk previously disabled */
   807			ret = pm_runtime_force_resume(dev);
   808			if (ret)
   809				return ret;
   810	
   811			clk_prepare_enable(priv->plat->clk_ptp_ref);
   812		}
   813	
   814		return 0;
   815	}
   816	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 60616 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
  2021-09-08  7:43 [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume Joakim Zhang
  2021-09-08 11:40 ` patchwork-bot+netdevbpf
  2021-09-09  8:46 ` kernel test robot
@ 2021-11-11 15:16 ` Vladimir Oltean
  2021-11-12  1:21   ` Joakim Zhang
  2 siblings, 1 reply; 6+ messages in thread
From: Vladimir Oltean @ 2021-11-11 15:16 UTC (permalink / raw)
  To: Joakim Zhang
  Cc: peppe.cavallaro, alexandre.torgue, joabreu, davem, kuba,
	linux-imx, netdev, linux-kernel

Hi Joakim,

On Wed, Sep 08, 2021 at 03:43:35PM +0800, Joakim Zhang wrote:
> commit 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after
> napi disabled"), this patch tries to fix system hang caused by eee_ctrl_timer,
> unfortunately, it only can resolve it for system reboot stress test. System
> hang also can be reproduced easily during system suspend/resume stess test
> when mount NFS on i.MX8MP EVK board.
> 
> In stmmac driver, eee feature is combined to phylink framework. When do
> system suspend, phylink_stop() would queue delayed work, it invokes
> stmmac_mac_link_down(), where to deactivate eee_ctrl_timer synchronizly.
> In above commit, try to fix issue by deactivating eee_ctrl_timer obviously,
> but it is not enough. Looking into eee_ctrl_timer expire callback
> stmmac_eee_ctrl_timer(), it could enable hareware eee mode again. What is
> unexpected is that LPI interrupt (MAC_Interrupt_Enable.LPIEN bit) is always
> asserted. This interrupt has chance to be issued when LPI state entry/exit
> from the MAC, and at that time, clock could have been already disabled.
> The result is that system hang when driver try to touch register from
> interrupt handler.
> 
> The reason why above commit can fix system hang issue in stmmac_release()
> is that, deactivate eee_ctrl_timer not just after napi disabled, further
> after irq freed.
> 
> In conclusion, hardware would generate LPI interrupt when clock has been
> disabled during suspend or resume, since hardware is in eee mode and LPI
> interrupt enabled.
> 
> Interrupts from MAC, MTL and DMA level are enabled and never been disabled
> when system suspend, so postpone clocks management from suspend stage to
> noirq suspend stage should be more safe.
> 
> Fixes: 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after napi disabled")
> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
> ---
>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 14 ------
>  .../ethernet/stmicro/stmmac/stmmac_platform.c | 44 +++++++++++++++++++
>  2 files changed, 44 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index ece02b35a6ce..246f84fedbc8 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -7118,7 +7118,6 @@ int stmmac_suspend(struct device *dev)
>  	struct net_device *ndev = dev_get_drvdata(dev);
>  	struct stmmac_priv *priv = netdev_priv(ndev);
>  	u32 chan;
> -	int ret;
>  
>  	if (!ndev || !netif_running(ndev))
>  		return 0;
> @@ -7150,13 +7149,6 @@ int stmmac_suspend(struct device *dev)
>  	} else {
>  		stmmac_mac_set(priv, priv->ioaddr, false);
>  		pinctrl_pm_select_sleep_state(priv->device);
> -		/* Disable clock in case of PWM is off */
> -		clk_disable_unprepare(priv->plat->clk_ptp_ref);

This patch looks strange to me. You are basically saying that the LPI
timer for MAC-level EEE is clocked from the clk_ptp_ref clock?! Are you
sure this is correct? I thought this clock is only used for the PTP
timestamping counter. Maybe the clock definitions in imx8mp.dtsi are not
correct?

> -		ret = pm_runtime_force_suspend(dev);
> -		if (ret) {
> -			mutex_unlock(&priv->lock);
> -			return ret;
> -		}
>  	}
>  
>  	mutex_unlock(&priv->lock);
> @@ -7242,12 +7234,6 @@ int stmmac_resume(struct device *dev)
>  		priv->irq_wake = 0;
>  	} else {
>  		pinctrl_pm_select_default_state(priv->device);
> -		/* enable the clk previously disabled */
> -		ret = pm_runtime_force_resume(dev);
> -		if (ret)
> -			return ret;
> -		if (priv->plat->clk_ptp_ref)
> -			clk_prepare_enable(priv->plat->clk_ptp_ref);
>  		/* reset the phy so that it's ready */
>  		if (priv->mii)
>  			stmmac_mdio_reset(priv->mii);
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> index 5ca710844cc1..4885f9ad1b1e 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> @@ -9,6 +9,7 @@
>  *******************************************************************************/
>  
>  #include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
>  #include <linux/module.h>
>  #include <linux/io.h>
>  #include <linux/of.h>
> @@ -771,9 +772,52 @@ static int __maybe_unused stmmac_runtime_resume(struct device *dev)
>  	return stmmac_bus_clks_config(priv, true);
>  }
>  
> +static int stmmac_pltfr_noirq_suspend(struct device *dev)
> +{
> +	struct net_device *ndev = dev_get_drvdata(dev);
> +	struct stmmac_priv *priv = netdev_priv(ndev);
> +	int ret;
> +
> +	if (!netif_running(ndev))
> +		return 0;
> +
> +	if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
> +		/* Disable clock in case of PWM is off */
> +		clk_disable_unprepare(priv->plat->clk_ptp_ref);
> +
> +		ret = pm_runtime_force_suspend(dev);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int stmmac_pltfr_noirq_resume(struct device *dev)
> +{
> +	struct net_device *ndev = dev_get_drvdata(dev);
> +	struct stmmac_priv *priv = netdev_priv(ndev);
> +	int ret;
> +
> +	if (!netif_running(ndev))
> +		return 0;
> +
> +	if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
> +		/* enable the clk previously disabled */
> +		ret = pm_runtime_force_resume(dev);
> +		if (ret)
> +			return ret;
> +
> +		clk_prepare_enable(priv->plat->clk_ptp_ref);
> +	}
> +
> +	return 0;
> +}
> +
>  const struct dev_pm_ops stmmac_pltfr_pm_ops = {
>  	SET_SYSTEM_SLEEP_PM_OPS(stmmac_pltfr_suspend, stmmac_pltfr_resume)
>  	SET_RUNTIME_PM_OPS(stmmac_runtime_suspend, stmmac_runtime_resume, NULL)
> +	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(stmmac_pltfr_noirq_suspend, stmmac_pltfr_noirq_resume)
>  };
>  EXPORT_SYMBOL_GPL(stmmac_pltfr_pm_ops);
>  
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
  2021-11-11 15:16 ` Vladimir Oltean
@ 2021-11-12  1:21   ` Joakim Zhang
  2021-11-12 13:43     ` Vladimir Oltean
  0 siblings, 1 reply; 6+ messages in thread
From: Joakim Zhang @ 2021-11-12  1:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: peppe.cavallaro, alexandre.torgue, joabreu, davem, kuba,
	dl-linux-imx, netdev, linux-kernel


Hi Vladimir,

> -----Original Message-----
> From: Vladimir Oltean <olteanv@gmail.com>
> Sent: 2021年11月11日 23:16
> To: Joakim Zhang <qiangqing.zhang@nxp.com>
> Cc: peppe.cavallaro@st.com; alexandre.torgue@foss.st.com;
> joabreu@synopsys.com; davem@davemloft.net; kuba@kernel.org;
> dl-linux-imx <linux-imx@nxp.com>; netdev@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH net] net: stmmac: fix system hang caused by
> eee_ctrl_timer during suspend/resume
> 
> Hi Joakim,
> 
> On Wed, Sep 08, 2021 at 03:43:35PM +0800, Joakim Zhang wrote:
> > commit 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after
> > napi disabled"), this patch tries to fix system hang caused by
> > eee_ctrl_timer, unfortunately, it only can resolve it for system
> > reboot stress test. System hang also can be reproduced easily during
> > system suspend/resume stess test when mount NFS on i.MX8MP EVK
> board.
> >
> > In stmmac driver, eee feature is combined to phylink framework. When
> > do system suspend, phylink_stop() would queue delayed work, it invokes
> > stmmac_mac_link_down(), where to deactivate eee_ctrl_timer
> synchronizly.
> > In above commit, try to fix issue by deactivating eee_ctrl_timer
> > obviously, but it is not enough. Looking into eee_ctrl_timer expire
> > callback stmmac_eee_ctrl_timer(), it could enable hareware eee mode
> > again. What is unexpected is that LPI interrupt
> > (MAC_Interrupt_Enable.LPIEN bit) is always asserted. This interrupt
> > has chance to be issued when LPI state entry/exit from the MAC, and at
> that time, clock could have been already disabled.
> > The result is that system hang when driver try to touch register from
> > interrupt handler.
> >
> > The reason why above commit can fix system hang issue in
> > stmmac_release() is that, deactivate eee_ctrl_timer not just after
> > napi disabled, further after irq freed.
> >
> > In conclusion, hardware would generate LPI interrupt when clock has
> > been disabled during suspend or resume, since hardware is in eee mode
> > and LPI interrupt enabled.
> >
> > Interrupts from MAC, MTL and DMA level are enabled and never been
> > disabled when system suspend, so postpone clocks management from
> > suspend stage to noirq suspend stage should be more safe.
> >
> > Fixes: 5f58591323bf ("net: stmmac: delete the eee_ctrl_timer after
> > napi disabled")
> > Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
> > ---
> >  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 14 ------
> > .../ethernet/stmicro/stmmac/stmmac_platform.c | 44
> +++++++++++++++++++
> >  2 files changed, 44 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index ece02b35a6ce..246f84fedbc8 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -7118,7 +7118,6 @@ int stmmac_suspend(struct device *dev)
> >  	struct net_device *ndev = dev_get_drvdata(dev);
> >  	struct stmmac_priv *priv = netdev_priv(ndev);
> >  	u32 chan;
> > -	int ret;
> >
> >  	if (!ndev || !netif_running(ndev))
> >  		return 0;
> > @@ -7150,13 +7149,6 @@ int stmmac_suspend(struct device *dev)
> >  	} else {
> >  		stmmac_mac_set(priv, priv->ioaddr, false);
> >  		pinctrl_pm_select_sleep_state(priv->device);
> > -		/* Disable clock in case of PWM is off */
> > -		clk_disable_unprepare(priv->plat->clk_ptp_ref);
> 
> This patch looks strange to me. You are basically saying that the LPI timer for
> MAC-level EEE is clocked from the clk_ptp_ref clock?! Are you sure this is
> correct? I thought this clock is only used for the PTP timestamping counter.
> Maybe the clock definitions in imx8mp.dtsi are not correct?

No, MAC-level EEE is not clocked from the clk_ptp_ref clock, this clock is for PTP. To fix this issue,
I can only move pm_runtime_force_suspend() into noirq suspend stage. As commit message said, 
"postpone clocks management from suspend stage to noirq suspend stage", that means I move all
the clocks management into noirq suspend stage, including PTP clock, it's should be harmless for PTP
function. Do you find any regression with this patch? 

Best Regards,
Joakim Zhang
> > -		ret = pm_runtime_force_suspend(dev);
> > -		if (ret) {
> > -			mutex_unlock(&priv->lock);
> > -			return ret;
> > -		}
> >  	}
> >
> >  	mutex_unlock(&priv->lock);
> > @@ -7242,12 +7234,6 @@ int stmmac_resume(struct device *dev)
> >  		priv->irq_wake = 0;
> >  	} else {
> >  		pinctrl_pm_select_default_state(priv->device);
> > -		/* enable the clk previously disabled */
> > -		ret = pm_runtime_force_resume(dev);
> > -		if (ret)
> > -			return ret;
> > -		if (priv->plat->clk_ptp_ref)
> > -			clk_prepare_enable(priv->plat->clk_ptp_ref);
> >  		/* reset the phy so that it's ready */
> >  		if (priv->mii)
> >  			stmmac_mdio_reset(priv->mii);
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > index 5ca710844cc1..4885f9ad1b1e 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > @@ -9,6 +9,7 @@
> >
> >
> **********************************************************
> ************
> > *********/
> >
> >  #include <linux/platform_device.h>
> > +#include <linux/pm_runtime.h>
> >  #include <linux/module.h>
> >  #include <linux/io.h>
> >  #include <linux/of.h>
> > @@ -771,9 +772,52 @@ static int __maybe_unused
> stmmac_runtime_resume(struct device *dev)
> >  	return stmmac_bus_clks_config(priv, true);  }
> >
> > +static int stmmac_pltfr_noirq_suspend(struct device *dev) {
> > +	struct net_device *ndev = dev_get_drvdata(dev);
> > +	struct stmmac_priv *priv = netdev_priv(ndev);
> > +	int ret;
> > +
> > +	if (!netif_running(ndev))
> > +		return 0;
> > +
> > +	if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
> > +		/* Disable clock in case of PWM is off */
> > +		clk_disable_unprepare(priv->plat->clk_ptp_ref);
> > +
> > +		ret = pm_runtime_force_suspend(dev);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int stmmac_pltfr_noirq_resume(struct device *dev) {
> > +	struct net_device *ndev = dev_get_drvdata(dev);
> > +	struct stmmac_priv *priv = netdev_priv(ndev);
> > +	int ret;
> > +
> > +	if (!netif_running(ndev))
> > +		return 0;
> > +
> > +	if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
> > +		/* enable the clk previously disabled */
> > +		ret = pm_runtime_force_resume(dev);
> > +		if (ret)
> > +			return ret;
> > +
> > +		clk_prepare_enable(priv->plat->clk_ptp_ref);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  const struct dev_pm_ops stmmac_pltfr_pm_ops = {
> >  	SET_SYSTEM_SLEEP_PM_OPS(stmmac_pltfr_suspend,
> stmmac_pltfr_resume)
> >  	SET_RUNTIME_PM_OPS(stmmac_runtime_suspend,
> stmmac_runtime_resume,
> > NULL)
> > +	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(stmmac_pltfr_noirq_suspend,
> > +stmmac_pltfr_noirq_resume)
> >  };
> >  EXPORT_SYMBOL_GPL(stmmac_pltfr_pm_ops);
> >
> > --
> > 2.17.1
> >

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
  2021-11-12  1:21   ` Joakim Zhang
@ 2021-11-12 13:43     ` Vladimir Oltean
  0 siblings, 0 replies; 6+ messages in thread
From: Vladimir Oltean @ 2021-11-12 13:43 UTC (permalink / raw)
  To: Joakim Zhang
  Cc: peppe.cavallaro, alexandre.torgue, joabreu, davem, kuba,
	dl-linux-imx, netdev, linux-kernel

On Fri, Nov 12, 2021 at 01:21:57AM +0000, Joakim Zhang wrote:
> Hi Vladimir,
>
> > Hi Joakim,
> >
> > This patch looks strange to me. You are basically saying that the LPI timer for
> > MAC-level EEE is clocked from the clk_ptp_ref clock?! Are you sure this is
> > correct? I thought this clock is only used for the PTP timestamping counter.
> > Maybe the clock definitions in imx8mp.dtsi are not correct?
>
> No, MAC-level EEE is not clocked from the clk_ptp_ref clock, this clock is for PTP. To fix this issue,
> I can only move pm_runtime_force_suspend() into noirq suspend stage. As commit message said,
> "postpone clocks management from suspend stage to noirq suspend stage", that means I move all
> the clocks management into noirq suspend stage, including PTP clock, it's should be harmless for PTP
> function. Do you find any regression with this patch?

No, I didn't find any regression, I was just frustrated yesterday that a
bugfix patch I wanted to send to "stable" on the handling of clk_ptp_ref
is conflicting with your (apparently unnecessary) movement of this clock
to the noirq suspend stage. I'm better now.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-11-12 13:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-08  7:43 [PATCH net] net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume Joakim Zhang
2021-09-08 11:40 ` patchwork-bot+netdevbpf
2021-09-09  8:46 ` kernel test robot
2021-11-11 15:16 ` Vladimir Oltean
2021-11-12  1:21   ` Joakim Zhang
2021-11-12 13:43     ` Vladimir Oltean

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.