linux-mmc.vger.kernel.org archive mirror
From: Bough Chen <haibo.chen@nxp.com>
To: Lucas Stach <l.stach@pengutronix.de>,
	Fabio Estevam <festevam@gmail.com>,
	Angus Ainslie <angus@akkea.ca>,
	Leonard Crestez <leonard.crestez@nxp.com>,
	Peng Fan <peng.fan@nxp.com>, Abel Vesa <abel.vesa@nxp.com>,
	Stephen Boyd <sboyd@kernel.org>,
	Michael Turquette <mturquette@baylibre.com>
Cc: "Ulf Hansson" <ulf.hansson@linaro.org>,
	"Guido Günther" <agx@sigxcpu.org>,
	linux-mmc <linux-mmc@vger.kernel.org>,
	"Adrian Hunter" <adrian.hunter@intel.com>,
	dl-linux-imx <linux-imx@nxp.com>,
	"Sascha Hauer" <kernel@pengutronix.de>,
	"moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE"
	<linux-arm-kernel@lists.infradead.org>
Subject: RE: sdhci timeout on imx8mq
Date: Wed, 6 Jan 2021 09:29:45 +0000
Message-ID: <VI1PR04MB52945F58FDD4A2902A30F2C290D00@VI1PR04MB5294.eurprd04.prod.outlook.com>
In-Reply-To: <cd99776c0107833d69c9c7fc4c8d6ba1a41ea3d7.camel@pengutronix.de>

> -----Original Message-----
> From: Lucas Stach [mailto:l.stach@pengutronix.de]
> Sent: January 5, 2021 23:07
> To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard Crestez
> <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel Vesa
> <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael Turquette
> <mturquette@baylibre.com>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha Hauer
> <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
> 
> Hi all,
> 
> On Wednesday, 08.07.2020 at 01:32 +0000, BOUGH CHEN wrote:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: July 7, 2020 20:45
> > > To: Angus Ainslie <angus@akkea.ca>
> > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Angus,
> > >
> > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > wrote:
> > >
> > > > Has there been any progress with this. I'm getting this on about
> > > > 50% of
> > >
> > > Not from my side, sorry.
> > >
> > > Bough,
> > >
> > > Do you know why this problem affects the imx8mq-evk versions that
> > > are populated with the Micron eMMC and not the ones with Sandisk
> > > eMMC?
> >
> > Hi Angus,
> >
> > Can you show me the full fail log? I do not meet this issue on my
> > side, besides, which kind of uboot do you use?
> 
> I was finally able to bisect this issue, which wasn't that much fun due to the
> issue not being reproducible 100%. :/ Turns out that the issue is even more
> interesting than I thought and likely doesn't have anything to do with SDHCI or
> used bootloader versions. Here's my current debugging state:
> 
> I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define gates for
> pll1/2 fixed dividers). The change itself looks fine to me, still CC'ed Leonard for
> good measure.
> 
> In my testing the following partial revert fixes the issue:
> 
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
>         hws[IMX8MQ_SYS1_PLL_133M_CG] = imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
>         hws[IMX8MQ_SYS1_PLL_160M_CG] = imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
>         hws[IMX8MQ_SYS1_PLL_200M_CG] = imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> -       hws[IMX8MQ_SYS1_PLL_266M_CG] = imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
>         hws[IMX8MQ_SYS1_PLL_400M_CG] = imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
>         hws[IMX8MQ_SYS1_PLL_800M_CG] = imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
> 
> @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
>         hws[IMX8MQ_SYS1_PLL_133M] = imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
>         hws[IMX8MQ_SYS1_PLL_160M] = imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
>         hws[IMX8MQ_SYS1_PLL_200M] = imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> -       hws[IMX8MQ_SYS1_PLL_266M] = imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> +       hws[IMX8MQ_SYS1_PLL_266M] = imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
>         hws[IMX8MQ_SYS1_PLL_400M] = imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
>         hws[IMX8MQ_SYS1_PLL_800M] = imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
> 
> The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated that the
> SDHCI driver properly enables this bus clock across the problematic card access.
> So what I think is happening here is that both nand_usdhc_bus and
> sys1_pll_266m are initially enabled. Sometime during boot sys1_pll_266m gets
> disabled due to runtime PM on the enet_axi clock, which is a direct child of
> sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but no consumer
> has claimed the clock yet, so the parent clock gets disabled while this branch of
> the clock tree is still active.

Hi Lucas,

According to the clock tree, as long as nand_usdhc_bus is still enabled, sys1_pll_266m cannot be disabled.

    clock                              enable  prepare  protect       rate  accuracy  phase   duty  hw_enable
    sys1_pll_266m_cg                        1        1        0  800000000         0      0  50000          Y
       sys1_pll_266m                        1        1        0  266666666         0      0  50000          Y
          nand_usdhc_bus                    0        0        0  266666666         0      0  50000          N
             nand_usdhc_rawnand_clk         0        0        0  266666666         0      0  50000          N
          enet_axi                          1        1        0  266666666         0      0  50000          Y
             enet1_root_clk                 2        2        0  266666666         0      0  50000          Y
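
The scenario being debated (a child clock still running in hardware while no consumer holds a reference to it) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `struct toy_clk` and the `toy_*` helpers are hypothetical names, not the kernel's clk framework, and the initial state (both gates left on by the bootloader, enable counts at zero) is an assumption taken from the description quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a clock node; NOT the kernel's struct clk_core. */
struct toy_clk {
	const char *name;
	struct toy_clk *parent;
	unsigned int enable_count; /* references held by software consumers */
	bool hw_on;                /* actual state of the gate in hardware */
};

/* First reference enables the parent chain, then ungates this clock. */
static void toy_clk_enable(struct toy_clk *clk)
{
	if (!clk)
		return;
	if (clk->enable_count++ == 0) {
		toy_clk_enable(clk->parent);
		clk->hw_on = true;
	}
}

/* Last reference gates this clock and releases the parent. */
static void toy_clk_disable(struct toy_clk *clk)
{
	if (!clk)
		return;
	assert(clk->enable_count > 0);
	if (--clk->enable_count == 0) {
		clk->hw_on = false;
		toy_clk_disable(clk->parent);
	}
}

/* True if the clock runs in hardware while its parent is gated. */
static bool toy_clk_orphaned_running(const struct toy_clk *clk)
{
	return clk->hw_on && clk->parent && !clk->parent->hw_on;
}

/* Assumed boot state: gate and bus are on in hardware, nobody holds them. */
static bool toy_scenario(void)
{
	struct toy_clk gate = { "sys1_pll_266m_cg", NULL, 0, true };
	struct toy_clk bus  = { "nand_usdhc_bus", &gate, 0, true };
	struct toy_clk enet = { "enet_axi", &gate, 0, false };

	toy_clk_enable(&enet);  /* ethernet driver resumes */
	toy_clk_disable(&enet); /* ethernet driver runtime-suspends */

	return toy_clk_orphaned_running(&bus);
}
```

In this model `toy_scenario()` reports that the bus clock is left running under a gated parent: the reference count alone protects the parent only for clocks that some consumer has actually enabled.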


This issue seems to be related to the following erratum:

e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS clocks
Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
Due to current physical design implementation, AHB_BUS and IPG_CLK must come from
same clock source to maintain clock sync.
Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.

After the sys1_pll_266m gate is switched off and on again, it seems the uSDHC AHB bus and the uSDHC IPG_CLK need to be synchronized again. (Here the uSDHC AHB bus is sourced from nand_usdhc_bus.)
This synchronization is handled by hardware and may take some time; during this period, uSDHC operations may fail.
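
The "same clock source" requirement of the erratum can be expressed as a check that both clocks resolve to the same root when walking up the parent chain. The sketch below is only an illustration on a toy tree: the `toy_*` names are hypothetical, and the parent chain assumed for the IPG clock (via `ahb` and `sys1_pll_133m`) is an assumption for the example, not taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy clock-tree node; names and topology are illustrative only. */
struct toy_node {
	const char *name;
	const struct toy_node *parent;
};

/* Two root PLLs. */
static const struct toy_node toy_pll1 = { "sys1_pll_out", NULL };
static const struct toy_node toy_pll2 = { "sys2_pll_out", NULL };

/* uSDHC AHB bus path, as described in the mail above. */
static const struct toy_node toy_266m = { "sys1_pll_266m", &toy_pll1 };
static const struct toy_node toy_usdhc_ahb = { "nand_usdhc_bus", &toy_266m };

/* Assumed IPG path for the example. */
static const struct toy_node toy_133m = { "sys1_pll_133m", &toy_pll1 };
static const struct toy_node toy_ahb = { "ahb", &toy_133m };
static const struct toy_node toy_ipg = { "ipg_root", &toy_ahb };

/* Hypothetical misconfiguration: a bus clock fed from the other PLL. */
static const struct toy_node toy_bad_bus = { "bus_from_pll2", &toy_pll2 };

/* Walk up the parent chain to the root source. */
static const struct toy_node *toy_root(const struct toy_node *n)
{
	assert(n != NULL);
	while (n->parent)
		n = n->parent;
	return n;
}

/* True if both clocks ultimately derive from the same root. */
static bool toy_same_source(const struct toy_node *a, const struct toy_node *b)
{
	return toy_root(a) == toy_root(b);
}
```

With this toy tree, `toy_same_source(&toy_usdhc_ahb, &toy_ipg)` holds, while pairing `toy_bad_bus` with `toy_ipg` does not. The check only covers the static source relationship; it says nothing about the resynchronization time after a gate cycle.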

I just double-checked our local v5.10 branch: it already reverts commit b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
So to fix this issue, one option is to revert this patch; another is to keep 'nand_usdhc_bus' always on, with a change like this:

diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
index 779ea69e639c..939806b36916 100644
--- a/drivers/clk/imx/clk-imx8mq.c
+++ b/drivers/clk/imx/clk-imx8mq.c
@@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
        /* BUS */
        hws[IMX8MQ_CLK_MAIN_AXI] = imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels, base + 0x8800);
        hws[IMX8MQ_CLK_ENET_AXI] = imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base + 0x8880);
-       hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
+       hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
        hws[IMX8MQ_CLK_VPU_BUS] = imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base + 0x8980);
        hws[IMX8MQ_CLK_DISP_AXI] = imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base + 0x8a00);
        hws[IMX8MQ_CLK_DISP_APB] = imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base + 0x8a80);
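
The effect of marking the bus clock critical can be sketched with a toy reference-count model. In the kernel, a clock registered with CLK_IS_CRITICAL is prepared and enabled by the framework at registration, so its parents keep a non-zero enable count. The code below only imitates that behaviour; `struct sketch_clk`, `SKETCH_CLK_IS_CRITICAL` and the `sketch_*` helpers are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's CLK_IS_CRITICAL flag. */
#define SKETCH_CLK_IS_CRITICAL 0x1u

struct sketch_clk {
	struct sketch_clk *parent;
	unsigned int flags;
	unsigned int enable_count;
};

/* Take a reference; propagate upwards only on the 0 -> 1 transition. */
static void sketch_enable(struct sketch_clk *c)
{
	for (; c; c = c->parent)
		if (c->enable_count++ != 0)
			break;
}

/* Drop a reference; propagate upwards only on the 1 -> 0 transition. */
static void sketch_disable(struct sketch_clk *c)
{
	for (; c; c = c->parent) {
		assert(c->enable_count > 0);
		if (--c->enable_count != 0)
			break;
	}
}

/* A critical clock takes a permanent reference at registration time. */
static void sketch_register(struct sketch_clk *c, struct sketch_clk *parent,
			    unsigned int flags)
{
	c->parent = parent;
	c->flags = flags;
	c->enable_count = 0;
	if (flags & SKETCH_CLK_IS_CRITICAL)
		sketch_enable(c);
}

/* Parent enable count after a sibling consumer (enet) comes and goes. */
static unsigned int sketch_parent_count_after_sibling_cycle(unsigned int bus_flags)
{
	struct sketch_clk pll_266m, bus, enet;

	sketch_register(&pll_266m, NULL, 0);
	sketch_register(&bus, &pll_266m, bus_flags);
	sketch_register(&enet, &pll_266m, 0);

	sketch_enable(&enet);
	sketch_disable(&enet);
	return pll_266m.enable_count;
}
```

Without the flag the parent count drops back to zero after the sibling cycle, so the parent gate would be switched off; with the flag the bus clock's own reference keeps the parent at a count of one. The cost of this approach is that the bus clock and its parent can never be gated for power saving.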


What do you think? Any other suggestions?

> 
> The reference manual states about this situation: "For any clock, its source
> must be left on when it is kept on. Behavior is undefined if this rule is violated."
> And it seems this is exactly what's happening here: some kind of glitch is
> introduced in the nand_usdhc_bus clock, which prevents the SDHCI controller
> from working, even though the clock branch is properly enabled later on. On my
> system the SDHCI timeout and following runtime suspend/resume cycle on the
> nand_usdhc_bus clock seem to get it back into a working state.
> 
> So I think we need some solution at the clock driver/framework level to prevent
> shutting down parent clocks that have active branches, even if those branches
> aren't claimed by a consumer (yet).
> 
> Regards,
> Lucas



Thread overview: 21+ messages
2020-02-03 19:19 sdhci timeout on imx8mq Fabio Estevam
2020-02-05  9:26 ` Guido Günther
2020-02-05 13:18   ` Fabio Estevam
2020-02-07  2:11     ` BOUGH CHEN
     [not found]       ` <VI1PR04MB504091C7991353F6092A8D91901A0@VI1PR04MB5040.eurprd04.prod.outlook.com>
2020-02-13 10:53         ` Fabio Estevam
2020-06-30 19:39           ` Angus Ainslie
2020-07-07 12:44             ` Fabio Estevam
2020-07-08  1:32               ` BOUGH CHEN
2020-12-18 20:07                 ` Lucas Stach
2020-12-18 20:45                   ` Angus Ainslie
2020-12-23 21:06                   ` Angus Ainslie
2021-01-05 15:06                 ` Lucas Stach
2021-01-06  9:29                   ` Bough Chen [this message]
2021-01-06 15:09                     ` Lucas Stach
2021-01-07  1:47                       ` Bough Chen
2021-01-06 18:56                   ` Fabio Estevam
2021-01-07  1:30                     ` Jacky Bai
2021-01-07 11:26                       ` Lucas Stach
2021-01-08  1:27                         ` Jacky Bai
2021-03-09  7:35                         ` Heiko Thiery
2021-01-19  2:35                   ` Peng Fan
