* Re: sdhci timeout on imx8mq
@ 2021-01-05 15:06 ` Lucas Stach
0 siblings, 0 replies; 42+ messages in thread
From: Lucas Stach @ 2021-01-05 15:06 UTC (permalink / raw)
To: BOUGH CHEN, Fabio Estevam, Angus Ainslie, Leonard Crestez,
Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi all,
Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > -----Original Message-----
> > From: Fabio Estevam [mailto:festevam@gmail.com]
> > Sent: 2020年7月7日 20:45
> > To: Angus Ainslie <angus@akkea.ca>
> > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>; linux-
> > mmc
> > <linux-mmc@vger.kernel.org>; Adrian Hunter
> > <adrian.hunter@intel.com>;
> > dl-linux-imx <linux-imx@nxp.com>; Sascha Hauer <
> > kernel@pengutronix.de>;
> > moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> > <linux-arm-kernel@lists.infradead.org>
> > Subject: Re: sdhci timeout on imx8mq
> >
> > Hi Angus,
> >
> > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > wrote:
> >
> > > Has there been any progress with this. I'm getting this on about
> > > 50%
> > > of
> >
> > Not from my side, sorry.
> >
> > Bough,
> >
> > Do you know why this problem affects the imx8mq-evk versions that
> > are
> > populated with the Micron eMMC and not the ones with Sandisk eMMC?
>
> Hi Angus,
>
> Can you show me the full fail log? I do not meet this issue on my
> side, besides, which kind of uboot do you use?
I was finally able to bisect this issue, which wasn't that much fun due
to the issue not being reproducible 100%. :/ Turns out that the issue
is even more interesting than I thought and likely doesn't have
anything to do with SDHCI or used bootloader versions. Here's my
current debugging state:
I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define gates
for pll1/2 fixed dividers). The change itself looks fine to me, still
CC'ed Leonard for good measure.
In my testing the following partial revert fixes the issue:
--- a/drivers/clk/imx/clk-imx8mq.c
+++ b/drivers/clk/imx/clk-imx8mq.c
@@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
hws[IMX8MQ_SYS1_PLL_133M_CG] = imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
hws[IMX8MQ_SYS1_PLL_160M_CG] = imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
hws[IMX8MQ_SYS1_PLL_200M_CG] = imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
- hws[IMX8MQ_SYS1_PLL_266M_CG] = imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
hws[IMX8MQ_SYS1_PLL_400M_CG] = imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
hws[IMX8MQ_SYS1_PLL_800M_CG] = imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
@@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
hws[IMX8MQ_SYS1_PLL_133M] = imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
hws[IMX8MQ_SYS1_PLL_160M] = imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
hws[IMX8MQ_SYS1_PLL_200M] = imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
- hws[IMX8MQ_SYS1_PLL_266M] = imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
+ hws[IMX8MQ_SYS1_PLL_266M] = imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
hws[IMX8MQ_SYS1_PLL_400M] = imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
hws[IMX8MQ_SYS1_PLL_800M] = imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated that
the SDHCI driver properly enables this bus clock across the problematic
card access. So what I think is happening here is that both
nand_usdhc_bus and sys1_pll_266m are initially enabled. Sometime during
boot sys1_pll_266m gets disabled due to runtime PM on the enet_axi
clock, which is a direct child of sys1_pll_266m. At this point
nand_usdhc_bus is still enabled, but no consumer has claimed the clock
yet, so the parent clock gets disabled while this branch of the clock
tree is still active.
The reference manual states about this situation: "For any clock, its
source must be left on when it is kept on. Behavior is undefined if
this rule is violated."
And it seems this is exactly what's happening here: some kind of glitch
is introduced in the nand_usdhc_bus clock, which prevents the SDHCI
controller from working, even though the clock branch is properly
enabled later on. On my system the SDHCI timeout and following runtime
suspend/resume cycle on the nand_usdhc_bus clock seem to get it back
into a working state.
So I think we need some solution at the clock driver/framework level to
prevent shutting down parent clocks that have active branches, even if
those branches aren't claimed by a consumer (yet).
Regards,
Lucas
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
2021-01-05 15:06 ` Lucas Stach
@ 2021-01-06 9:29 ` Bough Chen
-1 siblings, 0 replies; 42+ messages in thread
From: Bough Chen @ 2021-01-06 9:29 UTC (permalink / raw)
To: Lucas Stach, Fabio Estevam, Angus Ainslie, Leonard Crestez,
Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> -----Original Message-----
> From: Lucas Stach [mailto:l.stach@pengutronix.de]
> Sent: 2021年1月5日 23:07
> To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard Crestez
> <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel Vesa
> <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael Turquette
> <mturquette@baylibre.com>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha Hauer
> <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
>
> Hi all,
>
> Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: 2020年7月7日 20:45
> > > To: Angus Ainslie <angus@akkea.ca>
> > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>; linux-
> > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer < kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Angus,
> > >
> > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > wrote:
> > >
> > > > Has there been any progress with this. I'm getting this on about
> > > > 50% of
> > >
> > > Not from my side, sorry.
> > >
> > > Bough,
> > >
> > > Do you know why this problem affects the imx8mq-evk versions that
> > > are populated with the Micron eMMC and not the ones with Sandisk
> > > eMMC?
> >
> > Hi Angus,
> >
> > Can you show me the full fail log? I do not meet this issue on my
> > side, besides, which kind of uboot do you use?
>
> I was finally able to bisect this issue, which wasn't that much fun due to the
> issue not being reproducible 100%. :/ Turns out that the issue is even more
> interesting than I thought and likely doesn't have anything to do with SDHCI or
> used bootloader versions. Here's my current debugging state:
>
> I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define gates for
> pll1/2 fixed dividers). The change itself looks fine to me, still CC'ed Leonard for
> good measure.
>
> In my testing the following partial revert fixes the issue:
>
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M_CG] =
> imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
> hws[IMX8MQ_SYS1_PLL_160M_CG] =
> imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
> hws[IMX8MQ_SYS1_PLL_200M_CG] =
> imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
> hws[IMX8MQ_SYS1_PLL_400M_CG] =
> imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
> hws[IMX8MQ_SYS1_PLL_800M_CG] =
> imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
>
> @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M] =
> imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> hws[IMX8MQ_SYS1_PLL_160M] =
> imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> hws[IMX8MQ_SYS1_PLL_200M] =
> imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> - hws[IMX8MQ_SYS1_PLL_266M] =
> imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> + hws[IMX8MQ_SYS1_PLL_266M] =
> + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> hws[IMX8MQ_SYS1_PLL_400M] =
> imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> hws[IMX8MQ_SYS1_PLL_800M] =
> imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
>
> The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated that the
> SDHCI driver properly enables this bus clock across the problematic card access.
> So what I think is happening here is that both nand_usdhc_bus and
> sys1_pll_266m are initially enabled. Sometime during boot sys1_pll_266m gets
> disabled due to runtime PM on the enet_axi clock, which is a direct child of
> sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but no consumer
> has claimed the clock yet, so the parent clock gets disabled while this branch of
> the clock tree is still active.
Hi Lucas,
According to the clock tree, if nand_usdhc_bus is still enabled, then sys1_pll_266m has no chance to disable.
sys1_pll_266m_cg 1 1 0 800000000 0 0 50000 Y
sys1_pll_266m 1 1 0 266666666 0 0 50000 Y
nand_usdhc_bus 0 0 0 266666666 0 0 50000 N
nand_usdhc_rawnand_clk 0 0 0 266666666 0 0 50000 N
enet_axi 1 1 0 266666666 0 0 50000 Y
enet1_root_clk 2 2 0 266666666 0 0 50000 Y
This issue seems related with the following errta:
e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS clocks
Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
Due to current physical design implementation, AHB_BUS and IPG_CLK must come from
same clock source to maintain clock sync.
Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.
After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus and USDHC IPG_clk again. (Here usdhc AHB BUS source from nand_usdhc_bus.)
This sync is handle by hardware, and maybe need some time, during this sync period, usdhc operation may has issue.
I just double check our local v5.10 branch, already revert the commit b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
So to fix this issue, one method is revert this patch, another method is keep the 'nand_usdhc_bus' always on. Add change like this:
diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
index 779ea69e639c..939806b36916 100644
--- a/drivers/clk/imx/clk-imx8mq.c
+++ b/drivers/clk/imx/clk-imx8mq.c
@@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
/* BUS */
hws[IMX8MQ_CLK_MAIN_AXI] = imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels, base + 0x8800);
hws[IMX8MQ_CLK_ENET_AXI] = imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base + 0x8880);
- hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
+ hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
hws[IMX8MQ_CLK_VPU_BUS] = imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base + 0x8980);
hws[IMX8MQ_CLK_DISP_AXI] = imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base + 0x8a00);
hws[IMX8MQ_CLK_DISP_APB] = imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base + 0x8a80);
What you think? Or any other suggestion?
>
> The reference manual states about this situation: "For any clock, its source
> must be left on when it is kept on. Behavior is undefined if this rule is violated."
> And it seems this is exactly what's happening here: some kind of glitch is
> introduced in the nand_usdhc_bus clock, which prevents the SDHCI controller
> from working, even though the clock branch is properly enabled later on. On my
> system the SDHCI timeout and following runtime suspend/resume cycle on the
> nand_usdhc_bus clock seem to get it back into a working state.
>
> So I think we need some solution at the clock driver/framework level to prevent
> shutting down parent clocks that have active branches, even if those branches
> aren't claimed by a consumer (yet).
>
> Regards,
> Lucas
^ permalink raw reply related [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
@ 2021-01-06 9:29 ` Bough Chen
0 siblings, 0 replies; 42+ messages in thread
From: Bough Chen @ 2021-01-06 9:29 UTC (permalink / raw)
To: Lucas Stach, Fabio Estevam, Angus Ainslie, Leonard Crestez,
Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> -----Original Message-----
> From: Lucas Stach [mailto:l.stach@pengutronix.de]
> Sent: 2021年1月5日 23:07
> To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard Crestez
> <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel Vesa
> <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael Turquette
> <mturquette@baylibre.com>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha Hauer
> <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
>
> Hi all,
>
> Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: 2020年7月7日 20:45
> > > To: Angus Ainslie <angus@akkea.ca>
> > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>; linux-
> > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer < kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Angus,
> > >
> > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > wrote:
> > >
> > > > Has there been any progress with this. I'm getting this on about
> > > > 50% of
> > >
> > > Not from my side, sorry.
> > >
> > > Bough,
> > >
> > > Do you know why this problem affects the imx8mq-evk versions that
> > > are populated with the Micron eMMC and not the ones with Sandisk
> > > eMMC?
> >
> > Hi Angus,
> >
> > Can you show me the full fail log? I do not meet this issue on my
> > side, besides, which kind of uboot do you use?
>
> I was finally able to bisect this issue, which wasn't that much fun due to the
> issue not being reproducible 100%. :/ Turns out that the issue is even more
> interesting than I thought and likely doesn't have anything to do with SDHCI or
> used bootloader versions. Here's my current debugging state:
>
> I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define gates for
> pll1/2 fixed dividers). The change itself looks fine to me, still CC'ed Leonard for
> good measure.
>
> In my testing the following partial revert fixes the issue:
>
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M_CG] =
> imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
> hws[IMX8MQ_SYS1_PLL_160M_CG] =
> imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
> hws[IMX8MQ_SYS1_PLL_200M_CG] =
> imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
> hws[IMX8MQ_SYS1_PLL_400M_CG] =
> imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
> hws[IMX8MQ_SYS1_PLL_800M_CG] =
> imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
>
> @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M] =
> imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> hws[IMX8MQ_SYS1_PLL_160M] =
> imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> hws[IMX8MQ_SYS1_PLL_200M] =
> imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> - hws[IMX8MQ_SYS1_PLL_266M] =
> imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> + hws[IMX8MQ_SYS1_PLL_266M] =
> + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> hws[IMX8MQ_SYS1_PLL_400M] =
> imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> hws[IMX8MQ_SYS1_PLL_800M] =
> imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
>
> The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated that the
> SDHCI driver properly enables this bus clock across the problematic card access.
> So what I think is happening here is that both nand_usdhc_bus and
> sys1_pll_266m are initially enabled. Sometime during boot sys1_pll_266m gets
> disabled due to runtime PM on the enet_axi clock, which is a direct child of
> sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but no consumer
> has claimed the clock yet, so the parent clock gets disabled while this branch of
> the clock tree is still active.
Hi Lucas,
According to the clock tree, if nand_usdhc_bus is still enabled, then sys1_pll_266m has no chance to disable.
sys1_pll_266m_cg 1 1 0 800000000 0 0 50000 Y
sys1_pll_266m 1 1 0 266666666 0 0 50000 Y
nand_usdhc_bus 0 0 0 266666666 0 0 50000 N
nand_usdhc_rawnand_clk 0 0 0 266666666 0 0 50000 N
enet_axi 1 1 0 266666666 0 0 50000 Y
enet1_root_clk 2 2 0 266666666 0 0 50000 Y
This issue seems related with the following errta:
e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS clocks
Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
Due to current physical design implementation, AHB_BUS and IPG_CLK must come from
same clock source to maintain clock sync.
Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.
After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus and USDHC IPG_clk again. (Here usdhc AHB BUS source from nand_usdhc_bus.)
This sync is handle by hardware, and maybe need some time, during this sync period, usdhc operation may has issue.
I just double check our local v5.10 branch, already revert the commit b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
So to fix this issue, one method is revert this patch, another method is keep the 'nand_usdhc_bus' always on. Add change like this:
diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
index 779ea69e639c..939806b36916 100644
--- a/drivers/clk/imx/clk-imx8mq.c
+++ b/drivers/clk/imx/clk-imx8mq.c
@@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
/* BUS */
hws[IMX8MQ_CLK_MAIN_AXI] = imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels, base + 0x8800);
hws[IMX8MQ_CLK_ENET_AXI] = imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base + 0x8880);
- hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
+ hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
hws[IMX8MQ_CLK_VPU_BUS] = imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base + 0x8980);
hws[IMX8MQ_CLK_DISP_AXI] = imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base + 0x8a00);
hws[IMX8MQ_CLK_DISP_APB] = imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base + 0x8a80);
What you think? Or any other suggestion?
>
> The reference manual states about this situation: "For any clock, its source
> must be left on when it is kept on. Behavior is undefined if this rule is violated."
> And it seems this is exactly what's happening here: some kind of glitch is
> introduced in the nand_usdhc_bus clock, which prevents the SDHCI controller
> from working, even though the clock branch is properly enabled later on. On my
> system the SDHCI timeout and following runtime suspend/resume cycle on the
> nand_usdhc_bus clock seem to get it back into a working state.
>
> So I think we need some solution at the clock driver/framework level to prevent
> shutting down parent clocks that have active branches, even if those branches
> aren't claimed by a consumer (yet).
>
> Regards,
> Lucas
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
2021-01-06 9:29 ` Bough Chen
@ 2021-01-06 15:09 ` Lucas Stach
-1 siblings, 0 replies; 42+ messages in thread
From: Lucas Stach @ 2021-01-06 15:09 UTC (permalink / raw)
To: Bough Chen, Fabio Estevam, Angus Ainslie, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Bough,
Am Mittwoch, dem 06.01.2021 um 09:29 +0000 schrieb Bough Chen:
> > -----Original Message-----
> > From: Lucas Stach [mailto:l.stach@pengutronix.de]
> > Sent: 2021年1月5日 23:07
> > To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> > <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard
> > Crestez
> > <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel Vesa
> > <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael
> > Turquette
> > <mturquette@baylibre.com>
> > Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <
> > agx@sigxcpu.org>;
> > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > Hauer
> > <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> > ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > Subject: Re: sdhci timeout on imx8mq
> >
> > Hi all,
> >
> > Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > > -----Original Message-----
> > > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > > Sent: 2020年7月7日 20:45
> > > > To: Angus Ainslie <angus@akkea.ca>
> > > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > > linux-
> > > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>;
> > > > Sascha
> > > > Hauer < kernel@pengutronix.de>; moderated list:ARM/FREESCALE
> > > > IMX /
> > > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > > Subject: Re: sdhci timeout on imx8mq
> > > >
> > > > Hi Angus,
> > > >
> > > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > > wrote:
> > > >
> > > > > Has there been any progress with this. I'm getting this on
> > > > > about
> > > > > 50% of
> > > >
> > > > Not from my side, sorry.
> > > >
> > > > Bough,
> > > >
> > > > Do you know why this problem affects the imx8mq-evk versions
> > > > that
> > > > are populated with the Micron eMMC and not the ones with
> > > > Sandisk
> > > > eMMC?
> > >
> > > Hi Angus,
> > >
> > > Can you show me the full fail log? I do not meet this issue on my
> > > side, besides, which kind of uboot do you use?
> >
> > I was finally able to bisect this issue, which wasn't that much fun
> > due to the
> > issue not being reproducible 100%. :/ Turns out that the issue is
> > even more
> > interesting than I thought and likely doesn't have anything to do
> > with SDHCI or
> > used bootloader versions. Here's my current debugging state:
> >
> > I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define
> > gates for
> > pll1/2 fixed dividers). The change itself looks fine to me, still
> > CC'ed Leonard for
> > good measure.
> >
> > In my testing the following partial revert fixes the issue:
> >
> > --- a/drivers/clk/imx/clk-imx8mq.c
> > +++ b/drivers/clk/imx/clk-imx8mq.c
> > @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> > platform_device *pdev)
> > hws[IMX8MQ_SYS1_PLL_133M_CG] =
> > imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
> > hws[IMX8MQ_SYS1_PLL_160M_CG] =
> > imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
> > hws[IMX8MQ_SYS1_PLL_200M_CG] =
> > imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> > - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> > imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
> > hws[IMX8MQ_SYS1_PLL_400M_CG] =
> > imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
> > hws[IMX8MQ_SYS1_PLL_800M_CG] =
> > imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
> >
> > @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> > platform_device *pdev)
> > hws[IMX8MQ_SYS1_PLL_133M] =
> > imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> > hws[IMX8MQ_SYS1_PLL_160M] =
> > imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> > hws[IMX8MQ_SYS1_PLL_200M] =
> > imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> > - hws[IMX8MQ_SYS1_PLL_266M] =
> > imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> > + hws[IMX8MQ_SYS1_PLL_266M] =
> > + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> > hws[IMX8MQ_SYS1_PLL_400M] =
> > imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> > hws[IMX8MQ_SYS1_PLL_800M] =
> > imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
> >
> > The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated
> > that the
> > SDHCI driver properly enables this bus clock across the problematic
> > card access.
> > So what I think is happening here is that both nand_usdhc_bus and
> > sys1_pll_266m are initially enabled. Sometime during boot
> > sys1_pll_266m gets
> > disabled due to runtime PM on the enet_axi clock, which is a direct
> > child of
> > sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but
> > no consumer
> > has claimed the clock yet, so the parent clock gets disabled while
> > this branch of
> > the clock tree is still active.
>
> Hi Lucas,
>
> According to the clock tree, if nand_usdhc_bus is still enabled, then
> sys1_pll_266m has no chance to disable.
This statement is only correct after the SDHCI driver is probed an has
enabled nand_usdhc_bus. Before the driver probes the refcounts on the
clocks are not synchronized, so sys1_pll_266m_cg can be disabled, while
nand_usdhc_bus is enabled (from software running before Linux), even
though no consumer is using nand_usdhc_bus, yet.
> sys1_pll_266m_cg 1 1 0 800000000 0 0 50000 Y
> sys1_pll_266m 1 1 0 266666666 0 0 50000 Y
> nand_usdhc_bus 0 0 0 266666666 0 0 50000 N
> nand_usdhc_rawnand_clk 0 0 0 266666666 0 0 50000 N
> enet_axi 1 1 0 266666666 0 0 50000 Y
> enet1_root_clk 2 2 0 266666666 0 0 50000 Y
>
>
> This issue seems related with the following errta:
>
> e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS
> clocks
> Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
> Due to current physical design implementation, AHB_BUS and IPG_CLK
> must come from
> same clock source to maintain clock sync.
> Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.
>
> After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus
> and USDHC IPG_clk again. (Here usdhc AHB BUS source from
> nand_usdhc_bus.)
> This sync is handle by hardware, and maybe need some time, during
> this sync period, usdhc operation may has issue.
Where in HW is this synchronization done? If it's at the uSDHC
controller side, I would expect this issue to show up even with the
commit reverted, as nand_usdhc_bus gets gated due to runtime PM from
the controller side. The only difference with the commit in question is
that now the clock branch can be gated _before_ nand_usdhc_bus. If the
synchronization is done somewhere in the clock tree than this might be
an issue.
>
> I just double check our local v5.10 branch, already revert the commit
> b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
> So to fix this issue, one method is revert this patch, another method
> is keep the 'nand_usdhc_bus' always on. Add change like this:
>
> diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
> index 779ea69e639c..939806b36916 100644
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
> /* BUS */
> hws[IMX8MQ_CLK_MAIN_AXI] = imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels, base + 0x8800);
> hws[IMX8MQ_CLK_ENET_AXI] = imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base + 0x8880);
> - hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
> + hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
> hws[IMX8MQ_CLK_VPU_BUS] = imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base + 0x8980);
> hws[IMX8MQ_CLK_DISP_AXI] = imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base + 0x8a00);
> hws[IMX8MQ_CLK_DISP_APB] = imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base + 0x8a80);
>
>
> What you think? Or any other suggestion?
This is suboptimal, as it will not allow to gate the uSDHC controller
AHB clock in runtime suspend. Also my testing shows that it's the gate
_before_ the nand_usdhc_bus slice that's causing the issue. So my
minimal fix from the previous mail would still be better, as it allows
to gate the nand_usdhc_bus clock, while keeping sys1_pll_266m enabled.
Regards,
Lucas
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
@ 2021-01-06 15:09 ` Lucas Stach
0 siblings, 0 replies; 42+ messages in thread
From: Lucas Stach @ 2021-01-06 15:09 UTC (permalink / raw)
To: Bough Chen, Fabio Estevam, Angus Ainslie, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Bough,
Am Mittwoch, dem 06.01.2021 um 09:29 +0000 schrieb Bough Chen:
> > -----Original Message-----
> > From: Lucas Stach [mailto:l.stach@pengutronix.de]
> > Sent: 2021年1月5日 23:07
> > To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> > <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard
> > Crestez
> > <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel Vesa
> > <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael
> > Turquette
> > <mturquette@baylibre.com>
> > Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <
> > agx@sigxcpu.org>;
> > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > Hauer
> > <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> > ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > Subject: Re: sdhci timeout on imx8mq
> >
> > Hi all,
> >
> > Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > > -----Original Message-----
> > > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > > Sent: 2020年7月7日 20:45
> > > > To: Angus Ainslie <angus@akkea.ca>
> > > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > > linux-
> > > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>;
> > > > Sascha
> > > > Hauer < kernel@pengutronix.de>; moderated list:ARM/FREESCALE
> > > > IMX /
> > > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > > Subject: Re: sdhci timeout on imx8mq
> > > >
> > > > Hi Angus,
> > > >
> > > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > > wrote:
> > > >
> > > > > Has there been any progress with this. I'm getting this on
> > > > > about
> > > > > 50% of
> > > >
> > > > Not from my side, sorry.
> > > >
> > > > Bough,
> > > >
> > > > Do you know why this problem affects the imx8mq-evk versions
> > > > that
> > > > are populated with the Micron eMMC and not the ones with
> > > > Sandisk
> > > > eMMC?
> > >
> > > Hi Angus,
> > >
> > > Can you show me the full fail log? I do not meet this issue on my
> > > side, besides, which kind of uboot do you use?
> >
> > I was finally able to bisect this issue, which wasn't that much fun
> > due to the
> > issue not being reproducible 100%. :/ Turns out that the issue is
> > even more
> > interesting than I thought and likely doesn't have anything to do
> > with SDHCI or
> > used bootloader versions. Here's my current debugging state:
> >
> > I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define
> > gates for
> > pll1/2 fixed dividers). The change itself looks fine to me, still
> > CC'ed Leonard for
> > good measure.
> >
> > In my testing the following partial revert fixes the issue:
> >
> > --- a/drivers/clk/imx/clk-imx8mq.c
> > +++ b/drivers/clk/imx/clk-imx8mq.c
> > @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> > platform_device *pdev)
> > hws[IMX8MQ_SYS1_PLL_133M_CG] =
> > imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
> > hws[IMX8MQ_SYS1_PLL_160M_CG] =
> > imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
> > hws[IMX8MQ_SYS1_PLL_200M_CG] =
> > imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> > - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> > imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
> > hws[IMX8MQ_SYS1_PLL_400M_CG] =
> > imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
> > hws[IMX8MQ_SYS1_PLL_800M_CG] =
> > imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
> >
> > @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> > platform_device *pdev)
> > hws[IMX8MQ_SYS1_PLL_133M] =
> > imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> > hws[IMX8MQ_SYS1_PLL_160M] =
> > imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> > hws[IMX8MQ_SYS1_PLL_200M] =
> > imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> > - hws[IMX8MQ_SYS1_PLL_266M] =
> > imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> > + hws[IMX8MQ_SYS1_PLL_266M] =
> > + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> > hws[IMX8MQ_SYS1_PLL_400M] =
> > imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> > hws[IMX8MQ_SYS1_PLL_800M] =
> > imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
> >
> > The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated
> > that the
> > SDHCI driver properly enables this bus clock across the problematic
> > card access.
> > So what I think is happening here is that both nand_usdhc_bus and
> > sys1_pll_266m are initially enabled. Sometime during boot
> > sys1_pll_266m gets
> > disabled due to runtime PM on the enet_axi clock, which is a direct
> > child of
> > sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but
> > no consumer
> > has claimed the clock yet, so the parent clock gets disabled while
> > this branch of
> > the clock tree is still active.
>
> Hi Lucas,
>
> According to the clock tree, if nand_usdhc_bus is still enabled, then
> sys1_pll_266m has no chance to disable.
This statement is only correct after the SDHCI driver is probed an has
enabled nand_usdhc_bus. Before the driver probes the refcounts on the
clocks are not synchronized, so sys1_pll_266m_cg can be disabled, while
nand_usdhc_bus is enabled (from software running before Linux), even
though no consumer is using nand_usdhc_bus, yet.
> sys1_pll_266m_cg 1 1 0 800000000 0 0 50000 Y
> sys1_pll_266m 1 1 0 266666666 0 0 50000 Y
> nand_usdhc_bus 0 0 0 266666666 0 0 50000 N
> nand_usdhc_rawnand_clk 0 0 0 266666666 0 0 50000 N
> enet_axi 1 1 0 266666666 0 0 50000 Y
> enet1_root_clk 2 2 0 266666666 0 0 50000 Y
>
>
> This issue seems related with the following errta:
>
> e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS
> clocks
> Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
> Due to current physical design implementation, AHB_BUS and IPG_CLK
> must come from
> same clock source to maintain clock sync.
> Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.
>
> After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus
> and USDHC IPG_clk again. (Here usdhc AHB BUS source from
> nand_usdhc_bus.)
> This sync is handle by hardware, and maybe need some time, during
> this sync period, usdhc operation may has issue.
Where in HW is this synchronization done? If it's at the uSDHC
controller side, I would expect this issue to show up even with the
commit reverted, as nand_usdhc_bus gets gated due to runtime PM from
the controller side. The only difference with the commit in question is
that now the clock branch can be gated _before_ nand_usdhc_bus. If the
synchronization is done somewhere in the clock tree than this might be
an issue.
>
> I just double check our local v5.10 branch, already revert the commit
> b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
> So to fix this issue, one method is revert this patch, another method
> is keep the 'nand_usdhc_bus' always on. Add change like this:
>
> diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
> index 779ea69e639c..939806b36916 100644
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
> /* BUS */
> hws[IMX8MQ_CLK_MAIN_AXI] = imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels, base + 0x8800);
> hws[IMX8MQ_CLK_ENET_AXI] = imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base + 0x8880);
> - hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
> + hws[IMX8MQ_CLK_NAND_USDHC_BUS] = imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus", imx8mq_nand_usdhc_sels, base + 0x8900);
> hws[IMX8MQ_CLK_VPU_BUS] = imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base + 0x8980);
> hws[IMX8MQ_CLK_DISP_AXI] = imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base + 0x8a00);
> hws[IMX8MQ_CLK_DISP_APB] = imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base + 0x8a80);
>
>
> What you think? Or any other suggestion?
This is suboptimal, as it will not allow to gate the uSDHC controller
AHB clock in runtime suspend. Also my testing shows that it's the gate
_before_ the nand_usdhc_bus slice that's causing the issue. So my
minimal fix from the previous mail would still be better, as it allows
to gate the nand_usdhc_bus clock, while keeping sys1_pll_266m enabled.
Regards,
Lucas
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
2021-01-06 15:09 ` Lucas Stach
@ 2021-01-07 1:47 ` Bough Chen
-1 siblings, 0 replies; 42+ messages in thread
From: Bough Chen @ 2021-01-07 1:47 UTC (permalink / raw)
To: Lucas Stach, Fabio Estevam, Angus Ainslie, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette, Jacky Bai
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> -----Original Message-----
> From: Lucas Stach [mailto:l.stach@pengutronix.de]
> Sent: 2021年1月6日 23:10
> To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Peng Fan
> <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha Hauer
> <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
>
> Hi Bough,
>
> Am Mittwoch, dem 06.01.2021 um 09:29 +0000 schrieb Bough Chen:
> > > -----Original Message-----
> > > From: Lucas Stach [mailto:l.stach@pengutronix.de]
> > > Sent: 2021年1月5日 23:07
> > > To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> > > <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard
> > > Crestez <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel
> > > Vesa <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael
> > > Turquette <mturquette@baylibre.com>
> > > Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <
> > > agx@sigxcpu.org>; linux-mmc <linux-mmc@vger.kernel.org>; Adrian
> > > Hunter <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>;
> > > Sascha Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE
> > > IMX / MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi all,
> > >
> > > Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > > > -----Original Message-----
> > > > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > > > Sent: 2020年7月7日 20:45
> > > > > To: Angus Ainslie <angus@akkea.ca>
> > > > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > > > linux-
> > > > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>;
> > > > > Sascha Hauer < kernel@pengutronix.de>; moderated
> > > > > list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> > > > > <linux-arm-kernel@lists.infradead.org>
> > > > > Subject: Re: sdhci timeout on imx8mq
> > > > >
> > > > > Hi Angus,
> > > > >
> > > > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > > > wrote:
> > > > >
> > > > > > Has there been any progress with this. I'm getting this on
> > > > > > about 50% of
> > > > >
> > > > > Not from my side, sorry.
> > > > >
> > > > > Bough,
> > > > >
> > > > > Do you know why this problem affects the imx8mq-evk versions
> > > > > that are populated with the Micron eMMC and not the ones with
> > > > > Sandisk eMMC?
> > > >
> > > > Hi Angus,
> > > >
> > > > Can you show me the full fail log? I do not meet this issue on my
> > > > side, besides, which kind of uboot do you use?
> > >
> > > I was finally able to bisect this issue, which wasn't that much fun
> > > due to the issue not being reproducible 100%. :/ Turns out that the
> > > issue is even more interesting than I thought and likely doesn't
> > > have anything to do with SDHCI or used bootloader versions. Here's
> > > my current debugging state:
> > >
> > > I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define
> > > gates for
> > > pll1/2 fixed dividers). The change itself looks fine to me, still
> > > CC'ed Leonard for good measure.
> > >
> > > In my testing the following partial revert fixes the issue:
> > >
> > > --- a/drivers/clk/imx/clk-imx8mq.c
> > > +++ b/drivers/clk/imx/clk-imx8mq.c
> > > @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> > > platform_device *pdev)
> > > hws[IMX8MQ_SYS1_PLL_133M_CG] =
> > > imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30,
> > > 15);
> > > hws[IMX8MQ_SYS1_PLL_160M_CG] =
> > > imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30,
> > > 17);
> > > hws[IMX8MQ_SYS1_PLL_200M_CG] =
> > > imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> > > - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> > > imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30,
> > > 21);
> > > hws[IMX8MQ_SYS1_PLL_400M_CG] =
> > > imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30,
> > > 23);
> > > hws[IMX8MQ_SYS1_PLL_800M_CG] =
> > > imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30,
> > > 25);
> > >
> > > @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> > > platform_device *pdev)
> > > hws[IMX8MQ_SYS1_PLL_133M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> > > hws[IMX8MQ_SYS1_PLL_160M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> > > hws[IMX8MQ_SYS1_PLL_200M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> > > - hws[IMX8MQ_SYS1_PLL_266M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> > > + hws[IMX8MQ_SYS1_PLL_266M] =
> > > + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> > > hws[IMX8MQ_SYS1_PLL_400M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> > > hws[IMX8MQ_SYS1_PLL_800M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
> > >
> > > The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated
> > > that the SDHCI driver properly enables this bus clock across the
> > > problematic card access.
> > > So what I think is happening here is that both nand_usdhc_bus and
> > > sys1_pll_266m are initially enabled. Sometime during boot
> > > sys1_pll_266m gets disabled due to runtime PM on the enet_axi clock,
> > > which is a direct child of sys1_pll_266m. At this point
> > > nand_usdhc_bus is still enabled, but no consumer has claimed the
> > > clock yet, so the parent clock gets disabled while this branch of
> > > the clock tree is still active.
> >
> > Hi Lucas,
> >
> > According to the clock tree, if nand_usdhc_bus is still enabled, then
> > sys1_pll_266m has no chance to disable.
>
> This statement is only correct after the SDHCI driver is probed an has enabled
> nand_usdhc_bus. Before the driver probes the refcounts on the clocks are not
> synchronized, so sys1_pll_266m_cg can be disabled, while nand_usdhc_bus is
> enabled (from software running before Linux), even though no consumer is
> using nand_usdhc_bus, yet.
Yes, agree. For current case, uboot gate on the sys1_pll_266m, then boot the Linux.
In Linux, after clock driver probe, due the the support of sys1_pll_266m_cg, this sys1_pll_266m is gate off by clock driver due to no default consumer.
>
> > sys1_pll_266m_cg 1 1 0
> 800000000 0 0 50000 Y
> > sys1_pll_266m 1 1 0
> 266666666 0 0 50000 Y
> > nand_usdhc_bus 0 0 0
> 266666666 0 0 50000 N
> > nand_usdhc_rawnand_clk 0 0 0
> 266666666 0 0 50000 N
> > enet_axi 1 1 0
> 266666666 0 0 50000 Y
> > enet1_root_clk 2 2 0
> 266666666 0 0 50000 Y
> >
> >
> > This issue seems related with the following errta:
> >
> > e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS
> > clocks
> > Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
> > Due to current physical design implementation, AHB_BUS and IPG_CLK
> > must come from same clock source to maintain clock sync.
> > Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.
> >
> > After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus
> > and USDHC IPG_clk again. (Here usdhc AHB BUS source from
> > nand_usdhc_bus.)
> > This sync is handle by hardware, and maybe need some time, during this
> > sync period, usdhc operation may has issue.
>
> Where in HW is this synchronization done? If it's at the uSDHC controller side, I
> would expect this issue to show up even with the commit reverted, as
> nand_usdhc_bus gets gated due to runtime PM from the controller side. The
> only difference with the commit in question is that now the clock branch can be
> gated _before_ nand_usdhc_bus. If the synchronization is done somewhere in
> the clock tree than this might be an issue.
>
Not in uSDHC side. This synchronization should be done somewhere in clock tree(hardware side).
> >
> > I just double check our local v5.10 branch, already revert the commit
> > b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
> > So to fix this issue, one method is revert this patch, another method
> > is keep the 'nand_usdhc_bus' always on. Add change like this:
> >
> > diff --git a/drivers/clk/imx/clk-imx8mq.c
> > b/drivers/clk/imx/clk-imx8mq.c index 779ea69e639c..939806b36916 100644
> > --- a/drivers/clk/imx/clk-imx8mq.c
> > +++ b/drivers/clk/imx/clk-imx8mq.c
> > @@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct
> > platform_device *pdev)
> > /* BUS */
> > hws[IMX8MQ_CLK_MAIN_AXI] =
> > imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels,
> > base + 0x8800);
> > hws[IMX8MQ_CLK_ENET_AXI] =
> imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base +
> 0x8880);
> > - hws[IMX8MQ_CLK_NAND_USDHC_BUS] =
> imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels,
> base + 0x8900);
> > + hws[IMX8MQ_CLK_NAND_USDHC_BUS] =
> > + imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus",
> > + imx8mq_nand_usdhc_sels, base + 0x8900);
> > hws[IMX8MQ_CLK_VPU_BUS] =
> > imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base +
> > 0x8980);
> > hws[IMX8MQ_CLK_DISP_AXI] =
> > imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base +
> > 0x8a00);
> > hws[IMX8MQ_CLK_DISP_APB] =
> > imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base +
> > 0x8a80);
> >
> >
> > What you think? Or any other suggestion?
>
> This is suboptimal, as it will not allow to gate the uSDHC controller AHB clock in
> runtime suspend. Also my testing shows that it's the gate _before_ the
> nand_usdhc_bus slice that's causing the issue. So my minimal fix from the
> previous mail would still be better, as it allows to gate the nand_usdhc_bus
> clock, while keeping sys1_pll_266m enabled.
Whether to choose your minimal fix or revert the commit, let's involve clock team member, Abel/Jacky, any comment?
Our local tree just revert this commit, I think there are some other reason, Jacky, could you help clarify that?
Best Regards
Haibo Chen
>
> Regards,
> Lucas
> >
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
@ 2021-01-07 1:47 ` Bough Chen
0 siblings, 0 replies; 42+ messages in thread
From: Bough Chen @ 2021-01-07 1:47 UTC (permalink / raw)
To: Lucas Stach, Fabio Estevam, Angus Ainslie, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette, Jacky Bai
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> -----Original Message-----
> From: Lucas Stach [mailto:l.stach@pengutronix.de]
> Sent: 2021年1月6日 23:10
> To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Peng Fan
> <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha Hauer
> <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC ARM
> ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
>
> Hi Bough,
>
> Am Mittwoch, dem 06.01.2021 um 09:29 +0000 schrieb Bough Chen:
> > > -----Original Message-----
> > > From: Lucas Stach [mailto:l.stach@pengutronix.de]
> > > Sent: 2021年1月5日 23:07
> > > To: Bough Chen <haibo.chen@nxp.com>; Fabio Estevam
> > > <festevam@gmail.com>; Angus Ainslie <angus@akkea.ca>; Leonard
> > > Crestez <leonard.crestez@nxp.com>; Peng Fan <peng.fan@nxp.com>; Abel
> > > Vesa <abel.vesa@nxp.com>; Stephen Boyd <sboyd@kernel.org>; Michael
> > > Turquette <mturquette@baylibre.com>
> > > Cc: Ulf Hansson <ulf.hansson@linaro.org>; Guido Günther <
> > > agx@sigxcpu.org>; linux-mmc <linux-mmc@vger.kernel.org>; Adrian
> > > Hunter <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>;
> > > Sascha Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE
> > > IMX / MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi all,
> > >
> > > Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > > > -----Original Message-----
> > > > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > > > Sent: 2020年7月7日 20:45
> > > > > To: Angus Ainslie <angus@akkea.ca>
> > > > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > > > linux-
> > > > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>;
> > > > > Sascha Hauer < kernel@pengutronix.de>; moderated
> > > > > list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> > > > > <linux-arm-kernel@lists.infradead.org>
> > > > > Subject: Re: sdhci timeout on imx8mq
> > > > >
> > > > > Hi Angus,
> > > > >
> > > > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > > > wrote:
> > > > >
> > > > > > Has there been any progress with this. I'm getting this on
> > > > > > about 50% of
> > > > >
> > > > > Not from my side, sorry.
> > > > >
> > > > > Bough,
> > > > >
> > > > > Do you know why this problem affects the imx8mq-evk versions
> > > > > that are populated with the Micron eMMC and not the ones with
> > > > > Sandisk eMMC?
> > > >
> > > > Hi Angus,
> > > >
> > > > Can you show me the full fail log? I do not meet this issue on my
> > > > side, besides, which kind of uboot do you use?
> > >
> > > I was finally able to bisect this issue, which wasn't that much fun
> > > due to the issue not being reproducible 100%. :/ Turns out that the
> > > issue is even more interesting than I thought and likely doesn't
> > > have anything to do with SDHCI or used bootloader versions. Here's
> > > my current debugging state:
> > >
> > > I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define
> > > gates for
> > > pll1/2 fixed dividers). The change itself looks fine to me, still
> > > CC'ed Leonard for good measure.
> > >
> > > In my testing the following partial revert fixes the issue:
> > >
> > > --- a/drivers/clk/imx/clk-imx8mq.c
> > > +++ b/drivers/clk/imx/clk-imx8mq.c
> > > @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> > > platform_device *pdev)
> > > hws[IMX8MQ_SYS1_PLL_133M_CG] =
> > > imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30,
> > > 15);
> > > hws[IMX8MQ_SYS1_PLL_160M_CG] =
> > > imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30,
> > > 17);
> > > hws[IMX8MQ_SYS1_PLL_200M_CG] =
> > > imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> > > - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> > > imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30,
> > > 21);
> > > hws[IMX8MQ_SYS1_PLL_400M_CG] =
> > > imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30,
> > > 23);
> > > hws[IMX8MQ_SYS1_PLL_800M_CG] =
> > > imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30,
> > > 25);
> > >
> > > @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> > > platform_device *pdev)
> > > hws[IMX8MQ_SYS1_PLL_133M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> > > hws[IMX8MQ_SYS1_PLL_160M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> > > hws[IMX8MQ_SYS1_PLL_200M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> > > - hws[IMX8MQ_SYS1_PLL_266M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> > > + hws[IMX8MQ_SYS1_PLL_266M] =
> > > + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> > > hws[IMX8MQ_SYS1_PLL_400M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> > > hws[IMX8MQ_SYS1_PLL_800M] =
> > > imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
> > >
> > > The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated
> > > that the SDHCI driver properly enables this bus clock across the
> > > problematic card access.
> > > So what I think is happening here is that both nand_usdhc_bus and
> > > sys1_pll_266m are initially enabled. Sometime during boot
> > > sys1_pll_266m gets disabled due to runtime PM on the enet_axi clock,
> > > which is a direct child of sys1_pll_266m. At this point
> > > nand_usdhc_bus is still enabled, but no consumer has claimed the
> > > clock yet, so the parent clock gets disabled while this branch of
> > > the clock tree is still active.
> >
> > Hi Lucas,
> >
> > According to the clock tree, if nand_usdhc_bus is still enabled, then
> > sys1_pll_266m has no chance to disable.
>
> This statement is only correct after the SDHCI driver is probed an has enabled
> nand_usdhc_bus. Before the driver probes the refcounts on the clocks are not
> synchronized, so sys1_pll_266m_cg can be disabled, while nand_usdhc_bus is
> enabled (from software running before Linux), even though no consumer is
> using nand_usdhc_bus, yet.
Yes, agree. For current case, uboot gate on the sys1_pll_266m, then boot the Linux.
In Linux, after clock driver probe, due the the support of sys1_pll_266m_cg, this sys1_pll_266m is gate off by clock driver due to no default consumer.
>
> > sys1_pll_266m_cg 1 1 0
> 800000000 0 0 50000 Y
> > sys1_pll_266m 1 1 0
> 266666666 0 0 50000 Y
> > nand_usdhc_bus 0 0 0
> 266666666 0 0 50000 N
> > nand_usdhc_rawnand_clk 0 0 0
> 266666666 0 0 50000 N
> > enet_axi 1 1 0
> 266666666 0 0 50000 Y
> > enet1_root_clk 2 2 0
> 266666666 0 0 50000 Y
> >
> >
> > This issue seems related with the following errta:
> >
> > e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS
> > clocks
> > Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized.
> > Due to current physical design implementation, AHB_BUS and IPG_CLK
> > must come from same clock source to maintain clock sync.
> > Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1.
> >
> > After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus
> > and USDHC IPG_clk again. (Here usdhc AHB BUS source from
> > nand_usdhc_bus.)
> > This sync is handle by hardware, and maybe need some time, during this
> > sync period, usdhc operation may has issue.
>
> Where in HW is this synchronization done? If it's at the uSDHC controller side, I
> would expect this issue to show up even with the commit reverted, as
> nand_usdhc_bus gets gated due to runtime PM from the controller side. The
> only difference with the commit in question is that now the clock branch can be
> gated _before_ nand_usdhc_bus. If the synchronization is done somewhere in
> the clock tree than this might be an issue.
>
Not in uSDHC side. This synchronization should be done somewhere in clock tree(hardware side).
> >
> > I just double check our local v5.10 branch, already revert the commit
> > b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers).
> > So to fix this issue, one method is revert this patch, another method
> > is keep the 'nand_usdhc_bus' always on. Add change like this:
> >
> > diff --git a/drivers/clk/imx/clk-imx8mq.c
> > b/drivers/clk/imx/clk-imx8mq.c index 779ea69e639c..939806b36916 100644
> > --- a/drivers/clk/imx/clk-imx8mq.c
> > +++ b/drivers/clk/imx/clk-imx8mq.c
> > @@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct
> > platform_device *pdev)
> > /* BUS */
> > hws[IMX8MQ_CLK_MAIN_AXI] =
> > imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels,
> > base + 0x8800);
> > hws[IMX8MQ_CLK_ENET_AXI] =
> imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base +
> 0x8880);
> > - hws[IMX8MQ_CLK_NAND_USDHC_BUS] =
> imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels,
> base + 0x8900);
> > + hws[IMX8MQ_CLK_NAND_USDHC_BUS] =
> > + imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus",
> > + imx8mq_nand_usdhc_sels, base + 0x8900);
> > hws[IMX8MQ_CLK_VPU_BUS] =
> > imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base +
> > 0x8980);
> > hws[IMX8MQ_CLK_DISP_AXI] =
> > imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base +
> > 0x8a00);
> > hws[IMX8MQ_CLK_DISP_APB] =
> > imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base +
> > 0x8a80);
> >
> >
> > What you think? Or any other suggestion?
>
> This is suboptimal, as it will not allow to gate the uSDHC controller AHB clock in
> runtime suspend. Also my testing shows that it's the gate _before_ the
> nand_usdhc_bus slice that's causing the issue. So my minimal fix from the
> previous mail would still be better, as it allows to gate the nand_usdhc_bus
> clock, while keeping sys1_pll_266m enabled.
Whether to choose your minimal fix or revert the commit, let's involve clock team member, Abel/Jacky, any comment?
Our local tree just revert this commit, I think there are some other reason, Jacky, could you help clarify that?
Best Regards
Haibo Chen
>
> Regards,
> Lucas
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
2021-01-05 15:06 ` Lucas Stach
@ 2021-01-06 18:56 ` Fabio Estevam
-1 siblings, 0 replies; 42+ messages in thread
From: Fabio Estevam @ 2021-01-06 18:56 UTC (permalink / raw)
To: Lucas Stach
Cc: BOUGH CHEN, Angus Ainslie, Leonard Crestez, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette, Ulf Hansson, Guido Günther,
linux-mmc, Adrian Hunter, dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Lucas,
On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@pengutronix.de> wrote:
> The reference manual states about this situation: "For any clock, its
> source must be left on when it is kept on. Behavior is undefined if
> this rule is violated."
> And it seems this is exactly what's happening here: some kind of glitch
> is introduced in the nand_usdhc_bus clock, which prevents the SDHCI
> controller from working, even though the clock branch is properly
> enabled later on. On my system the SDHCI timeout and following runtime
> suspend/resume cycle on the nand_usdhc_bus clock seem to get it back
> into a working state.
I think your analysis is correct and I recall helping a customer with
a similar issue:
https://community.nxp.com/t5/i-MX-Processors/External-clock-that-provide-root-clock-for-SAI3-and-SPDIF/m-p/1019834
Regards,
Fabio Estevam
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
@ 2021-01-06 18:56 ` Fabio Estevam
0 siblings, 0 replies; 42+ messages in thread
From: Fabio Estevam @ 2021-01-06 18:56 UTC (permalink / raw)
To: Lucas Stach
Cc: Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette,
Angus Ainslie, linux-mmc, Ulf Hansson, BOUGH CHEN, Adrian Hunter,
dl-linux-imx, Sascha Hauer, Leonard Crestez, Guido Günther,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Lucas,
On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@pengutronix.de> wrote:
> The reference manual states about this situation: "For any clock, its
> source must be left on when it is kept on. Behavior is undefined if
> this rule is violated."
> And it seems this is exactly what's happening here: some kind of glitch
> is introduced in the nand_usdhc_bus clock, which prevents the SDHCI
> controller from working, even though the clock branch is properly
> enabled later on. On my system the SDHCI timeout and following runtime
> suspend/resume cycle on the nand_usdhc_bus clock seem to get it back
> into a working state.
I think your analysis is correct and I recall helping a customer with
a similar issue:
https://community.nxp.com/t5/i-MX-Processors/External-clock-that-provide-root-clock-for-SAI3-and-SPDIF/m-p/1019834
Regards,
Fabio Estevam
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
2021-01-06 18:56 ` Fabio Estevam
@ 2021-01-07 1:30 ` Jacky Bai
-1 siblings, 0 replies; 42+ messages in thread
From: Jacky Bai @ 2021-01-07 1:30 UTC (permalink / raw)
To: Fabio Estevam, Lucas Stach
Cc: Bough Chen, Angus Ainslie, Leonard Crestez, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette, Ulf Hansson, Guido Günther,
linux-mmc, Adrian Hunter, dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> -----Original Message-----
> From: Fabio Estevam [mailto:festevam@gmail.com]
> Sent: Thursday, January 7, 2021 2:57 AM
> To: Lucas Stach <l.stach@pengutronix.de>
> Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie <angus@akkea.ca>;
> Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>; Ulf
> Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC
> ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
>
> Hi Lucas,
>
> On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@pengutronix.de>
> wrote:
>
> > The reference manual states about this situation: "For any clock, its
> > source must be left on when it is kept on. Behavior is undefined if
> > this rule is violated."
> > And it seems this is exactly what's happening here: some kind of
> > glitch is introduced in the nand_usdhc_bus clock, which prevents the
> > SDHCI controller from working, even though the clock branch is
> > properly enabled later on. On my system the SDHCI timeout and
> > following runtime suspend/resume cycle on the nand_usdhc_bus clock
> > seem to get it back into a working state.
>
> I think your analysis is correct and I recall helping a customer with a similar
> issue:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcomm
> unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-root
> -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d3bc
> 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> SyQhGeSHYysX1Co4%3D&reserved=0
>
For the customer case, it seem not the same issue. the customer issue is caused by clock source change while parent has no clock output.
This is inherit limitation for the CCM clock slice when using the smart interface to change the clock parent.
For current mmc timeout issue, I think we can have a try with nand_usdhc_bus clock gated at the beginning of kernel boot, directly modify the nand_usdhc_bus
Clock's HW register gate bit in clock-imx8mq.c.
BR
Jacky Bai
> Regards,
>
> Fabio Estevam
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
@ 2021-01-07 1:30 ` Jacky Bai
0 siblings, 0 replies; 42+ messages in thread
From: Jacky Bai @ 2021-01-07 1:30 UTC (permalink / raw)
To: Fabio Estevam, Lucas Stach
Cc: Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette,
Angus Ainslie, linux-mmc, Ulf Hansson, Bough Chen, Adrian Hunter,
dl-linux-imx, Sascha Hauer, Leonard Crestez, Guido Günther,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> -----Original Message-----
> From: Fabio Estevam [mailto:festevam@gmail.com]
> Sent: Thursday, January 7, 2021 2:57 AM
> To: Lucas Stach <l.stach@pengutronix.de>
> Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie <angus@akkea.ca>;
> Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>; Ulf
> Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX / MXC
> ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> Subject: Re: sdhci timeout on imx8mq
>
> Hi Lucas,
>
> On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@pengutronix.de>
> wrote:
>
> > The reference manual states about this situation: "For any clock, its
> > source must be left on when it is kept on. Behavior is undefined if
> > this rule is violated."
> > And it seems this is exactly what's happening here: some kind of
> > glitch is introduced in the nand_usdhc_bus clock, which prevents the
> > SDHCI controller from working, even though the clock branch is
> > properly enabled later on. On my system the SDHCI timeout and
> > following runtime suspend/resume cycle on the nand_usdhc_bus clock
> > seem to get it back into a working state.
>
> I think your analysis is correct and I recall helping a customer with a similar
> issue:
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcomm
> unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-root
> -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d3bc
> 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> SyQhGeSHYysX1Co4%3D&reserved=0
>
For the customer case, it seem not the same issue. the customer issue is caused by clock source change while parent has no clock output.
This is inherit limitation for the CCM clock slice when using the smart interface to change the clock parent.
For current mmc timeout issue, I think we can have a try with nand_usdhc_bus clock gated at the beginning of kernel boot, directly modify the nand_usdhc_bus
Clock's HW register gate bit in clock-imx8mq.c.
BR
Jacky Bai
> Regards,
>
> Fabio Estevam
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
2021-01-07 1:30 ` Jacky Bai
@ 2021-01-07 11:26 ` Lucas Stach
-1 siblings, 0 replies; 42+ messages in thread
From: Lucas Stach @ 2021-01-07 11:26 UTC (permalink / raw)
To: Jacky Bai, Fabio Estevam
Cc: Bough Chen, Angus Ainslie, Leonard Crestez, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette, Ulf Hansson, Guido Günther,
linux-mmc, Adrian Hunter, dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Jacky,
Am Donnerstag, dem 07.01.2021 um 01:30 +0000 schrieb Jacky Bai:
> > -----Original Message-----
> > From: Fabio Estevam [mailto:festevam@gmail.com]
> > Sent: Thursday, January 7, 2021 2:57 AM
> > To: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie
> > <angus@akkea.ca>;
> > Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> > <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> > <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>;
> > Ulf
> > Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > MXC
> > ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > Subject: Re: sdhci timeout on imx8mq
> >
> > Hi Lucas,
> >
> > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach
> > <l.stach@pengutronix.de>
> > wrote:
> >
> > > The reference manual states about this situation: "For any clock,
> > > its
> > > source must be left on when it is kept on. Behavior is undefined
> > > if
> > > this rule is violated."
> > > And it seems this is exactly what's happening here: some kind of
> > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > the
> > > SDHCI controller from working, even though the clock branch is
> > > properly enabled later on. On my system the SDHCI timeout and
> > > following runtime suspend/resume cycle on the nand_usdhc_bus
> > > clock
> > > seem to get it back into a working state.
> >
> > I think your analysis is correct and I recall helping a customer
> > with a similar
> > issue:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcomm
> > unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-
> > root
> > -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> > .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d3bc
> > 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> > n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> > WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> > SyQhGeSHYysX1Co4%3D&reserved=0
> >
>
> For the customer case, it seem not the same issue. the customer issue
> is caused by clock source change while parent has no clock output.
> This is inherit limitation for the CCM clock slice when using the
> smart interface to change the clock parent.
>
> For current mmc timeout issue, I think we can have a try with
> nand_usdhc_bus clock gated at the beginning of kernel boot, directly
> modify the nand_usdhc_bus
> Clock's HW register gate bit in clock-imx8mq.c.
While this might be an option to fix this specific issue, I would hope
we can come up with something more generic, as the current clock
framework behavior allows to violate the system specification
constraint that parent clocks must not be disabled when any of the
children are active. This seems like a fundamental issue and might hurt
us also with other clocks than this specific nand_usdhc_bus clock.
Can you tell us if there were other issues found with the PLL1/2 gating
patch? The fact that, according to Bough, it's reverted in your tree
seems to suggest this.
Regards,
Lucas
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
@ 2021-01-07 11:26 ` Lucas Stach
0 siblings, 0 replies; 42+ messages in thread
From: Lucas Stach @ 2021-01-07 11:26 UTC (permalink / raw)
To: Jacky Bai, Fabio Estevam
Cc: Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette,
Angus Ainslie, linux-mmc, Ulf Hansson, Bough Chen, Adrian Hunter,
dl-linux-imx, Sascha Hauer, Leonard Crestez, Guido Günther,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Jacky,
Am Donnerstag, dem 07.01.2021 um 01:30 +0000 schrieb Jacky Bai:
> > -----Original Message-----
> > From: Fabio Estevam [mailto:festevam@gmail.com]
> > Sent: Thursday, January 7, 2021 2:57 AM
> > To: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie
> > <angus@akkea.ca>;
> > Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> > <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> > <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>;
> > Ulf
> > Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > MXC
> > ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > Subject: Re: sdhci timeout on imx8mq
> >
> > Hi Lucas,
> >
> > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach
> > <l.stach@pengutronix.de>
> > wrote:
> >
> > > The reference manual states about this situation: "For any clock,
> > > its
> > > source must be left on when it is kept on. Behavior is undefined
> > > if
> > > this rule is violated."
> > > And it seems this is exactly what's happening here: some kind of
> > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > the
> > > SDHCI controller from working, even though the clock branch is
> > > properly enabled later on. On my system the SDHCI timeout and
> > > following runtime suspend/resume cycle on the nand_usdhc_bus
> > > clock
> > > seem to get it back into a working state.
> >
> > I think your analysis is correct and I recall helping a customer
> > with a similar
> > issue:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcomm
> > unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-
> > root
> > -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> > .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d3bc
> > 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> > n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> > WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> > SyQhGeSHYysX1Co4%3D&reserved=0
> >
>
> For the customer case, it seem not the same issue. the customer issue
> is caused by clock source change while parent has no clock output.
> This is inherit limitation for the CCM clock slice when using the
> smart interface to change the clock parent.
>
> For current mmc timeout issue, I think we can have a try with
> nand_usdhc_bus clock gated at the beginning of kernel boot, directly
> modify the nand_usdhc_bus
> Clock's HW register gate bit in clock-imx8mq.c.
While this might be an option to fix this specific issue, I would hope
we can come up with something more generic, as the current clock
framework behavior allows to violate the system specification
constraint that parent clocks must not be disabled when any of the
children are active. This seems like a fundamental issue and might hurt
us also with other clocks than this specific nand_usdhc_bus clock.
Can you tell us if there were other issues found with the PLL1/2 gating
patch? The fact that, according to Bough, it's reverted in your tree
seems to suggest this.
Regards,
Lucas
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
2021-01-07 11:26 ` Lucas Stach
@ 2021-01-08 1:27 ` Jacky Bai
-1 siblings, 0 replies; 42+ messages in thread
From: Jacky Bai @ 2021-01-08 1:27 UTC (permalink / raw)
To: Lucas Stach, Fabio Estevam
Cc: Bough Chen, Angus Ainslie, Leonard Crestez, Peng Fan, Abel Vesa,
Stephen Boyd, Michael Turquette, Ulf Hansson, Guido Günther,
linux-mmc, Adrian Hunter, dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> Subject: Re: sdhci timeout on imx8mq
>
> Hi Jacky,
>
> Am Donnerstag, dem 07.01.2021 um 01:30 +0000 schrieb Jacky Bai:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: Thursday, January 7, 2021 2:57 AM
> > > To: Lucas Stach <l.stach@pengutronix.de>
> > > Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie
> <angus@akkea.ca>;
> > > Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> > > <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> > > <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>; Ulf
> > > Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Lucas,
> > >
> > > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@pengutronix.de>
> > > wrote:
> > >
> > > > The reference manual states about this situation: "For any clock,
> > > > its source must be left on when it is kept on. Behavior is
> > > > undefined if this rule is violated."
> > > > And it seems this is exactly what's happening here: some kind of
> > > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > > the SDHCI controller from working, even though the clock branch is
> > > > properly enabled later on. On my system the SDHCI timeout and
> > > > following runtime suspend/resume cycle on the nand_usdhc_bus clock
> > > > seem to get it back into a working state.
> > >
> > > I think your analysis is correct and I recall helping a customer
> > > with a similar
> > > issue:
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fco
> > > mm
> > > unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-
> > > root
> > >
> -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> > > .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d
> 3bc
> > >
> 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> > >
> n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> > >
> WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> > > SyQhGeSHYysX1Co4%3D&reserved=0
> > >
> >
> > For the customer case, it seem not the same issue. the customer issue
> > is caused by clock source change while parent has no clock output.
> > This is inherit limitation for the CCM clock slice when using the
> > smart interface to change the clock parent.
> >
> > For current mmc timeout issue, I think we can have a try with
> > nand_usdhc_bus clock gated at the beginning of kernel boot, directly
> > modify the nand_usdhc_bus Clock's HW register gate bit in
> > clock-imx8mq.c.
>
> While this might be an option to fix this specific issue, I would hope we can
> come up with something more generic, as the current clock framework
> behavior allows to violate the system specification constraint that parent
> clocks must not be disabled when any of the children are active. This seems
> like a fundamental issue and might hurt us also with other clocks than this
> specific nand_usdhc_bus clock.
Yes, my proposal is for debug purpose. We will do further debug on our EVK board too.
>
> Can you tell us if there were other issues found with the PLL1/2 gating patch?
> The fact that, according to Bough, it's reverted in your tree seems to suggest
> this.
It is mainly because that the PLL1/2 derived clocks are shared between Cortex-A domain & Cortex-M domain.
M4 domain may also need some of these clocks enabled even A core domain does not use it. There is no
well-defined HW mechanism like CCM CCGR domain control to manage the PLL divider's gate based on each domain's request.
So I reverted that patch in our internal tree before we find other more solid solution for shared clock management.
BR
Jacky Bai
>
> Regards,
> Lucas
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
@ 2021-01-08 1:27 ` Jacky Bai
0 siblings, 0 replies; 42+ messages in thread
From: Jacky Bai @ 2021-01-08 1:27 UTC (permalink / raw)
To: Lucas Stach, Fabio Estevam
Cc: Peng Fan, Abel Vesa, Stephen Boyd, Michael Turquette,
Angus Ainslie, linux-mmc, Ulf Hansson, Bough Chen, Adrian Hunter,
dl-linux-imx, Sascha Hauer, Leonard Crestez, Guido Günther,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
> Subject: Re: sdhci timeout on imx8mq
>
> Hi Jacky,
>
> Am Donnerstag, dem 07.01.2021 um 01:30 +0000 schrieb Jacky Bai:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: Thursday, January 7, 2021 2:57 AM
> > > To: Lucas Stach <l.stach@pengutronix.de>
> > > Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie
> <angus@akkea.ca>;
> > > Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> > > <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> > > <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>; Ulf
> > > Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Lucas,
> > >
> > > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@pengutronix.de>
> > > wrote:
> > >
> > > > The reference manual states about this situation: "For any clock,
> > > > its source must be left on when it is kept on. Behavior is
> > > > undefined if this rule is violated."
> > > > And it seems this is exactly what's happening here: some kind of
> > > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > > the SDHCI controller from working, even though the clock branch is
> > > > properly enabled later on. On my system the SDHCI timeout and
> > > > following runtime suspend/resume cycle on the nand_usdhc_bus clock
> > > > seem to get it back into a working state.
> > >
> > > I think your analysis is correct and I recall helping a customer
> > > with a similar
> > > issue:
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fco
> > > mm
> > > unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-
> > > root
> > >
> -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> > > .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d
> 3bc
> > >
> 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> > >
> n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> > >
> WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> > > SyQhGeSHYysX1Co4%3D&reserved=0
> > >
> >
> > For the customer case, it seem not the same issue. the customer issue
> > is caused by clock source change while parent has no clock output.
> > This is inherit limitation for the CCM clock slice when using the
> > smart interface to change the clock parent.
> >
> > For current mmc timeout issue, I think we can have a try with
> > nand_usdhc_bus clock gated at the beginning of kernel boot, directly
> > modify the nand_usdhc_bus Clock's HW register gate bit in
> > clock-imx8mq.c.
>
> While this might be an option to fix this specific issue, I would hope we can
> come up with something more generic, as the current clock framework
> behavior allows to violate the system specification constraint that parent
> clocks must not be disabled when any of the children are active. This seems
> like a fundamental issue and might hurt us also with other clocks than this
> specific nand_usdhc_bus clock.
Yes, my proposal is for debug purpose. We will do further debug on our EVK board too.
>
> Can you tell us if there were other issues found with the PLL1/2 gating patch?
> The fact that, according to Bough, it's reverted in your tree seems to suggest
> this.
It is mainly because that the PLL1/2 derived clocks are shared between Cortex-A domain & Cortex-M domain.
M4 domain may also need some of these clocks enabled even A core domain does not use it. There is no
well-defined HW mechanism like CCM CCGR domain control to manage the PLL divider's gate based on each domain's request.
So I reverted that patch in our internal tree before we find other more solid solution for shared clock management.
BR
Jacky Bai
>
> Regards,
> Lucas
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
2021-01-07 11:26 ` Lucas Stach
@ 2021-03-09 7:35 ` Heiko Thiery
-1 siblings, 0 replies; 42+ messages in thread
From: Heiko Thiery @ 2021-03-09 7:35 UTC (permalink / raw)
To: l.stach
Cc: abel.vesa, adrian.hunter, agx, angus, festevam, haibo.chen,
kernel, leonard.crestez, linux-arm-kernel, linux-imx, linux-mmc,
mturquette, peng.fan, ping.bai, sboyd, ulf.hansson
Hi all,
> Hi Jacky,
>
> Am Donnerstag, dem 07.01.2021 um 01:30 +0000 schrieb Jacky Bai:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: Thursday, January 7, 2021 2:57 AM
> > > To: Lucas Stach <l.stach@pengutronix.de>
> > > Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie
> > > <angus@akkea.ca>;
> > > Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> > > <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> > > <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>;
> > > Ulf
> > > Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC
> > > ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Lucas,
> > >
> > > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach
> > > <l.stach@pengutronix.de>
> > > wrote:
> > >
> > > > The reference manual states about this situation: "For any clock,
> > > > its
> > > > source must be left on when it is kept on. Behavior is undefined
> > > > if
> > > > this rule is violated."
> > > > And it seems this is exactly what's happening here: some kind of
> > > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > > the
> > > > SDHCI controller from working, even though the clock branch is
> > > > properly enabled later on. On my system the SDHCI timeout and
> > > > following runtime suspend/resume cycle on the nand_usdhc_bus
> > > > clock
> > > > seem to get it back into a working state.
> > >
> > > I think your analysis is correct and I recall helping a customer
> > > with a similar
> > > issue:
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcomm
> > > unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-
> > > root
> > > -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> > > .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d3bc
> > > 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> > > n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> > > WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> > > SyQhGeSHYysX1Co4%3D&reserved=0
> > >
> >
> > For the customer case, it seem not the same issue. the customer issue
> > is caused by clock source change while parent has no clock output.
> > This is inherit limitation for the CCM clock slice when using the
> > smart interface to change the clock parent.
> >
> > For current mmc timeout issue, I think we can have a try with
> > nand_usdhc_bus clock gated at the beginning of kernel boot, directly
> > modify the nand_usdhc_bus
> > Clock's HW register gate bit in clock-imx8mq.c.
>
> While this might be an option to fix this specific issue, I would hope
> we can come up with something more generic, as the current clock
> framework behavior allows to violate the system specification
> constraint that parent clocks must not be disabled when any of the
> children are active. This seems like a fundamental issue and might hurt
> us also with other clocks than this specific nand_usdhc_bus clock.
I am not sure if an error in the fec driver has the same or similar cause as this. But I noticed that the SOC hangs when accessing the timecounter register while the FEC is down. The CCGR10 seems to gate the ENET_REF_CLK_ROOT and the ENET_TIMER_CLK_ROOT. The clocks are disabled as soon as the interface is down.
[1] https://lore.kernel.org/lkml/20210220065654.25598-1-heiko.thiery@gmail.com/
>
> Can you tell us if there were other issues found with the PLL1/2 gating
> patch? The fact that, according to Bough, it's reverted in your tree
> seems to suggest this.
>
> Regards,
> Lucas
Thank you
--
Heiko
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: sdhci timeout on imx8mq
@ 2021-03-09 7:35 ` Heiko Thiery
0 siblings, 0 replies; 42+ messages in thread
From: Heiko Thiery @ 2021-03-09 7:35 UTC (permalink / raw)
To: l.stach
Cc: abel.vesa, adrian.hunter, agx, angus, festevam, haibo.chen,
kernel, leonard.crestez, linux-arm-kernel, linux-imx, linux-mmc,
mturquette, peng.fan, ping.bai, sboyd, ulf.hansson
Hi all,
> Hi Jacky,
>
> Am Donnerstag, dem 07.01.2021 um 01:30 +0000 schrieb Jacky Bai:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: Thursday, January 7, 2021 2:57 AM
> > > To: Lucas Stach <l.stach@pengutronix.de>
> > > Cc: Bough Chen <haibo.chen@nxp.com>; Angus Ainslie
> > > <angus@akkea.ca>;
> > > Leonard Crestez <leonard.crestez@nxp.com>; Peng Fan
> > > <peng.fan@nxp.com>; Abel Vesa <abel.vesa@nxp.com>; Stephen Boyd
> > > <sboyd@kernel.org>; Michael Turquette <mturquette@baylibre.com>;
> > > Ulf
> > > Hansson <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>;
> > > linux-mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer <kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC
> > > ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Lucas,
> > >
> > > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach
> > > <l.stach@pengutronix.de>
> > > wrote:
> > >
> > > > The reference manual states about this situation: "For any clock,
> > > > its
> > > > source must be left on when it is kept on. Behavior is undefined
> > > > if
> > > > this rule is violated."
> > > > And it seems this is exactly what's happening here: some kind of
> > > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > > the
> > > > SDHCI controller from working, even though the clock branch is
> > > > properly enabled later on. On my system the SDHCI timeout and
> > > > following runtime suspend/resume cycle on the nand_usdhc_bus
> > > > clock
> > > > seem to get it back into a working state.
> > >
> > > I think your analysis is correct and I recall helping a customer
> > > with a similar
> > > issue:
> > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcomm
> > > unity.nxp.com%2Ft5%2Fi-MX-Processors%2FExternal-clock-that-provide-
> > > root
> > > -clock-for-SAI3-and-SPDIF%2Fm-p%2F1019834&data=04%7C01%7Cping
> > > .bai%40nxp.com%7C8d250a158cce469c378308d8b274d6d1%7C686ea1d3bc
> > > 2b4c6fa92cd99c5c301635%7C0%7C0%7C637455562183497049%7CUnknow
> > > n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> > > WwiLCJXVCI6Mn0%3D%7C1000&sdata=VkxuhmhDifzOxxfIgFz9PR5gKC1
> > > SyQhGeSHYysX1Co4%3D&reserved=0
> > >
> >
> > For the customer case, it seem not the same issue. the customer issue
> > is caused by clock source change while parent has no clock output.
> > This is inherit limitation for the CCM clock slice when using the
> > smart interface to change the clock parent.
> >
> > For current mmc timeout issue, I think we can have a try with
> > nand_usdhc_bus clock gated at the beginning of kernel boot, directly
> > modify the nand_usdhc_bus
> > Clock's HW register gate bit in clock-imx8mq.c.
>
> While this might be an option to fix this specific issue, I would hope
> we can come up with something more generic, as the current clock
> framework behavior allows to violate the system specification
> constraint that parent clocks must not be disabled when any of the
> children are active. This seems like a fundamental issue and might hurt
> us also with other clocks than this specific nand_usdhc_bus clock.
I am not sure if an error in the fec driver has the same or similar cause as this. But I noticed that the SOC hangs when accessing the timecounter register while the FEC is down. The CCGR10 seems to gate the ENET_REF_CLK_ROOT and the ENET_TIMER_CLK_ROOT. The clocks are disabled as soon as the interface is down.
[1] https://lore.kernel.org/lkml/20210220065654.25598-1-heiko.thiery@gmail.com/
>
> Can you tell us if there were other issues found with the PLL1/2 gating
> patch? The fact that, according to Bough, it's reverted in your tree
> seems to suggest this.
>
> Regards,
> Lucas
Thank you
--
Heiko
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
2021-01-05 15:06 ` Lucas Stach
@ 2021-01-19 2:35 ` Peng Fan
-1 siblings, 0 replies; 42+ messages in thread
From: Peng Fan @ 2021-01-19 2:35 UTC (permalink / raw)
To: Lucas Stach, Bough Chen, Fabio Estevam, Angus Ainslie,
Leonard Crestez, Abel Vesa, Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Lucas,
> Subject: Re: sdhci timeout on imx8mq
>
> Hi all,
>
> Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: 2020年7月7日 20:45
> > > To: Angus Ainslie <angus@akkea.ca>
> > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>; linux-
> > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer < kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Angus,
> > >
> > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > wrote:
> > >
> > > > Has there been any progress with this. I'm getting this on about
> > > > 50% of
> > >
> > > Not from my side, sorry.
> > >
> > > Bough,
> > >
> > > Do you know why this problem affects the imx8mq-evk versions that
> > > are populated with the Micron eMMC and not the ones with Sandisk
> > > eMMC?
> >
> > Hi Angus,
> >
> > Can you show me the full fail log? I do not meet this issue on my
> > side, besides, which kind of uboot do you use?
>
> I was finally able to bisect this issue, which wasn't that much fun due to the
> issue not being reproducible 100%. :/ Turns out that the issue is even more
> interesting than I thought and likely doesn't have anything to do with SDHCI or
> used bootloader versions. Here's my current debugging state:
>
> I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define gates for
> pll1/2 fixed dividers). The change itself looks fine to me, still CC'ed Leonard for
> good measure.
>
> In my testing the following partial revert fixes the issue:
>
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M_CG] =
> imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
> hws[IMX8MQ_SYS1_PLL_160M_CG] =
> imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
> hws[IMX8MQ_SYS1_PLL_200M_CG] =
> imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
> hws[IMX8MQ_SYS1_PLL_400M_CG] =
> imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
> hws[IMX8MQ_SYS1_PLL_800M_CG] =
> imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
>
> @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M] =
> imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> hws[IMX8MQ_SYS1_PLL_160M] =
> imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> hws[IMX8MQ_SYS1_PLL_200M] =
> imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> - hws[IMX8MQ_SYS1_PLL_266M] =
> imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> + hws[IMX8MQ_SYS1_PLL_266M] =
> + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> hws[IMX8MQ_SYS1_PLL_400M] =
> imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> hws[IMX8MQ_SYS1_PLL_800M] =
> imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
>
> The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated that the
> SDHCI driver properly enables this bus clock across the problematic card
> access. So what I think is happening here is that both nand_usdhc_bus and
> sys1_pll_266m are initially enabled. Sometime during boot sys1_pll_266m
> gets disabled due to runtime PM on the enet_axi clock, which is a direct child
> of sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but no
> consumer has claimed the clock yet, so the parent clock gets disabled while
> this branch of the clock tree is still active.
>
> The reference manual states about this situation: "For any clock, its source
> must be left on when it is kept on. Behavior is undefined if this rule is
> violated."
I think you are referring
Clock source control,
For PLLs, we have channel on every PLL, every PFD and every divider. PLL is the
source of PFDs and dividers. For any clock, its source must be left on when it is kept on.
Behavior is undefined if this rule is violated.
I think this is not saying clock root slice.
Regards,
Peng.
> And it seems this is exactly what's happening here: some kind of glitch is
> introduced in the nand_usdhc_bus clock, which prevents the SDHCI controller
> from working, even though the clock branch is properly enabled later on. On
> my system the SDHCI timeout and following runtime suspend/resume cycle
> on the nand_usdhc_bus clock seem to get it back into a working state.
>
> So I think we need some solution at the clock driver/framework level to
> prevent shutting down parent clocks that have active branches, even if those
> branches aren't claimed by a consumer (yet).
>
> Regards,
> Lucas
^ permalink raw reply [flat|nested] 42+ messages in thread
* RE: sdhci timeout on imx8mq
@ 2021-01-19 2:35 ` Peng Fan
0 siblings, 0 replies; 42+ messages in thread
From: Peng Fan @ 2021-01-19 2:35 UTC (permalink / raw)
To: Lucas Stach, Bough Chen, Fabio Estevam, Angus Ainslie,
Leonard Crestez, Abel Vesa, Stephen Boyd, Michael Turquette
Cc: Ulf Hansson, Guido Günther, linux-mmc, Adrian Hunter,
dl-linux-imx, Sascha Hauer,
moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
Hi Lucas,
> Subject: Re: sdhci timeout on imx8mq
>
> Hi all,
>
> Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@gmail.com]
> > > Sent: 2020年7月7日 20:45
> > > To: Angus Ainslie <angus@akkea.ca>
> > > Cc: BOUGH CHEN <haibo.chen@nxp.com>; Ulf Hansson
> > > <ulf.hansson@linaro.org>; Guido Günther <agx@sigxcpu.org>; linux-
> > > mmc <linux-mmc@vger.kernel.org>; Adrian Hunter
> > > <adrian.hunter@intel.com>; dl-linux-imx <linux-imx@nxp.com>; Sascha
> > > Hauer < kernel@pengutronix.de>; moderated list:ARM/FREESCALE IMX /
> > > MXC ARM ARCHITECTURE <linux-arm-kernel@lists.infradead.org>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Angus,
> > >
> > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@akkea.ca>
> > > wrote:
> > >
> > > > Has there been any progress with this. I'm getting this on about
> > > > 50% of
> > >
> > > Not from my side, sorry.
> > >
> > > Bough,
> > >
> > > Do you know why this problem affects the imx8mq-evk versions that
> > > are populated with the Micron eMMC and not the ones with Sandisk
> > > eMMC?
> >
> > Hi Angus,
> >
> > Can you show me the full fail log? I do not meet this issue on my
> > side, besides, which kind of uboot do you use?
>
> I was finally able to bisect this issue, which wasn't that much fun due to the
> issue not being reproducible 100%. :/ Turns out that the issue is even more
> interesting than I thought and likely doesn't have anything to do with SDHCI or
> used bootloader versions. Here's my current debugging state:
>
> I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define gates for
> pll1/2 fixed dividers). The change itself looks fine to me, still CC'ed Leonard for
> good measure.
>
> In my testing the following partial revert fixes the issue:
>
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M_CG] =
> imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, 15);
> hws[IMX8MQ_SYS1_PLL_160M_CG] =
> imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, 17);
> hws[IMX8MQ_SYS1_PLL_200M_CG] =
> imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19);
> - hws[IMX8MQ_SYS1_PLL_266M_CG] =
> imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, 21);
> hws[IMX8MQ_SYS1_PLL_400M_CG] =
> imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, 23);
> hws[IMX8MQ_SYS1_PLL_800M_CG] =
> imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, 25);
>
> @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct
> platform_device *pdev)
> hws[IMX8MQ_SYS1_PLL_133M] =
> imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6);
> hws[IMX8MQ_SYS1_PLL_160M] =
> imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5);
> hws[IMX8MQ_SYS1_PLL_200M] =
> imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4);
> - hws[IMX8MQ_SYS1_PLL_266M] =
> imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3);
> + hws[IMX8MQ_SYS1_PLL_266M] =
> + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3);
> hws[IMX8MQ_SYS1_PLL_400M] =
> imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2);
> hws[IMX8MQ_SYS1_PLL_800M] =
> imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1);
>
> The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated that the
> SDHCI driver properly enables this bus clock across the problematic card
> access. So what I think is happening here is that both nand_usdhc_bus and
> sys1_pll_266m are initially enabled. Sometime during boot sys1_pll_266m
> gets disabled due to runtime PM on the enet_axi clock, which is a direct child
> of sys1_pll_266m. At this point nand_usdhc_bus is still enabled, but no
> consumer has claimed the clock yet, so the parent clock gets disabled while
> this branch of the clock tree is still active.
>
> The reference manual states about this situation: "For any clock, its source
> must be left on when it is kept on. Behavior is undefined if this rule is
> violated."
I think you are referring
Clock source control,
For PLLs, we have channel on every PLL, every PFD and every divider. PLL is the
source of PFDs and dividers. For any clock, its source must be left on when it is kept on.
Behavior is undefined if this rule is violated.
I think this is not saying clock root slice.
Regards,
Peng.
> And it seems this is exactly what's happening here: some kind of glitch is
> introduced in the nand_usdhc_bus clock, which prevents the SDHCI controller
> from working, even though the clock branch is properly enabled later on. On
> my system the SDHCI timeout and following runtime suspend/resume cycle
> on the nand_usdhc_bus clock seem to get it back into a working state.
>
> So I think we need some solution at the clock driver/framework level to
> prevent shutting down parent clocks that have active branches, even if those
> branches aren't claimed by a consumer (yet).
>
> Regards,
> Lucas
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 42+ messages in thread