* [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
@ 2014-03-13 14:29 Geert Uytterhoeven
  2014-03-14  8:51 ` Simon Horman
                   ` (20 more replies)
  0 siblings, 21 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2014-03-13 14:29 UTC (permalink / raw)
  To: linux-sh

From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>

Due to issues with runtime PM clock management, clocks not explicitly
managed by their drivers may not be enabled at all, or be inadvertently
disabled by the clk_disable_unused() late initcall.

Until this is fixed, add a temporary workaround, calling
shmobile_clk_workaround() with enable = true.

For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
thermal. More clocks can be added if needed.

Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
---
Tested with and without CONFIG_PM_RUNTIME

 arch/arm/mach-shmobile/board-koelsch-reference.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/mach-shmobile/board-koelsch-reference.c b/arch/arm/mach-shmobile/board-koelsch-reference.c
index a760f7f19bc9..040c6b99cde2 100644
--- a/arch/arm/mach-shmobile/board-koelsch-reference.c
+++ b/arch/arm/mach-shmobile/board-koelsch-reference.c
@@ -107,9 +107,21 @@ static const struct clk_name clk_names[] = {
 	{ "lvds0", "lvds.0", "rcar-du-r8a7791" },
 };
 
+/*
+ * This is a really crude hack to work around core platform clock issues
+ */
+static const struct clk_name clk_enables[] = {
+	{ "ether", NULL, "ee700000.ethernet" },
+	{ "i2c2", NULL, "e6530000.i2c" },
+	{ "msiof0", NULL, "e6e20000.spi" },
+	{ "qspi_mod", NULL, "e6b10000.spi" },
+	{ "thermal", NULL, "e61f0000.thermal" },
+};
+
 static void __init koelsch_add_standard_devices(void)
 {
 	shmobile_clk_workaround(clk_names, ARRAY_SIZE(clk_names), false);
+	shmobile_clk_workaround(clk_enables, ARRAY_SIZE(clk_enables), true);
 	r8a7791_add_dt_devices();
 	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
 
-- 
1.7.9.5



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
@ 2014-03-14  8:51 ` Simon Horman
  2014-03-14  8:53 ` Magnus Damm
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2014-03-14  8:51 UTC (permalink / raw)
  To: linux-sh

On Thu, Mar 13, 2014 at 03:29:30PM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> 
> Due to issues with runtime PM clock management, clocks not explicitly
> managed by their drivers may not be enabled at all, or be inadvertently
> disabled by the clk_disable_unused() late initcall.
> 
> Until this is fixed, add a temporary workaround, calling
> shmobile_clk_workaround() with enable = true.
> 
> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
> thermal. More clocks can be added if needed.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> ---
> Tested with and without CONFIG_PM_RUNTIME
> 
>  arch/arm/mach-shmobile/board-koelsch-reference.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/arm/mach-shmobile/board-koelsch-reference.c b/arch/arm/mach-shmobile/board-koelsch-reference.c
> index a760f7f19bc9..040c6b99cde2 100644
> --- a/arch/arm/mach-shmobile/board-koelsch-reference.c
> +++ b/arch/arm/mach-shmobile/board-koelsch-reference.c
> @@ -107,9 +107,21 @@ static const struct clk_name clk_names[] = {
>  	{ "lvds0", "lvds.0", "rcar-du-r8a7791" },
>  };
>  
> +/*
> + * This is a really crude hack to work around core platform clock issues
> + */
> +static const struct clk_name clk_enables[] = {

I think this should be annotated as __initconst
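
A minimal sketch of what that annotation could look like, written as a
standalone userspace file so it compiles outside the kernel tree. The empty
`__initconst` fallback, the local `ARRAY_SIZE` macro, the `struct clk_name`
field names and the `clk_enables_count()` helper are assumptions made for
illustration; in the kernel, `__initconst` comes from <linux/init.h> and puts
the data in an init section that is freed after boot, which is fine here
because the table is only used from an `__init` function.

```c
#include <stddef.h>

/* Userspace fallback: in the kernel this comes from <linux/init.h>. */
#ifndef __initconst
#define __initconst
#endif

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Field names are assumed; the patch only shows three-string initializers. */
struct clk_name {
	const char *clk;
	const char *con_id;
	const char *dev_id;
};

/* Same table as in the patch, with the suggested annotation added. */
static const struct clk_name clk_enables[] __initconst = {
	{ "ether", NULL, "ee700000.ethernet" },
	{ "i2c2", NULL, "e6530000.i2c" },
	{ "msiof0", NULL, "e6e20000.spi" },
	{ "qspi_mod", NULL, "e6b10000.spi" },
	{ "thermal", NULL, "e61f0000.thermal" },
};

/* Hypothetical helper so the table size can be checked from outside. */
size_t clk_enables_count(void)
{
	return ARRAY_SIZE(clk_enables);
}
```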

> +	{ "ether", NULL, "ee700000.ethernet" },
> +	{ "i2c2", NULL, "e6530000.i2c" },
> +	{ "msiof0", NULL, "e6e20000.spi" },
> +	{ "qspi_mod", NULL, "e6b10000.spi" },
> +	{ "thermal", NULL, "e61f0000.thermal" },
> +};
> +
>  static void __init koelsch_add_standard_devices(void)
>  {
>  	shmobile_clk_workaround(clk_names, ARRAY_SIZE(clk_names), false);
> +	shmobile_clk_workaround(clk_enables, ARRAY_SIZE(clk_enables), true);
>  	r8a7791_add_dt_devices();
>  	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
>  
> -- 
> 1.7.9.5
> 


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
  2014-03-14  8:51 ` Simon Horman
@ 2014-03-14  8:53 ` Magnus Damm
  2014-03-14  9:09 ` Geert Uytterhoeven
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Magnus Damm @ 2014-03-14  8:53 UTC (permalink / raw)
  To: linux-sh

On Thu, Mar 13, 2014 at 11:29 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>
> Due to issues with runtime PM clock management, clocks not explicitly
> managed by their drivers may not be enabled at all, or be inadvertently
> disabled by the clk_disable_unused() late initcall.
>
> Until this is fixed, add a temporary workaround, calling
> shmobile_clk_workaround() with enable = true.
>
> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
> thermal. More clocks can be added if needed.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> ---
> Tested with and without CONFIG_PM_RUNTIME

Hi Geert,

Thanks for this patch. I'm happy to see that you find use for my clock
workaround function. This is exactly how I imagined that it would be
used. =)

So as always I have yet to test the code, but before doing that I'd like
to clarify whether this patch is known to work together with this one:

[PATCH] clk: shmobile: mstp: Fix the is_enabled() operation

My gut feeling is that these two patches together should fix our
pending issues for Koelsch. Is it correct?

Thanks,

/ magnus


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
  2014-03-14  8:51 ` Simon Horman
  2014-03-14  8:53 ` Magnus Damm
@ 2014-03-14  9:09 ` Geert Uytterhoeven
  2014-03-14  9:23 ` Magnus Damm
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2014-03-14  9:09 UTC (permalink / raw)
  To: linux-sh

Hi Magnus,

On Fri, Mar 14, 2014 at 9:53 AM, Magnus Damm <magnus.damm@gmail.com> wrote:
> On Thu, Mar 13, 2014 at 11:29 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>
>> Due to issues with runtime PM clock management, clocks not explicitly
>> managed by their drivers may not be enabled at all, or be inadvertently
>> disabled by the clk_disable_unused() late initcall.
>>
>> Until this is fixed, add a temporary workaround, calling
>> shmobile_clk_workaround() with enable = true.
>>
>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>> thermal. More clocks can be added if needed.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>> ---
>> Tested with and without CONFIG_PM_RUNTIME
>
> Hi Geert,
>
> Thanks for this patch. I'm happy to see that you find use for my clock
> workaround function. This is exactly how I imagined that it would be
> used. =)
>
> So as always I have yet to test the code, but before doing that I'd like
> to clarify whether this patch is known to work together with this one:
>
> [PATCH] clk: shmobile: mstp: Fix the is_enabled() operation

Yes, it works with that one. It disables the unused clocks, i.e.:

MSTP tmu0 OFF
MSTP i2c0 OFF
MSTP i2c1 OFF
MSTP i2c3 OFF
MSTP i2c4 OFF
MSTP i2c5 OFF
MSTP msiof1 OFF
MSTP msiof2 OFF

(note that i2c2, ether, msiof0, and qspi_mod are no longer disabled,
 as they're now explicitly marked in use).
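
A minimal userspace sketch of the behaviour described above, under stated
assumptions: `struct model_clk`, `model_clk_get()`, `model_clk_enable()` and
`model_disable_unused()` are hypothetical stand-ins, not the kernel's clock
API, and `hw_on` stands in for whatever the clock's is_enabled() operation
reports. The sweep gates every clock that is running without a user; a clock
the workaround has taken a reference on is left alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical model of a gate clock; not the kernel's struct clk. */
struct model_clk {
	const char *name;
	unsigned int enable_count;	/* references held by software */
	bool hw_on;			/* gate state left by the boot loader */
};

/* A small subset of the MSTP clocks named in this thread, all running. */
static struct model_clk clocks[] = {
	{ "tmu0",   0, true },
	{ "i2c0",   0, true },
	{ "i2c2",   0, true },
	{ "ether",  0, true },
	{ "msiof0", 0, true },
	{ "msiof1", 0, true },
};

/* Look a clock up by name; returns NULL if it is not in the model. */
struct model_clk *model_clk_get(const char *name)
{
	for (size_t i = 0; i < ARRAY_SIZE(clocks); i++)
		if (strcmp(clocks[i].name, name) == 0)
			return &clocks[i];
	return NULL;
}

/* Take a reference, as the workaround is assumed to do with enable = true. */
void model_clk_enable(struct model_clk *clk)
{
	clk->enable_count++;
	clk->hw_on = true;
}

/* Late sweep: gate running clocks with no user; returns how many were gated. */
size_t model_disable_unused(void)
{
	size_t gated = 0;

	for (size_t i = 0; i < ARRAY_SIZE(clocks); i++) {
		if (clocks[i].enable_count == 0 && clocks[i].hw_on) {
			clocks[i].hw_on = false;
			gated++;
		}
	}
	return gated;
}
```

In this model, enabling "ether", "i2c2" and "msiof0" before the sweep leaves
them running, while "tmu0", "i2c0" and "msiof1" are gated.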

> My gut feeling is that these two patches together should fix our
> pending issues for Koelsch. Is it correct?

It seems so. I guess the same hack has to be used on the other boards
that have/are receiving CCF support (Lager, Marzen, Genmai).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (2 preceding siblings ...)
  2014-03-14  9:09 ` Geert Uytterhoeven
@ 2014-03-14  9:23 ` Magnus Damm
  2014-03-14 11:02 ` Laurent Pinchart
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Magnus Damm @ 2014-03-14  9:23 UTC (permalink / raw)
  To: linux-sh

Hi Geert,

On Fri, Mar 14, 2014 at 6:09 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Magnus,
>
> On Fri, Mar 14, 2014 at 9:53 AM, Magnus Damm <magnus.damm@gmail.com> wrote:
>> On Thu, Mar 13, 2014 at 11:29 PM, Geert Uytterhoeven
>> <geert@linux-m68k.org> wrote:
>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>>
>>> Due to issues with runtime PM clock management, clocks not explicitly
>>> managed by their drivers may not be enabled at all, or be inadvertently
>>> disabled by the clk_disable_unused() late initcall.
>>>
>>> Until this is fixed, add a temporary workaround, calling
>>> shmobile_clk_workaround() with enable = true.
>>>
>>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>>> thermal. More clocks can be added if needed.
>>>
>>> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>> ---
>>> Tested with and without CONFIG_PM_RUNTIME
>>
>> Hi Geert,
>>
>> Thanks for this patch. I'm happy to see that you find use for my clock
>> workaround function. This is exactly how I imagined that it would be
>> used. =)
>>
>> So as always I have yet to test the code, but before doing that I'd like
>> to clarify whether this patch is known to work together with this one:
>>
>> [PATCH] clk: shmobile: mstp: Fix the is_enabled() operation
>
> Yes, it works with that one. It disables the unused clocks, i.e.:
>
> MSTP tmu0 OFF
> MSTP i2c0 OFF
> MSTP i2c1 OFF
> MSTP i2c3 OFF
> MSTP i2c4 OFF
> MSTP i2c5 OFF
> MSTP msiof1 OFF
> MSTP msiof2 OFF
>
> (note that i2c2, ether, msiof0, and qspi_mod are no longer disabled,
>  as they're now explicitly marked in use).

Sure, that is fine as a workaround.

>> My gut feeling is that these two patches together should fix our
>> pending issues for Koelsch. Is it correct?
>
> It seems so. I guess the same hack has to be used on the other boards
> that have/are receiving CCF support (Lager, Marzen, Genmai).

Yep, I think so too. Thanks for your clarification!

Cheers,

/ magnus


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (3 preceding siblings ...)
  2014-03-14  9:23 ` Magnus Damm
@ 2014-03-14 11:02 ` Laurent Pinchart
  2014-03-14 11:10 ` Ben Dooks
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Laurent Pinchart @ 2014-03-14 11:02 UTC (permalink / raw)
  To: linux-sh

Hi Geert,

Thank you for the patch.

On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> 
> Due to issues with runtime PM clock management, clocks not explicitly
> managed by their drivers may not be enabled at all, or be inadvertently
> disabled by the clk_disable_unused() late initcall.
> 
> Until this is fixed, add a temporary workaround, calling
> shmobile_clk_workaround() with enable = true.
> 
> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
> thermal. More clocks can be added if needed.

This should do the job, but as you mentioned, it's a crude hack. As we're 
targeting v3.16, is there a chance we could fix the problem properly instead ?

> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> ---
> Tested with and without CONFIG_PM_RUNTIME
> 
>  arch/arm/mach-shmobile/board-koelsch-reference.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/arm/mach-shmobile/board-koelsch-reference.c b/arch/arm/mach-shmobile/board-koelsch-reference.c
> index a760f7f19bc9..040c6b99cde2 100644
> --- a/arch/arm/mach-shmobile/board-koelsch-reference.c
> +++ b/arch/arm/mach-shmobile/board-koelsch-reference.c
> @@ -107,9 +107,21 @@ static const struct clk_name clk_names[] = {
>  	{ "lvds0", "lvds.0", "rcar-du-r8a7791" },
>  };
> 
> +/*
> + * This is a really crude hack to work around core platform clock issues
> + */
> +static const struct clk_name clk_enables[] = {
> +	{ "ether", NULL, "ee700000.ethernet" },
> +	{ "i2c2", NULL, "e6530000.i2c" },
> +	{ "msiof0", NULL, "e6e20000.spi" },
> +	{ "qspi_mod", NULL, "e6b10000.spi" },
> +	{ "thermal", NULL, "e61f0000.thermal" },
> +};
> +
>  static void __init koelsch_add_standard_devices(void)
>  {
>  	shmobile_clk_workaround(clk_names, ARRAY_SIZE(clk_names), false);
> +	shmobile_clk_workaround(clk_enables, ARRAY_SIZE(clk_enables), true);
>  	r8a7791_add_dt_devices();
>  	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);

-- 
Regards,

Laurent Pinchart



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (4 preceding siblings ...)
  2014-03-14 11:02 ` Laurent Pinchart
@ 2014-03-14 11:10 ` Ben Dooks
  2014-03-14 12:39 ` Geert Uytterhoeven
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 11:10 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 11:02, Laurent Pinchart wrote:
> Hi Geert,
>
> Thank you for the patch.
>
> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>
>> Due to issues with runtime PM clock management, clocks not explicitly
>> managed by their drivers may not be enabled at all, or be inadvertently
>> disabled by the clk_disable_unused() late initcall.
>>
>> Until this is fixed, add a temporary workaround, calling
>> shmobile_clk_workaround() with enable = true.
>>
>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>> thermal. More clocks can be added if needed.
>
> This should do the job, but as you mentioned, it's a crude hack. As we're
> targeting v3.16, is there a chance we could fix the problem properly instead ?

The best fix would be to re-enable the PM and find out what is
actually causing the external abort. However currently there is
no information in the manuals about anything we could find out from
the AXI busses as to what the source actually is.

I have tried updating the CPSR to enable Aborts earlier in the
boot process but so far there's no sign as to what is causing
these issues.

-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (5 preceding siblings ...)
  2014-03-14 11:10 ` Ben Dooks
@ 2014-03-14 12:39 ` Geert Uytterhoeven
  2014-03-14 12:43 ` Laurent Pinchart
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2014-03-14 12:39 UTC (permalink / raw)
  To: linux-sh

Hi Ben, Laurent,

On Fri, Mar 14, 2014 at 12:10 PM, Ben Dooks <ben.dooks@codethink.co.uk> wrote:
> On 14/03/14 11:02, Laurent Pinchart wrote:
>> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>>
>>> Due to issues with runtime PM clock management, clocks not explicitly
>>> managed by their drivers may not be enabled at all, or be inadvertently
>>> disabled by the clk_disable_unused() late initcall.
>>>
>>> Until this is fixed, add a temporary workaround, calling
>>> shmobile_clk_workaround() with enable = true.
>>>
>>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>>> thermal. More clocks can be added if needed.
>>
>> This should do the job, but as you mentioned, it's a crude hack. As we're
>> targeting v3.16, is there a chance we could fix the problem properly
>> instead ?

Of course the goal is to fix it for real, so the crude hack will no longer
be needed. But for now, it looks like a good short-term workaround.

> The best fix would be to re-enable the PM and find out what is

Sure, but in a multiplatform-aware way.

> actually causing the external abort. However currently there is
> no information in the manuals about anything we could find out from
> the AXI busses as to what the source actually is.

I re-applied your patch "ARM: shmobile: compile drivers/sh for
CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get
the external abort.

Some experimenting revealed it's due to the "ether" clock in the
clk_enables[] array. As long as that's enabled early, the system seems
to boot fine with your patch.

Gr{oetje,eeting}s,

                        Geert



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (6 preceding siblings ...)
  2014-03-14 12:39 ` Geert Uytterhoeven
@ 2014-03-14 12:43 ` Laurent Pinchart
  2014-03-14 13:02 ` Geert Uytterhoeven
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Laurent Pinchart @ 2014-03-14 12:43 UTC (permalink / raw)
  To: linux-sh

Hi Geert,

On Friday 14 March 2014 13:39:43 Geert Uytterhoeven wrote:
> On Fri, Mar 14, 2014 at 12:10 PM, Ben Dooks wrote:
> > On 14/03/14 11:02, Laurent Pinchart wrote:
> >> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
> >>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> >>> 
> >>> Due to issues with runtime PM clock management, clocks not explicitly
> >>> managed by their drivers may not be enabled at all, or be inadvertently
> >>> disabled by the clk_disable_unused() late initcall.
> >>> 
> >>> Until this is fixed, add a temporary workaround, calling
> >>> shmobile_clk_workaround() with enable = true.
> >>> 
> >>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
> >>> thermal. More clocks can be added if needed.
> >> 
> >> This should do the job, but as you mentioned, it's a crude hack. As we're
> >> targeting v3.16, is there a chance we could fix the problem properly
> >> instead ?
> 
> Of course the goal is to fix it for real, so the crude hack will no longer
> be needed. But for now, it looks like a good short-term workaround.
> 
> > The best fix would be to re-enable the PM and find out what is
> 
> Sure, but in a multiplatform-aware way.

Of course. Are you working on that, or should I give it a try ? Would you like 
to discuss this ?

> > actually causing the external abort. However currently there is
> > no information in the manuals about anything we could find out from
> > the AXI busses as to what the source actually is.
> 
> I re-applied your patch "ARM: shmobile: compile drivers/sh for
> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get the external
> abort.
> 
> Some experimenting revealed it's due to the "ether" clock in the
> clk_enables[] array. As long as that's enabled early, the system seems to
> boot fine with your patch.

At what point do you get the external abort without the ether clock
workaround ?

-- 
Regards,

Laurent Pinchart



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (7 preceding siblings ...)
  2014-03-14 12:43 ` Laurent Pinchart
@ 2014-03-14 13:02 ` Geert Uytterhoeven
  2014-03-14 14:13 ` Ben Dooks
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2014-03-14 13:02 UTC (permalink / raw)
  To: linux-sh

Hi Laurent,

On Fri, Mar 14, 2014 at 1:43 PM, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
> > >> This should do the job, but as you mentioned, it's a crude hack. As we're
> > >> targeting v3.16, is there a chance we could fix the problem properly
> > >> instead ?
> >
> > Of course the goal is to fix it for real, so the crude hack will no longer
> > be needed. But for now, it looks like a good short-term workaround.
> >
> > > The best fix would be to re-enable the PM and find out what is
> >
> > Sure, but in a multiplatform-aware way.
>
> Of course. Are you working on that, or should I give it a try ? Would you like
> to discuss this ?

Yes, I plan to work on this. But all input is welcome, of course.

>> > actually causing the external abort. However currently there is
>> > no information in the manuals about anything we could find out from
>> > the AXI busses as to what the source actually is.
>>
>> I re-applied your patch "ARM: shmobile: compile drivers/sh for
>> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get the external
>> abort.
>>
>> Some experimenting revealed it's due to the "ether" clock in the
>> clk_enables[] array. As long as that's enabled early, the system seems to
>> boot fine with your patch.
>
> At what point do you get the external abort without the ether clock workaround
> ?

When userspace starts:

Freeing unused kernel memory: 204K (c042b000 - c045e000)
Unhandled fault: imprecise external abort (0x1406) at 0x00000000
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007

CPU: 1 PID: 1 Comm: init Not tainted 3.14.0-rc6-koelsch-reference-00362-gf29bb90d4995-dirty #164
Backtrace:
[<c00120f4>] (dump_backtrace) from [<c0012294>] (show_stack+0x18/0x1c)
 r6:eec799c0 r5:ee49ce40 r4:00000000 r3:00000204
[<c001227c>] (show_stack) from [<c032e46c>] (dump_stack+0x70/0x8c)
[<c032e3fc>] (dump_stack) from [<c032c978>] (panic+0x90/0x1ec)
 r4:eec799c0 r3:00000001
[<c032c8ec>] (panic) from [<c0025d3c>] (do_exit+0x494/0x8bc)
 r3:eec73dc0 r2:00000000 r1:00000007 r0:c03d33ac
 r7:ee49ce78
[<c00258a8>] (do_exit) from [<c00262f4>] (do_group_exit+0xa4/0xd0)
 r7:ee431040
[<c0026250>] (do_group_exit) from [<c0031854>] (get_signal_to_deliver+0x4bc/0x520)
 r7:ee431040 r6:eec7bee4 r5:eec7a000 r4:01060013
[<c0031398>] (get_signal_to_deliver) from [<c00115f4>] (do_signal+0xa8/0x3c0)
 r10:00000000 r9:eec7a000 r8:00000000 r7:eec7a000 r6:00000000 r5:00000000
 r4:eec7bfb0
[<c001154c>] (do_signal) from [<c0011c1c>] (do_work_pending+0x54/0x9c)
 r10:00000000 r8:00000000 r7:00000000 r6:00000000 r5:eec7a000 r4:eec7bfb0
[<c0011bc8>] (do_work_pending) from [<c000ed40>] (work_pending+0xc/0x20)
 r6:ffffffff r5:00000030 r4:b6ef0bc0 r3:eec799c0
CPU0: stopping
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc6-koelsch-reference-00362-gf29bb90d4995-dirty #164
Backtrace:
[<c00120f4>] (dump_backtrace) from [<c0012294>] (show_stack+0x18/0x1c)
 r6:c0468844 r5:00000000 r4:00000000 r3:00200000
[<c001227c>] (show_stack) from [<c032e46c>] (dump_stack+0x70/0x8c)
[<c032e3fc>] (dump_stack) from [<c0013fe4>] (handle_IPI+0xcc/0x164)
 r4:c0484b98 r3:c046eae0
[<c0013f18>] (handle_IPI) from [<c0009314>] (gic_handle_irq+0x58/0x60)
 r5:c0461f18 r4:f0002000
[<c00092bc>] (gic_handle_irq) from [<c0012e00>] (__irq_svc+0x40/0x50)
Exception stack(0xc0461f18 to 0xc0461f60)
1f00:                                                       ef1ed698 00000000
1f20: 006e076b 00000000 c045d698 2ed90000 60000113 ef1ed698 c0468380 413fc0f2
1f40: ef7fccc0 c0461f8c c0461f60 c0461f60 c0067e14 c0067e18 60000113 ffffffff
 r6:ffffffff r5:60000113 r4:c0067e18 r3:c0067e14
[<c0067d68>] (rcu_idle_exit) from [<c005f660>] (cpu_startup_entry+0xe4/0x118)
 r8:c0468380 r7:c03357f4 r6:c0468454 r5:c0484780 r4:c0460000
[<c005f57c>] (cpu_startup_entry) from [<c032b228>] (rest_init+0x68/0x80)
 r7:c0454d90 r3:00000000
[<c032b1c0>] (rest_init) from [<c042bb04>] (start_kernel+0x2fc/0x358)
[<c042b808>] (start_kernel) from [<40008074>] (0x40008074)

Difference in clk_summary output between working and failed case just before
"Freeing unused kernel memory" is:

-             ether                        2            2    65000000          0
+             ether                        1            1    65000000          0

so at that point the clock is still enabled.
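
For readers of that diff: the first two numeric columns of clk_summary are the
enable and prepare counts. A minimal sketch of how the 2 versus 1 could come
about, assuming (the thread does not confirm this) that the Ethernet driver
holds one reference and the workaround, when present, holds one more.
`struct ref_clk`, `ref_clk_enable()` and `ether_enable_count()` are
hypothetical names used only for this illustration.

```c
/* Hypothetical model of one clock's enable count; not the kernel's struct clk. */
struct ref_clk {
	unsigned int enable_count;
};

/* Take one reference on the clock. */
void ref_clk_enable(struct ref_clk *clk)
{
	clk->enable_count++;
}

/*
 * Enable count as it would show up in clk_summary, assuming one reference
 * from the Ethernet driver plus, optionally, one from the workaround.
 */
unsigned int ether_enable_count(int with_workaround)
{
	struct ref_clk ether = { 0 };

	if (with_workaround)
		ref_clk_enable(&ether);	/* reference taken early by the workaround */
	ref_clk_enable(&ether);		/* reference assumed to come from the driver */
	return ether.enable_count;
}
```

Under these assumptions the clock is running in both cases at this point; the
only difference is that the workaround's extra reference would keep it running
even if the other reference were dropped later.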

You once mentioned that if you try to access a module's registers while its
MSTP clock is not running you may get an exception (on some SoCs).
Is this such an exception?

Note that I never got exceptions when accessing QSPI or MSIOF on r8a7791
with the respective MSTP clocks disabled. I also didn't get one when Ethernet
stopped working after the is_enabled() MSTP fix. That was before NFS root
was mounted, though.

Running actual executables after mounting is different. Demand paging is
involved there. Perhaps there's a bug somewhere in nfs root mmap() or in
the Ethernet driver, not propagating the errors due to the lost Ethernet clock,
so /sbin/init starts running an uninitialized page?

Gr{oetje,eeting}s,

                        Geert



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (8 preceding siblings ...)
  2014-03-14 13:02 ` Geert Uytterhoeven
@ 2014-03-14 14:13 ` Ben Dooks
  2014-03-14 14:26 ` Laurent Pinchart
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 14:13 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 12:43, Laurent Pinchart wrote:
> Hi Geert,
>
> On Friday 14 March 2014 13:39:43 Geert Uytterhoeven wrote:
>> On Fri, Mar 14, 2014 at 12:10 PM, Ben Dooks wrote:
>>> On 14/03/14 11:02, Laurent Pinchart wrote:
>>>> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
>>>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>>>>
>>>>> Due to issues with runtime PM clock management, clocks not explicitly
>>>>> managed by their drivers may not be enabled at all, or be inadvertently
>>>>> disabled by the clk_disable_unused() late initcall.
>>>>>
>>>>> Until this is fixed, add a temporary workaround, calling
>>>>> shmobile_clk_workaround() with enable = true.
>>>>>
>>>>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>>>>> thermal. More clocks can be added if needed.
>>>>
>>>> This should do the job, but as you mentioned, it's a crude hack. As we're
>>>> targeting v3.16, is there a chance we could fix the problem properly
>>>> instead ?
>>
>> Of course the goal is to fix it for real, so the crude hack will no longer
>> be needed. But for now, it looks like a good short-term workaround.
>>
>>> The best fix would be to re-enable the PM and find out what is
>>
>> Sure, but in a multiplatform-aware way.
>
> Of course. Are you working on that, or should I give it a try ? Would you like
> to discuss this ?

I did send a patch to try and re-enable the drivers/sh build for
the shmobile pm_runtime code. I will try and re-look at this over
the weekend once I have sorted out the other work I have been trying
to get done.

>
>>> actually causing the external abort. However currently there is
>>> no information in the manuals about anything we could find out from
>>> the AXI busses as to what the source actually is.
>>
>> I re-applied your patch "ARM: shmobile: compile drivers/sh for
>> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get the external
>> abort.
>>
>> Some experimenting revealed it's due to the "ether" clock in the
>> clk_enables[] array. As long as that's enabled early, the system seems to
>> boot fine with your patch.
>
> At what point do you get the external abort without the ether clock workaround
> ?

I thought it was early in the sequence but it seems to be coming
sometime after init is started.


-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (9 preceding siblings ...)
  2014-03-14 14:13 ` Ben Dooks
@ 2014-03-14 14:26 ` Laurent Pinchart
  2014-03-14 14:43 ` Laurent Pinchart
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Laurent Pinchart @ 2014-03-14 14:26 UTC (permalink / raw)
  To: linux-sh

Hi Ben,

On Friday 14 March 2014 14:13:59 Ben Dooks wrote:
> On 14/03/14 12:43, Laurent Pinchart wrote:
> > On Friday 14 March 2014 13:39:43 Geert Uytterhoeven wrote:
> >> On Fri, Mar 14, 2014 at 12:10 PM, Ben Dooks wrote:
> >>> On 14/03/14 11:02, Laurent Pinchart wrote:
> >>>> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
> >>>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> >>>>> 
> >>>>> Due to issues with runtime PM clock management, clocks not explicitly
> >>>>> managed by their drivers may not be enabled at all, or be
> >>>>> inadvertently disabled by the clk_disable_unused() late initcall.
> >>>>> 
> >>>>> Until this is fixed, add a temporary workaround, calling
> >>>>> shmobile_clk_workaround() with enable = true.
> >>>>> 
> >>>>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod,
> >>>>> and thermal. More clocks can be added if needed.
> >>>> 
> >>>> This should do the job, but as you mentioned, it's a crude hack. As
> >>>> we're targeting v3.16, is there a chance we could fix the problem
> >>>> properly instead ?
> >> 
> >> Of course the goal is to fix it for real, so the crude hack will no
> >> longer be needed. But for now, it looks like a good short-term
> >> workaround.
> >> 
> >>> The best fix would be to re-enable the PM and find out what is
> >> 
> >> Sure, but in a multiplatform-aware way.
> > 
> > Of course. Are you working on that, or should I give it a try ? Would you
> > like to discuss this ?
> 
> I did send a patch to try and re-enable the drivers/sh build for
> the shmobile pm_runtime code. I will try and re-look at this over
> the weekend once I have sorted out the other work I have been trying
> to get done.

I remember that. If I'm not mistaken the issue was that the code would register
the Renesas pm clock notifier on non-Renesas platforms when running a
multi-platform kernel.

I'm wondering whether the approach proposed by Felipe Balbi in 
https://lkml.org/lkml/2014/1/31/290 wouldn't be a better solution than custom 
code. I have a few concerns with the proposed patch but nothing that can't be 
solved.

Could you please coordinate with Geert (as I believe he's working on this) and 
Felipe ? Feel free to CC me. I can also be available for a chat on IRC if 
needed.

> >>> actually causing the external abort. However currently there is
> >>> no information in the manuals about anything we could find out from
> >>> the AXI busses as to what the source actually is.
> >> 
> >> I re-applied your patch "ARM: shmobile: compile drivers/sh for
> >> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get the
> >> external abort.
> >> 
> >> Some experimenting revealed it's due to the "ether" clock in the
> >> clk_enables[] array. As long as that's enabled early, the system seems to
> >> boot fine with your patch.
> > 
> > At what point do you get the external abort without the ether clock
> > workaround ?
> 
> I thought it was early in the sequence but it seems to be coming sometime
> after init is started.

-- 
Regards,

Laurent Pinchart



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (10 preceding siblings ...)
  2014-03-14 14:26 ` Laurent Pinchart
@ 2014-03-14 14:43 ` Laurent Pinchart
  2014-03-14 14:45 ` Ben Dooks
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Laurent Pinchart @ 2014-03-14 14:43 UTC (permalink / raw)
  To: linux-sh

Hi Geert,

On Friday 14 March 2014 14:02:59 Geert Uytterhoeven wrote:
> On Fri, Mar 14, 2014 at 1:43 PM, Laurent Pinchart wrote:
> > > >> This should do the job, but as you mentioned, it's a crude hack. As
> > > >> we're targeting v3.16, is there a chance we could fix the problem
> > > >> properly instead ?
> > > 
> > > Of course the goal is to fix it for real, so the crude hack will no
> > > longer be needed. But for now, it looks like a good short-term
> > > workaround.
> > > 
> > > > The best fix would be to re-enable the PM and find out what is
> > > 
> > > Sure, but in a multiplatform-aware way.
> > 
> > Of course. Are you working on that, or should I give it a try ? Would you
> > like to discuss this ?
> 
> Yes, I plan to work on this. But all input is welcome, of course.

Any opinion on https://lkml.org/lkml/2014/1/31/290 ?

> >> > actually causing the external abort. However currently there is
> >> > no information in the manuals about anything we could find out from
> >> > the AXI busses as to what the source actually is.
> >> 
> >> I re-applied your patch "ARM: shmobile: compile drivers/sh for
> >> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get the
> >> external abort.
> >> 
> >> Some experimenting revealed it's due to the "ether" clock in the
> >> clk_enables[] array. As long as that's enabled early, the system seems to
> >> boot fine with your patch.
> > 
> > At what point do you get the external abort without the ether clock
> > workaround ?
> 
> When userspace starts:
> 
> Freeing unused kernel memory: 204K (c042b000 - c045e000)
> Unhandled fault: imprecise external abort (0x1406) at 0x00000000
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
> 
> CPU: 1 PID: 1 Comm: init Not tainted
> 3.14.0-rc6-koelsch-reference-00362-gf29bb90d4995-dirty #164
> Backtrace:
> [<c00120f4>] (dump_backtrace) from [<c0012294>] (show_stack+0x18/0x1c)
>  r6:eec799c0 r5:ee49ce40 r4:00000000 r3:00000204
> [<c001227c>] (show_stack) from [<c032e46c>] (dump_stack+0x70/0x8c)
> [<c032e3fc>] (dump_stack) from [<c032c978>] (panic+0x90/0x1ec)
>  r4:eec799c0 r3:00000001
> [<c032c8ec>] (panic) from [<c0025d3c>] (do_exit+0x494/0x8bc)
>  r3:eec73dc0 r2:00000000 r1:00000007 r0:c03d33ac
>  r7:ee49ce78
> [<c00258a8>] (do_exit) from [<c00262f4>] (do_group_exit+0xa4/0xd0)
>  r7:ee431040
> [<c0026250>] (do_group_exit) from [<c0031854>]
> (get_signal_to_deliver+0x4bc/0x520)
>  r7:ee431040 r6:eec7bee4 r5:eec7a000 r4:01060013
> [<c0031398>] (get_signal_to_deliver) from [<c00115f4>]
> (do_signal+0xa8/0x3c0) r10:00000000 r9:eec7a000 r8:00000000 r7:eec7a000
> r6:00000000 r5:00000000 r4:eec7bfb0
> [<c001154c>] (do_signal) from [<c0011c1c>] (do_work_pending+0x54/0x9c)
>  r10:00000000 r8:00000000 r7:00000000 r6:00000000 r5:eec7a000 r4:eec7bfb0
> [<c0011bc8>] (do_work_pending) from [<c000ed40>] (work_pending+0xc/0x20)
>  r6:ffffffff r5:00000030 r4:b6ef0bc0 r3:eec799c0
> CPU0: stopping
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 3.14.0-rc6-koelsch-reference-00362-gf29bb90d4995-dirty #164
> Backtrace:
> [<c00120f4>] (dump_backtrace) from [<c0012294>] (show_stack+0x18/0x1c)
>  r6:c0468844 r5:00000000 r4:00000000 r3:00200000
> [<c001227c>] (show_stack) from [<c032e46c>] (dump_stack+0x70/0x8c)
> [<c032e3fc>] (dump_stack) from [<c0013fe4>] (handle_IPI+0xcc/0x164)
>  r4:c0484b98 r3:c046eae0
> [<c0013f18>] (handle_IPI) from [<c0009314>] (gic_handle_irq+0x58/0x60)
>  r5:c0461f18 r4:f0002000
> [<c00092bc>] (gic_handle_irq) from [<c0012e00>] (__irq_svc+0x40/0x50)
> Exception stack(0xc0461f18 to 0xc0461f60)
> 1f00:                                                       ef1ed698
> 00000000 1f20: 006e076b 00000000 c045d698 2ed90000 60000113 ef1ed698
> c0468380 413fc0f2 1f40: ef7fccc0 c0461f8c c0461f60 c0461f60 c0067e14
> c0067e18 60000113 ffffffff r6:ffffffff r5:60000113 r4:c0067e18 r3:c0067e14
> [<c0067d68>] (rcu_idle_exit) from [<c005f660>]
> (cpu_startup_entry+0xe4/0x118) r8:c0468380 r7:c03357f4 r6:c0468454
> r5:c0484780 r4:c0460000
> [<c005f57c>] (cpu_startup_entry) from [<c032b228>] (rest_init+0x68/0x80)
>  r7:c0454d90 r3:00000000
> [<c032b1c0>] (rest_init) from [<c042bb04>] (start_kernel+0x2fc/0x358)
> [<c042b808>] (start_kernel) from [<40008074>] (0x40008074)

As the external abort is imprecise the backtrace is pretty useless :-/ All we 
can tell from the DFSR value 0x1406 is that the fault was generated by a read 
access not related to a cache maintenance operation. Bit 12 is an 
implementation defined bit that might provide more information, but it isn't 
documented in the R8A7791 datasheet.

Could you try to enable LPAE ? The DFSR format is slightly different in that 
case, it may provide more information.
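
For reference, the decoding of 0x1406 described above can be sketched in plain
C. This is a small userspace illustration, not kernel code: the names
dfsr_fields, dfsr_decode and dfsr_fs_name are made up for this example, and the
bit positions are those of the ARMv7-A short-descriptor (non-LPAE) DFSR format
as I understand it from the ARM ARM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: DFSR fields, short-descriptor format only. */
struct dfsr_fields {
	unsigned int fs;     /* FS[4:0], built from DFSR bit 10 and bits [3:0] */
	unsigned int domain; /* bits [7:4] */
	bool lpae;           /* bit 9: set when the long-descriptor format is used */
	bool write;          /* bit 11 (WnR): true for a write, false for a read */
	bool ext;            /* bit 12 (ExT): implementation defined */
	bool cache_maint;    /* bit 13 (CM): cache maintenance operation fault */
};

/* Split a raw DFSR value into its fields. */
struct dfsr_fields dfsr_decode(uint32_t dfsr)
{
	struct dfsr_fields f;

	f.fs = (dfsr & 0xf) | ((dfsr >> 10) & 0x1) << 4;
	f.domain = (dfsr >> 4) & 0xf;
	f.lpae = (dfsr >> 9) & 0x1;
	f.write = (dfsr >> 11) & 0x1;
	f.ext = (dfsr >> 12) & 0x1;
	f.cache_maint = (dfsr >> 13) & 0x1;
	return f;
}

/* Name only the two external abort encodings relevant to this thread. */
const char *dfsr_fs_name(unsigned int fs)
{
	switch (fs) {
	case 0x08:
		return "synchronous external abort";
	case 0x16:
		return "asynchronous (imprecise) external abort";
	default:
		return "other (see the ARM ARM)";
	}
}
```

With this sketch, dfsr_decode(0x1406) yields FS = 0x16 with WnR and CM clear
and ExT (bit 12) set, i.e. an asynchronous external abort on a read that is
not a cache maintenance operation.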

> Difference in clk_summary output between working and failed case just before
> "Freeing unused kernel memory" is:
> 
> -       ether                        2            2    65000000          0
> +       ether                        1            1    65000000          0
> 
> so at that point the clock is still enabled.
> 
> You once mentioned that if you try to access a module's registers while its
> MSTP clock is not running you may get an exception (on some SoCs).
> Is this such an exception?

Yes, those are the same symptoms.

> Note that I never got exceptions when accessing QSPI or MSIOF on r8a7791
> with the respective MSTP clocks disabled. I also didn't get one when
> Ethernet stopped working after the is_enabled() MSTP fix. That was before
> NFS root was mounted, though.
> 
> Running actual executables after mounting is different. Demand paging is
> involved there. Perhaps there's a bug somewhere in nfs root mmap() or in the
> Ethernet driver, not propagating the errors due to the lost Ethernet clock,
> so /sbin/init starts running an uninitialized page?

I don't think so. According to 
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211h/Caccdbdh.html, 
external aborts are errors "that occur in the memory system other than those 
that are detected by an MMU." That looks really device-related to me.

-- 
Regards,

Laurent Pinchart



* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (11 preceding siblings ...)
  2014-03-14 14:43 ` Laurent Pinchart
@ 2014-03-14 14:45 ` Ben Dooks
  2014-03-14 15:51 ` Ben Dooks
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 14:45 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 14:43, Laurent Pinchart wrote:
> Hi Geert,
>
> On Friday 14 March 2014 14:02:59 Geert Uytterhoeven wrote:
>> On Fri, Mar 14, 2014 at 1:43 PM, Laurent Pinchart wrote:
>>>>>> This should do the job, but as you mentioned, it's a crude hack. As
>>>>>> we're targeting v3.16, is there a chance we could fix the problem
>>>>>> properly instead ?
>>>>
>>>> Of course the goal is to fix it for real, so the crude hack will no
>>>> longer be needed. But for now, it looks like a good short-term
>>>> workaround.
>>>>
>>>>> The best fix would be to re-enable the PM and find out what is
>>>>
>>>> Sure, but in a multiplatform-aware way.
>>>
>>> Of course. Are you working on that, or should I give it a try ? Would you
>>> like to discuss this ?
>>
>> Yes, I plan to work on this. But all input is welcome, of course.
>
> Any opinion on https://lkml.org/lkml/2014/1/31/290 ?
>
>>>>> actually causing the external abort. However currently there is
>>>>> no information in the manuals about anything we could find out from
>>>>> the AXI busses as to what the source actually is.
>>>>
>>>> I re-applied your patch "ARM: shmobile: compile drivers/sh for
>>>> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get the
>>>> external abort.
>>>>
>>>> Some experimenting revealed it's due to the "ether" clock in the
>>>> clk_enables[] array. As long as that's enabled early, the system seems to
>>>> boot fine with your patch.
>>>
>>> At what point do you get the external abort without the ether clock
>>> workaround ?
>>
>> When userspace starts:
>>
>> Freeing unused kernel memory: 204K (c042b000 - c045e000)
>> Unhandled fault: imprecise external abort (0x1406) at 0x00000000
>> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007
>>
>> CPU: 1 PID: 1 Comm: init Not tainted
>> 3.14.0-rc6-koelsch-reference-00362-gf29bb90d4995-dirty #164
>> Backtrace:
>> [<c00120f4>] (dump_backtrace) from [<c0012294>] (show_stack+0x18/0x1c)
>>   r6:eec799c0 r5:ee49ce40 r4:00000000 r3:00000204
>> [<c001227c>] (show_stack) from [<c032e46c>] (dump_stack+0x70/0x8c)
>> [<c032e3fc>] (dump_stack) from [<c032c978>] (panic+0x90/0x1ec)
>>   r4:eec799c0 r3:00000001
>> [<c032c8ec>] (panic) from [<c0025d3c>] (do_exit+0x494/0x8bc)
>>   r3:eec73dc0 r2:00000000 r1:00000007 r0:c03d33ac
>>   r7:ee49ce78
>> [<c00258a8>] (do_exit) from [<c00262f4>] (do_group_exit+0xa4/0xd0)
>>   r7:ee431040
>> [<c0026250>] (do_group_exit) from [<c0031854>]
>> (get_signal_to_deliver+0x4bc/0x520)
>>   r7:ee431040 r6:eec7bee4 r5:eec7a000 r4:01060013
>> [<c0031398>] (get_signal_to_deliver) from [<c00115f4>]
>> (do_signal+0xa8/0x3c0) r10:00000000 r9:eec7a000 r8:00000000 r7:eec7a000
>> r6:00000000 r5:00000000 r4:eec7bfb0
>> [<c001154c>] (do_signal) from [<c0011c1c>] (do_work_pending+0x54/0x9c)
>>   r10:00000000 r8:00000000 r7:00000000 r6:00000000 r5:eec7a000 r4:eec7bfb0
>> [<c0011bc8>] (do_work_pending) from [<c000ed40>] (work_pending+0xc/0x20)
>>   r6:ffffffff r5:00000030 r4:b6ef0bc0 r3:eec799c0
>> CPU0: stopping
>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted
>> 3.14.0-rc6-koelsch-reference-00362-gf29bb90d4995-dirty #164
>> Backtrace:
>> [<c00120f4>] (dump_backtrace) from [<c0012294>] (show_stack+0x18/0x1c)
>>   r6:c0468844 r5:00000000 r4:00000000 r3:00200000
>> [<c001227c>] (show_stack) from [<c032e46c>] (dump_stack+0x70/0x8c)
>> [<c032e3fc>] (dump_stack) from [<c0013fe4>] (handle_IPI+0xcc/0x164)
>>   r4:c0484b98 r3:c046eae0
>> [<c0013f18>] (handle_IPI) from [<c0009314>] (gic_handle_irq+0x58/0x60)
>>   r5:c0461f18 r4:f0002000
>> [<c00092bc>] (gic_handle_irq) from [<c0012e00>] (__irq_svc+0x40/0x50)
>> Exception stack(0xc0461f18 to 0xc0461f60)
>> 1f00:                                                       ef1ed698
>> 00000000 1f20: 006e076b 00000000 c045d698 2ed90000 60000113 ef1ed698
>> c0468380 413fc0f2 1f40: ef7fccc0 c0461f8c c0461f60 c0461f60 c0067e14
>> c0067e18 60000113 ffffffff r6:ffffffff r5:60000113 r4:c0067e18 r3:c0067e14
>> [<c0067d68>] (rcu_idle_exit) from [<c005f660>]
>> (cpu_startup_entry+0xe4/0x118) r8:c0468380 r7:c03357f4 r6:c0468454
>> r5:c0484780 r4:c0460000
>> [<c005f57c>] (cpu_startup_entry) from [<c032b228>] (rest_init+0x68/0x80)
>>   r7:c0454d90 r3:00000000
>> [<c032b1c0>] (rest_init) from [<c042bb04>] (start_kernel+0x2fc/0x358)
>> [<c042b808>] (start_kernel) from [<40008074>] (0x40008074)
>
> As the external abort is imprecise the backtrace is pretty useless :-/ All we
> can tell from the DFSR value 0x1406 is that the fault was generated by a read
> access not related to a cache maintenance operation. Bit 12 is an
> implementation defined bit that might provide more information, but it isn't
> documented in the R8A7791 datasheet.
>
> Could you try to enable LPAE ? The DFSR format is slightly different in that
> case, it may provide more information.
>
>> Difference in clk_summary output between working and failed case just before
>> "Freeing unused kernel memory" is:
>>
>> -       ether                        2            2    65000000          0
>> +       ether                        1            1    65000000          0
>>
>> so at that point the clock is still enabled.
>>
>> You once mentioned that if you try to access a module's registers while its
>> MSTP clock is not running you may get an exception (on some SoCs).
>> Is this such an exception?
>
> Yes, those are the same symptoms.
>
>> Note that I never got exceptions when accessing QSPI or MSIOF on r8a7791
>> with the respective MSTP clocks disabled. I also didn't get one when
>> Ethernet stopped working after the is_enabled() MSTP fix. That was before
>> NFS root was mounted, though.
>>
>> Running actual executables after mounting is different. Demand paging is
>> involved there. Perhaps there's a bug somewhere in nfs root mmap() or in the
>> Ethernet driver, not propagating the errors due to the lost Ethernet clock,
>> so /sbin/init starts running an uninitialized page?
>
> I don't think so. According to
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211h/Caccdbdh.html,
> external aborts are errors "that occur in the memory system other than those
> that are detected by an MMU." That looks really device-related to me.

I've also had these when trying to access a bad address for one of the
AXI busses (IICC).



-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (12 preceding siblings ...)
  2014-03-14 14:45 ` Ben Dooks
@ 2014-03-14 15:51 ` Ben Dooks
  2014-03-14 16:48 ` Magnus Damm
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 15:51 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 12:39, Geert Uytterhoeven wrote:
> Hi Ben, Laurent,
>
> On Fri, Mar 14, 2014 at 12:10 PM, Ben Dooks <ben.dooks@codethink.co.uk> wrote:
>> On 14/03/14 11:02, Laurent Pinchart wrote:
>>> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
>>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>>>
>>>> Due to issues with runtime PM clock management, clocks not explicitly
>>>> managed by their drivers may not be enabled at all, or be inadvertently
>>>> disabled by the clk_disable_unused() late initcall.
>>>>
>>>> Until this is fixed, add a temporary workaround, calling
>>>> shmobile_clk_workaround() with enable = true.
>>>>
>>>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>>>> thermal. More clocks can be added if needed.
>>>
>>> This should do the job, but as you mentioned, it's a crude hack. As we're
>>> targeting v3.16, is there a chance we could fix the problem properly
>>> instead ?
>
> Of course the goal is to fix it for real, so the crude hack will no longer
> be needed. But for now, it looks like a good short-term workaround.
>
>> The best fix would be to re-enable the PM and find out what is
>
> Sure, but in a multiplatform-aware way.
>
>> actually causing the external abort. However currently there is
>> no information in the manuals about anything we could find out from
>> the AXI busses as to what the source actually is.
>
> I re-applied your patch "ARM: shmobile: compile drivers/sh for
> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get
> the external abort.
>
> Some experimenting revealed it's due to the "ether" clock in the
> clk_enables[] array. As long as that's enabled early, the system seems

I did post an update to ensure the MDIO bus is accessed with the clock
enabled, but that has not cured the issue for me. I wonder if some
other part of the driver is also accessing the hardware without
the correct pm accesses.

-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (13 preceding siblings ...)
  2014-03-14 15:51 ` Ben Dooks
@ 2014-03-14 16:48 ` Magnus Damm
  2014-03-14 17:11 ` Ben Dooks
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Magnus Damm @ 2014-03-14 16:48 UTC (permalink / raw)
  To: linux-sh

Hi Ben,

On Sat, Mar 15, 2014 at 12:51 AM, Ben Dooks <ben.dooks@codethink.co.uk> wrote:
> On 14/03/14 12:39, Geert Uytterhoeven wrote:
>>
>> Hi Ben, Laurent,
>>
>> On Fri, Mar 14, 2014 at 12:10 PM, Ben Dooks <ben.dooks@codethink.co.uk>
>> wrote:
>>>
>>> On 14/03/14 11:02, Laurent Pinchart wrote:
>>>>
>>>> On Thursday 13 March 2014 15:29:30 Geert Uytterhoeven wrote:
>>>>>
>>>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>>>>
>>>>> Due to issues with runtime PM clock management, clocks not explicitly
>>>>> managed by their drivers may not be enabled at all, or be inadvertently
>>>>> disabled by the clk_disable_unused() late initcall.
>>>>>
>>>>> Until this is fixed, add a temporary workaround, calling
>>>>> shmobile_clk_workaround() with enable = true.
>>>>>
>>>>> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
>>>>> thermal. More clocks can be added if needed.
>>>>
>>>>
>>>> This should do the job, but as you mentioned, it's a crude hack. As
>>>> we're
>>>> targeting v3.16, is there a chance we could fix the problem properly
>>>> instead ?
>>
>>
>> Of course the goal is to fix it for real, so the crude hack will no longer
>> be needed. But for now, it looks like a good short-term workaround.
>>
>>> The best fix would be to re-enable the PM and find out what is
>>
>>
>> Sure, but in a multiplatform-aware way.
>>
>>> actually causing the external abort. However currently there is
>>> no information in the manuals about anything we could find out from
>>> the AXI busses as to what the source actually is.
>>
>>
>> I re-applied your patch "ARM: shmobile: compile drivers/sh for
>> CONFIG_ARCH_SHMOBILE_MULTI", and surprisingly, I no longer get
>> the external abort.
>>
>> Some experimenting revealed it's due to the "ether" clock in the
>> clk_enables[] array. As long as that's enabled early, the system seems
>
>
> I did post an update to ensure the MDIO bus is accessed with the clock
> enabled, but that has not cured the issue for me. I wonder if some
> other part of the driver is also accessing the hardware without
> the correct pm accesses.

That would not surprise me. But in that case it would trigger for both
multiplatform and legacy, don't you think?

If static enablement using the clock workaround fixes the
multiplatform case then it is most likely because the driver
assumes that Runtime PM controls the clock. Or perhaps there is some
hidden clock dependency that only triggers with CCF?

Thanks,

/ magnus


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (14 preceding siblings ...)
  2014-03-14 16:48 ` Magnus Damm
@ 2014-03-14 17:11 ` Ben Dooks
  2014-03-14 17:33 ` Ben Dooks
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 17:11 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 16:48, Magnus Damm wrote:
> Hi Ben,
>

>> I did post an update to ensure the MDIO bus is accessed with the clock
>> enabled, but that has not cured the issue for me. I wonder if some
>> other part of the driver is also accessing the hardware without
>> the correct pm accesses.
>
> That would not surprise me. But it would trigger both for
> multiplatform and legacy in such case, don't you think?
>
> If static enablement using the clock workaround fixes the
> multiplatform case then it is most likely related to that the driver
> assumes that Runtime PM controls the clock. Or perhaps some hidden
> clock dependency that only triggers with CCF?

I am not quite sure what is going on here. So far I am down to the
pm_runtime_enable(&pdev->dev) call setting the clock on and then the
pm_runtime_resume(&pdev->dev) call after it immediately shutting the
clock down!

I added code to do a WARN_ON(!__clk_is_enabled) on the read/write
calls and it has been triggering quite a bit.

> Thanks,
>
> / magnus
>


-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (15 preceding siblings ...)
  2014-03-14 17:11 ` Ben Dooks
@ 2014-03-14 17:33 ` Ben Dooks
  2014-03-14 17:55 ` Ben Dooks
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 17:33 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 16:48, Magnus Damm wrote:
> Hi Ben,
>
> That would not surprise me. But it would trigger both for
> multiplatform and legacy in such case, don't you think?
>
> If static enablement using the clock workaround fixes the
> multiplatform case then it is most likely related to that the driver
> assumes that Runtime PM controls the clock. Or perhaps some hidden
> clock dependency that only triggers with CCF?

It seems very sensitive to code changes (or possibly to the compiler).
In some but not all cases we are seeing runtime PM suspend the device,
and then the driver does something that requires clocks.

At the moment, depending on how much debug is added, it either works
or does not.

-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (16 preceding siblings ...)
  2014-03-14 17:33 ` Ben Dooks
@ 2014-03-14 17:55 ` Ben Dooks
  2014-03-14 18:20 ` Ben Dooks
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 17:55 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 17:33, Ben Dooks wrote:
> On 14/03/14 16:48, Magnus Damm wrote:
>> Hi Ben,
>>
>> That would not surprise me. But it would trigger both for
>> multiplatform and legacy in such case, don't you think?
>>
>> If static enablement using the clock workaround fixes the
>> multiplatform case then it is most likely related to that the driver
>> assumes that Runtime PM controls the clock. Or perhaps some hidden
>> clock dependency that only triggers with CCF?
>
> It seems very sensitive to code (or possibly compiler) in some but
> not all cases we are seeing the system pm the device and then the
> driver does something that requires clocks.
>
> At the moment the depending on how much debug it either works, or
> does not.

So far I am down to the following being executed

         pm_runtime_enable(&pdev->dev);
         pm_runtime_resume(&pdev->dev);

and by the time it gets to:

	read_mac_address(ndev, pd->mac_addr);

the ethernet unit's clock has already been disabled.



-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (17 preceding siblings ...)
  2014-03-14 17:55 ` Ben Dooks
@ 2014-03-14 18:20 ` Ben Dooks
  2014-03-17  1:15 ` Simon Horman
  2014-03-18  0:25 ` Simon Horman
  20 siblings, 0 replies; 22+ messages in thread
From: Ben Dooks @ 2014-03-14 18:20 UTC (permalink / raw)
  To: linux-sh

On 14/03/14 17:55, Ben Dooks wrote:
> On 14/03/14 17:33, Ben Dooks wrote:
>> On 14/03/14 16:48, Magnus Damm wrote:
>>> Hi Ben,
>>>
>>> That would not surprise me. But it would trigger both for
>>> multiplatform and legacy in such case, don't you think?
>>>
>>> If static enablement using the clock workaround fixes the
>>> multiplatform case then it is most likely related to that the driver
>>> assumes that Runtime PM controls the clock. Or perhaps some hidden
>>> clock dependency that only triggers with CCF?
>>
>> It seems very sensitive to code (or possibly compiler) in some but
>> not all cases we are seeing the system pm the device and then the
>> driver does something that requires clocks.
>>
>> At the moment the depending on how much debug it either works, or
>> does not.
>
> So far I am down to the following being executed
>
>          pm_runtime_enable(&pdev->dev);
>          pm_runtime_resume(&pdev->dev);
>
> and by the time it gets to:
>
>      read_mac_address(ndev, pd->mac_addr);
>
> the ethernet unit's clock has already been disabled.

From investigation, runtime PM is running a work-queue that
is causing the clock to be disabled. The only thing I can
think of is that we need to change the

 >          pm_runtime_enable(&pdev->dev);
 >          pm_runtime_resume(&pdev->dev);

to include a pm_runtime_get_sync() call.
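
To illustrate why holding a reference matters here, below is a toy userspace
model in C. It is only a sketch under my own assumptions, not kernel code and
not the sh_eth driver: every name starting with model_ is hypothetical, and
the model only mimics the general runtime PM rule that a device with a zero
usage count may be suspended by deferred idle work, while a device with a
held reference may not.

```c
#include <stdbool.h>

/* Hypothetical toy device state; this is not a kernel structure. */
struct model_dev {
	int usage_count;   /* references currently held by the driver */
	bool clock_on;     /* state of the module clock */
	bool idle_pending; /* an idle check is queued, as on a workqueue */
};

/* Mimics pm_runtime_resume(): powers up but takes no reference. */
void model_resume(struct model_dev *d)
{
	d->clock_on = true;
	d->idle_pending = true;
}

/* Mimics pm_runtime_get_sync(): takes a reference, then powers up. */
void model_get_sync(struct model_dev *d)
{
	d->usage_count++;
	d->clock_on = true;
	d->idle_pending = true;
}

/* Mimics pm_runtime_put(): drops the reference and queues an idle check. */
void model_put(struct model_dev *d)
{
	d->usage_count--;
	d->idle_pending = true;
}

/* The deferred idle work: suspend only if nobody holds a reference. */
void model_run_idle_work(struct model_dev *d)
{
	if (!d->idle_pending)
		return;
	d->idle_pending = false;
	if (d->usage_count == 0)
		d->clock_on = false;
}

/* A register access such as reading the MAC address needs the clock. */
bool model_read_mac_ok(const struct model_dev *d)
{
	return d->clock_on;
}
```

In this model, model_resume() followed by the idle work leaves the clock off
before the MAC address is read, which matches the symptom reported above.
With model_get_sync() the clock stays on until the matching model_put().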

-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (18 preceding siblings ...)
  2014-03-14 18:20 ` Ben Dooks
@ 2014-03-17  1:15 ` Simon Horman
  2014-03-18  0:25 ` Simon Horman
  20 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2014-03-17  1:15 UTC (permalink / raw)
  To: linux-sh

On Thu, Mar 13, 2014 at 03:29:30PM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> 
> Due to issues with runtime PM clock management, clocks not explicitly
> managed by their drivers may not be enabled at all, or be inadvertently
> disabled by the clk_disable_unused() late initcall.
> 
> Until this is fixed, add a temporary workaround, calling
> shmobile_clk_workaround() with enable = true.
> 
> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
> thermal. More clocks can be added if needed.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> ---
> Tested with and without CONFIG_PM_RUNTIME
> 
>  arch/arm/mach-shmobile/board-koelsch-reference.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)

I have tested that with this patch in place the thermal device appears to
report more sane temperatures.

before:
# cat /sys/devices/virtual/thermal/thermal_zone0/temp
-65000

after:
# cat /sys/devices/virtual/thermal/thermal_zone0/temp
35000


> 
> diff --git a/arch/arm/mach-shmobile/board-koelsch-reference.c b/arch/arm/mach-shmobile/board-koelsch-reference.c
> index a760f7f19bc9..040c6b99cde2 100644
> --- a/arch/arm/mach-shmobile/board-koelsch-reference.c
> +++ b/arch/arm/mach-shmobile/board-koelsch-reference.c
> @@ -107,9 +107,21 @@ static const struct clk_name clk_names[] = {
>  	{ "lvds0", "lvds.0", "rcar-du-r8a7791" },
>  };
>  
> +/*
> + * This is a really crude hack to work around core platform clock issues
> + */
> +static const struct clk_name clk_enables[] = {
> +	{ "ether", NULL, "ee700000.ethernet" },
> +	{ "i2c2", NULL, "e6530000.i2c" },
> +	{ "msiof0", NULL, "e6e20000.spi" },
> +	{ "qspi_mod", NULL, "e6b10000.spi" },
> +	{ "thermal", NULL, "e61f0000.thermal" },
> +};
> +
>  static void __init koelsch_add_standard_devices(void)
>  {
>  	shmobile_clk_workaround(clk_names, ARRAY_SIZE(clk_names), false);
> +	shmobile_clk_workaround(clk_enables, ARRAY_SIZE(clk_enables), true);
>  	r8a7791_add_dt_devices();
>  	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
>  
> -- 
> 1.7.9.5
> 


* Re: [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues
  2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
                   ` (19 preceding siblings ...)
  2014-03-17  1:15 ` Simon Horman
@ 2014-03-18  0:25 ` Simon Horman
  20 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2014-03-18  0:25 UTC (permalink / raw)
  To: linux-sh

On Thu, Mar 13, 2014 at 03:29:30PM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> 
> Due to issues with runtime PM clock management, clocks not explicitly
> managed by their drivers may not be enabled at all, or be inadvertently
> disabled by the clk_disable_unused() late initcall.
> 
> Until this is fixed, add a temporary workaround, calling
> shmobile_clk_workaround() with enable = true.
> 
> For now this enables the clocks for: ether, i2c2, msiof0, qspi_mod, and
> thermal. More clocks can be added if needed.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>

Thanks, I have queued this up.


Thread overview: 22+ messages
2014-03-13 14:29 [PATCH] [RFC] ARM: shmobile: koelsch-reference: Work around core clock issues Geert Uytterhoeven
2014-03-14  8:51 ` Simon Horman
2014-03-14  8:53 ` Magnus Damm
2014-03-14  9:09 ` Geert Uytterhoeven
2014-03-14  9:23 ` Magnus Damm
2014-03-14 11:02 ` Laurent Pinchart
2014-03-14 11:10 ` Ben Dooks
2014-03-14 12:39 ` Geert Uytterhoeven
2014-03-14 12:43 ` Laurent Pinchart
2014-03-14 13:02 ` Geert Uytterhoeven
2014-03-14 14:13 ` Ben Dooks
2014-03-14 14:26 ` Laurent Pinchart
2014-03-14 14:43 ` Laurent Pinchart
2014-03-14 14:45 ` Ben Dooks
2014-03-14 15:51 ` Ben Dooks
2014-03-14 16:48 ` Magnus Damm
2014-03-14 17:11 ` Ben Dooks
2014-03-14 17:33 ` Ben Dooks
2014-03-14 17:55 ` Ben Dooks
2014-03-14 18:20 ` Ben Dooks
2014-03-17  1:15 ` Simon Horman
2014-03-18  0:25 ` Simon Horman
