linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3]     ARM: rockchip: fix the SMP
@ 2015-06-05  4:47 Caesar Wang
  2015-06-05  4:47 ` [PATCH 1/3] ARM: rockchip: fix the CPU soft reset Caesar Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Caesar Wang @ 2015-06-05  4:47 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Caesar Wang,
	Russell King, linux-arm-kernel, linux-kernel

    Verified on url =
    https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-3.14


Caesar Wang (3):
  ARM: rockchip: fix the CPU soft reset
  ARM: rockchip: ensure CPU to enter WIF state
  ARM: rockchip: fix the SMP code style

 arch/arm/mach-rockchip/platsmp.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] ARM: rockchip: fix the CPU soft reset
  2015-06-05  4:47 [PATCH 0/3] ARM: rockchip: fix the SMP Caesar Wang
@ 2015-06-05  4:47 ` Caesar Wang
  2015-06-05  8:45   ` Heiko Stübner
  2015-06-05  4:47 ` [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state Caesar Wang
  2015-06-05  4:47 ` [PATCH 3/3] ARM: rockchip: fix the SMP code style Caesar Wang
  2 siblings, 1 reply; 9+ messages in thread
From: Caesar Wang @ 2015-06-05  4:47 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Caesar Wang,
	Russell King, linux-arm-kernel, linux-kernel

In general, the correct flow is:

cpu off:
    reset_control_assert
    regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))

cpu on:
    regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
    reset_control_deassert

You can repro it with bringing CPU up and down.
Says:(test scripts)

    cd /sys/devices/system/cpu/
    for i in $(seq 1000); do
    echo "================= $i ============"
        for j in $(seq 100); do
            while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000" ]]; do
                echo 0 > cpu1/online
                echo 0 > cpu2/online
                echo 0 > cpu3/online
            done
            while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do
                echo 1 > cpu1/online
                echo 1 > cpu2/online
                echo 1 > cpu3/online
            done
        done
    done

The following is reproducile log:
[34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
[34466.186824] Disabling non-boot CPUs ...
[34466.187509] CPU1: shutdown
[34466.188672] CPU2: shutdown
[34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
[34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
[34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>] (show_stack+0x20/0x24)
[34473.736711] [<c010a460>] (show_stack) from [<c0630670>] (dump_stack+0x70/0x8c)
[34473.736731] [<c0630670>] (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc)
[34473.736754] [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
[34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>] (__run_hrtimer+0x118/0x1e0)
[34473.736797] [<c014103c>] (__run_hrtimer) from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0)
[34473.736820] [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>] (arch_timer_handler_phys+0x38/0x48)
[34473.736844] [<c04d9994>] (arch_timer_handler_phys) from [<c016acc4>] (handle_percpu_devid_irq+0xb8/0x124)
[34473.736867] [<c016acc4>] (handle_percpu_devid_irq) from [<c0166a30>] (generic_handle_irq+0x30/0x40)
[34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>] (__handle_domain_irq+0x8c/0xb0)
[34473.736905] [<c0166d28>] (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
[34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>] (__irq_svc+0x40/0x50)
[34473.736936] Exception stack(0xee127f70 to 0xee127fb8)
[34473.736948] 7f60:                                     ffffffed 00000000 2dd6d000 00000000
[34473.736964] 7f80: ee126000 00000015 c0b46bac c0b46bac 0000406a 410fc0d1 00000000 ee127fc4
[34473.736979] 7fa0: ee127fb8 ee127fb8 c0107038 c010703c 600f0013 ffffffff
[34473.736995] [<c010b040>] (__irq_svc) from [<c010703c>] (arch_cpu_idle+0x40/0x48)
[34473.737013] [<c010703c>] (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
[34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>] (secondary_start_kernel+0x138/0x160)
[34473.737059] [<c010c254>] (secondary_start_kernel) from [<00100464>] (0x100464)
[34474.903740] SMP: failed to stop secondary CPUs
[34476.099964] SMP: failed to stop secondary CPUs
...

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
---

 arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 5b4ca3c..1230d3d 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
 			return PTR_ERR(rstc);
 		}
 
-		if (on)
+		if (on) {
+			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
 			reset_control_deassert(rstc);
-		else
+		} else {
 			reset_control_assert(rstc);
+			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
+		}
 
 		reset_control_put(rstc);
 	}
 
-	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
-	if (ret < 0) {
-		pr_err("%s: could not update power domain\n", __func__);
-		return ret;
-	}
-
 	ret = -1;
 	while (ret != on) {
 		ret = pmu_power_domain_is_on(pd);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state
  2015-06-05  4:47 [PATCH 0/3] ARM: rockchip: fix the SMP Caesar Wang
  2015-06-05  4:47 ` [PATCH 1/3] ARM: rockchip: fix the CPU soft reset Caesar Wang
@ 2015-06-05  4:47 ` Caesar Wang
  2015-06-05  6:32   ` Kever Yang
  2015-06-05  4:47 ` [PATCH 3/3] ARM: rockchip: fix the SMP code style Caesar Wang
  2 siblings, 1 reply; 9+ messages in thread
From: Caesar Wang @ 2015-06-05  4:47 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Caesar Wang,
	Russell King, linux-arm-kernel, linux-kernel

In idle mode, core1/2/3 of Cortex-A17 should be either power off or in
WFI/WFE state.
we can delay 1ms to ensure the CPU enter WFI state.

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
---

 arch/arm/mach-rockchip/platsmp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 1230d3d..978c357 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -316,6 +316,9 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
 #ifdef CONFIG_HOTPLUG_CPU
 static int rockchip_cpu_kill(unsigned int cpu)
 {
+	/* ensure CPU can enter the WFI/WFE state */
+	mdelay(1);
+
 	pmu_set_power_domain(0 + cpu, false);
 	return 1;
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] ARM: rockchip: fix the SMP code style
  2015-06-05  4:47 [PATCH 0/3] ARM: rockchip: fix the SMP Caesar Wang
  2015-06-05  4:47 ` [PATCH 1/3] ARM: rockchip: fix the CPU soft reset Caesar Wang
  2015-06-05  4:47 ` [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state Caesar Wang
@ 2015-06-05  4:47 ` Caesar Wang
  2 siblings, 0 replies; 9+ messages in thread
From: Caesar Wang @ 2015-06-05  4:47 UTC (permalink / raw)
  To: Heiko Stuebner
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Caesar Wang,
	Russell King, linux-arm-kernel, linux-kernel

Use the below scripts to check:
scripts/checkpatch.pl -f arch/arm/mach-rockchip/platsmp.c

Signed-off-by: Caesar Wang <wxt@rock-chips.com>
---

 arch/arm/mach-rockchip/platsmp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 978c357..7eea805 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -104,7 +104,7 @@ static int pmu_set_power_domain(int pd, bool on)
 		ret = pmu_power_domain_is_on(pd);
 		if (ret < 0) {
 			pr_err("%s: could not read power domain state\n",
-				 __func__);
+			       __func__);
 			return ret;
 		}
 	}
@@ -128,7 +128,7 @@ static int __cpuinit rockchip_boot_secondary(unsigned int cpu,
 
 	if (cpu >= ncores) {
 		pr_err("%s: cpu %d outside maximum number of cpus %d\n",
-							__func__, cpu, ncores);
+		       __func__, cpu, ncores);
 		return -ENXIO;
 	}
 
@@ -147,7 +147,7 @@ static int __cpuinit rockchip_boot_secondary(unsigned int cpu,
 		 * */
 		udelay(10);
 		writel(virt_to_phys(rockchip_secondary_startup),
-			sram_base_addr + 8);
+		       sram_base_addr + 8);
 		writel(0xDEADBEAF, sram_base_addr + 4);
 		dsb_sev();
 	}
@@ -326,7 +326,7 @@ static int rockchip_cpu_kill(unsigned int cpu)
 static void rockchip_cpu_die(unsigned int cpu)
 {
 	v7_exit_coherency_flush(louis);
-	while(1)
+	while (1)
 		cpu_do_idle();
 }
 #endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state
  2015-06-05  4:47 ` [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state Caesar Wang
@ 2015-06-05  6:32   ` Kever Yang
  2015-06-05  6:58     ` Caesar Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Kever Yang @ 2015-06-05  6:32 UTC (permalink / raw)
  To: Caesar Wang, Heiko Stuebner
  Cc: Russell King, Dmitry Torokhov, dianders, linux-kernel,
	linux-rockchip, linux-arm-kernel

Hi Caesar,

Subject typo WIF/WFI.

On 06/05/2015 12:47 PM, Caesar Wang wrote:
> In idle mode, core1/2/3 of Cortex-A17 should be either power off or in
> WFI/WFE state.
> we can delay 1ms to ensure the CPU enter WFI state.
>
> Signed-off-by: Caesar Wang <wxt@rock-chips.com>
> ---
>
>   arch/arm/mach-rockchip/platsmp.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
> index 1230d3d..978c357 100644
> --- a/arch/arm/mach-rockchip/platsmp.c
> +++ b/arch/arm/mach-rockchip/platsmp.c
> @@ -316,6 +316,9 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
>   #ifdef CONFIG_HOTPLUG_CPU
>   static int rockchip_cpu_kill(unsigned int cpu)
>   {
> +	/* ensure CPU can enter the WFI/WFE state */
> +	mdelay(1);
> +
Does it matter if core is not in WFI state when we want to power down it?

Thanks,
- Kever
>   	pmu_set_power_domain(0 + cpu, false);
>   	return 1;
>   }


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state
  2015-06-05  6:32   ` Kever Yang
@ 2015-06-05  6:58     ` Caesar Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Caesar Wang @ 2015-06-05  6:58 UTC (permalink / raw)
  To: Kever Yang, Heiko Stuebner
  Cc: Russell King, Dmitry Torokhov, dianders, linux-kernel,
	linux-rockchip, linux-arm-kernel



在 2015年06月05日 14:32, Kever Yang 写道:
> Hi Caesar,
>
> Subject typo WIF/WFI.
OK

>
> On 06/05/2015 12:47 PM, Caesar Wang wrote:
>> In idle mode, core1/2/3 of Cortex-A17 should be either power off or in
>> WFI/WFE state.
>> we can delay 1ms to ensure the CPU enter WFI state.
>>
>> Signed-off-by: Caesar Wang <wxt@rock-chips.com>
>> ---
>>
>>   arch/arm/mach-rockchip/platsmp.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/arm/mach-rockchip/platsmp.c 
>> b/arch/arm/mach-rockchip/platsmp.c
>> index 1230d3d..978c357 100644
>> --- a/arch/arm/mach-rockchip/platsmp.c
>> +++ b/arch/arm/mach-rockchip/platsmp.c
>> @@ -316,6 +316,9 @@ static void __init 
>> rockchip_smp_prepare_cpus(unsigned int max_cpus)
>>   #ifdef CONFIG_HOTPLUG_CPU
>>   static int rockchip_cpu_kill(unsigned int cpu)
>>   {
>> +    /* ensure CPU can enter the WFI/WFE state */
>> +    mdelay(1);
>> +
> Does it matter if core is not in WFI state when we want to power down it?
>
As HuangTao suggestion,

In gerenal, we need enter the WFI state when core power down, right?
That will be more better if the hardware can judge the state.

Anyway, we can delay 1ms or more to wait the WFI state.
That should be more better, right?

> Thanks,
> - Kever
>>       pmu_set_power_domain(0 + cpu, false);
>>       return 1;
>>   }
>
>
>
>



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset
  2015-06-05  4:47 ` [PATCH 1/3] ARM: rockchip: fix the CPU soft reset Caesar Wang
@ 2015-06-05  8:45   ` Heiko Stübner
  2015-06-05  9:00     ` Caesar Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Heiko Stübner @ 2015-06-05  8:45 UTC (permalink / raw)
  To: Caesar Wang
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Russell King,
	linux-arm-kernel, linux-kernel

Hi Caesar,


thanks for investigating this.

Am Freitag, 5. Juni 2015, 12:47:55 schrieb Caesar Wang:
> In general, the correct flow is:
> 
> cpu off:
>     reset_control_assert
>     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> 
> cpu on:
>     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>     reset_control_deassert
> 
> You can repro it with bringing CPU up and down.
> Says:(test scripts)
> 
>     cd /sys/devices/system/cpu/
>     for i in $(seq 1000); do
>     echo "================= $i ============"
>         for j in $(seq 100); do
>             while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> cpu3/online)" != "000" ]]; do echo 0 > cpu1/online
>                 echo 0 > cpu2/online
>                 echo 0 > cpu3/online
>             done
>             while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> cpu3/online)" != "111" ]]; do echo 1 > cpu1/online
>                 echo 1 > cpu2/online
>                 echo 1 > cpu3/online
>             done
>         done
>     done
> 
> The following is reproducile log:
> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
> [34466.186824] Disabling non-boot CPUs ...
> [34466.187509] CPU1: shutdown
> [34466.188672] CPU2: shutdown
> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP on
> cpu 0 [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
> [34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>]
> (show_stack+0x20/0x24) [34473.736711] [<c010a460>] (show_stack) from
> [<c0630670>] (dump_stack+0x70/0x8c) [34473.736731] [<c0630670>]
> (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc) [34473.736754]
> [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
> [34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>]
> (__run_hrtimer+0x118/0x1e0) [34473.736797] [<c014103c>] (__run_hrtimer)
> from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0) [34473.736820]
> [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>]
> (arch_timer_handler_phys+0x38/0x48) [34473.736844] [<c04d9994>]
> (arch_timer_handler_phys) from [<c016acc4>]
> (handle_percpu_devid_irq+0xb8/0x124) [34473.736867] [<c016acc4>]
> (handle_percpu_devid_irq) from [<c0166a30>] (generic_handle_irq+0x30/0x40)
> [34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>]
> (__handle_domain_irq+0x8c/0xb0) [34473.736905] [<c0166d28>]
> (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
> [34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>]
> (__irq_svc+0x40/0x50) [34473.736936] Exception stack(0xee127f70 to
> 0xee127fb8)
> [34473.736948] 7f60:                                     ffffffed 00000000
> 2dd6d000 00000000 [34473.736964] 7f80: ee126000 00000015 c0b46bac c0b46bac
> 0000406a 410fc0d1 00000000 ee127fc4 [34473.736979] 7fa0: ee127fb8 ee127fb8
> c0107038 c010703c 600f0013 ffffffff [34473.736995] [<c010b040>] (__irq_svc)
> from [<c010703c>] (arch_cpu_idle+0x40/0x48) [34473.737013] [<c010703c>]
> (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
> [34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>]
> (secondary_start_kernel+0x138/0x160) [34473.737059] [<c010c254>]
> (secondary_start_kernel) from [<00100464>] (0x100464) [34474.903740] SMP:
> failed to stop secondary CPUs
> [34476.099964] SMP: failed to stop secondary CPUs
> ...
> 
> Signed-off-by: Caesar Wang <wxt@rock-chips.com>
> ---
> 
>  arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/mach-rockchip/platsmp.c
> b/arch/arm/mach-rockchip/platsmp.c index 5b4ca3c..1230d3d 100644
> --- a/arch/arm/mach-rockchip/platsmp.c
> +++ b/arch/arm/mach-rockchip/platsmp.c
> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
>  			return PTR_ERR(rstc);
>  		}
> 
> -		if (on)
> +		if (on) {
> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>  			reset_control_deassert(rstc);
> -		else
> +		} else {
>  			reset_control_assert(rstc);
> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> +		}

you're loosing the return value of regmap_update_bits here, I guess it should 
look like below?

		if (on) {
			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
			if (ret < 0) {
				pr_err("%s: could not update power domain\n", __func__);
		  		reset_control_put(rstc);
				return ret;
			}

			reset_control_deassert(rstc);
		} else {
			reset_control_assert(rstc);

			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
			if (ret < 0) {
				pr_err("%s: could not update power domain\n", __func__);
		  		reset_control_put(rstc);
				return ret;
			}
		}


> 
>  		reset_control_put(rstc);
>  	}
> 
> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> -	if (ret < 0) {
> -		pr_err("%s: could not update power domain\n", __func__);
> -		return ret;
> -	}
> -
>  	ret = -1;
>  	while (ret != on) {
>  		ret = pmu_power_domain_is_on(pd);

second question - with this patch, what happens actually is

 cpu off:
     reset_control_assert
     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
     wait_for_power_domain_to_turn_off
 
 cpu on:
     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
     reset_control_deassert
     wait_for_power_domain_to_turn_on

So shouldn't the deassertion of the reset happen after the powerdomain 
sucessfull turned on? Like

 cpu on:
     regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
     wait_for_power_domain_to_turn_on
     reset_control_deassert


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset
  2015-06-05  8:45   ` Heiko Stübner
@ 2015-06-05  9:00     ` Caesar Wang
  2015-06-05  9:46       ` Heiko Stübner
  0 siblings, 1 reply; 9+ messages in thread
From: Caesar Wang @ 2015-06-05  9:00 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Russell King,
	linux-arm-kernel, linux-kernel

Heiko,

在 2015年06月05日 16:45, Heiko Stübner 写道:
> Hi Caesar,
>
>
> thanks for investigating this.
>
> Am Freitag, 5. Juni 2015, 12:47:55 schrieb Caesar Wang:
>> In general, the correct flow is:
>>
>> cpu off:
>>      reset_control_assert
>>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
>>
>> cpu on:
>>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>>      reset_control_deassert
>>
>> You can repro it with bringing CPU up and down.
>> Says:(test scripts)
>>
>>      cd /sys/devices/system/cpu/
>>      for i in $(seq 1000); do
>>      echo "================= $i ============"
>>          for j in $(seq 100); do
>>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
>> cpu3/online)" != "000" ]]; do echo 0 > cpu1/online
>>                  echo 0 > cpu2/online
>>                  echo 0 > cpu3/online
>>              done
>>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
>> cpu3/online)" != "111" ]]; do echo 1 > cpu1/online
>>                  echo 1 > cpu2/online
>>                  echo 1 > cpu3/online
>>              done
>>          done
>>      done
>>
>> The following is reproducile log:
>> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
>> [34466.186824] Disabling non-boot CPUs ...
>> [34466.187509] CPU1: shutdown
>> [34466.188672] CPU2: shutdown
>> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP on
>> cpu 0 [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
>> [34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>]
>> (show_stack+0x20/0x24) [34473.736711] [<c010a460>] (show_stack) from
>> [<c0630670>] (dump_stack+0x70/0x8c) [34473.736731] [<c0630670>]
>> (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc) [34473.736754]
>> [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
>> [34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>]
>> (__run_hrtimer+0x118/0x1e0) [34473.736797] [<c014103c>] (__run_hrtimer)
>> from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0) [34473.736820]
>> [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>]
>> (arch_timer_handler_phys+0x38/0x48) [34473.736844] [<c04d9994>]
>> (arch_timer_handler_phys) from [<c016acc4>]
>> (handle_percpu_devid_irq+0xb8/0x124) [34473.736867] [<c016acc4>]
>> (handle_percpu_devid_irq) from [<c0166a30>] (generic_handle_irq+0x30/0x40)
>> [34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>]
>> (__handle_domain_irq+0x8c/0xb0) [34473.736905] [<c0166d28>]
>> (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
>> [34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>]
>> (__irq_svc+0x40/0x50) [34473.736936] Exception stack(0xee127f70 to
>> 0xee127fb8)
>> [34473.736948] 7f60:                                     ffffffed 00000000
>> 2dd6d000 00000000 [34473.736964] 7f80: ee126000 00000015 c0b46bac c0b46bac
>> 0000406a 410fc0d1 00000000 ee127fc4 [34473.736979] 7fa0: ee127fb8 ee127fb8
>> c0107038 c010703c 600f0013 ffffffff [34473.736995] [<c010b040>] (__irq_svc)
>> from [<c010703c>] (arch_cpu_idle+0x40/0x48) [34473.737013] [<c010703c>]
>> (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
>> [34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>]
>> (secondary_start_kernel+0x138/0x160) [34473.737059] [<c010c254>]
>> (secondary_start_kernel) from [<00100464>] (0x100464) [34474.903740] SMP:
>> failed to stop secondary CPUs
>> [34476.099964] SMP: failed to stop secondary CPUs
>> ...
>>
>> Signed-off-by: Caesar Wang <wxt@rock-chips.com>
>> ---
>>
>>   arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
>>   1 file changed, 5 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/mach-rockchip/platsmp.c
>> b/arch/arm/mach-rockchip/platsmp.c index 5b4ca3c..1230d3d 100644
>> --- a/arch/arm/mach-rockchip/platsmp.c
>> +++ b/arch/arm/mach-rockchip/platsmp.c
>> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
>>   			return PTR_ERR(rstc);
>>   		}
>>
>> -		if (on)
>> +		if (on) {
>> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>>   			reset_control_deassert(rstc);
>> -		else
>> +		} else {
>>   			reset_control_assert(rstc);
>> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>> +		}
> you're loosing the return value of regmap_update_bits here, I guess it should
> look like below?
>
> 		if (on) {
> 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);

I search the kernel, Nobody do that return value.
Need we are that?
I will fix it on next patch if it's need.
> 			if (ret < 0) {
> 				pr_err("%s: could not update power domain\n", __func__);
> 		  		reset_control_put(rstc);
> 				return ret;
> 			}
>
> 			reset_control_deassert(rstc);
> 		} else {
> 			reset_control_assert(rstc);
>
> 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> 			if (ret < 0) {
> 				pr_err("%s: could not update power domain\n", __func__);
> 		  		reset_control_put(rstc);
> 				return ret;
> 			}
> 		}
>
>
>>   		reset_control_put(rstc);
>>   	}
>>
>> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
>> -	if (ret < 0) {
>> -		pr_err("%s: could not update power domain\n", __func__);
>> -		return ret;
>> -	}
>> -
>>   	ret = -1;
>>   	while (ret != on) {
>>   		ret = pmu_power_domain_is_on(pd);
> second question - with this patch, what happens actually is
>
>   cpu off:
>       reset_control_assert
>       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
>       wait_for_power_domain_to_turn_off
>   
>   cpu on:
>       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>       reset_control_deassert
>       wait_for_power_domain_to_turn_on
>
> So shouldn't the deassertion of the reset happen after the powerdomain
> sucessfull turned on? Like
>
>   cpu on:
>       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
>       wait_for_power_domain_to_turn_on
>       reset_control_deassert

You are right.
>
>
>

-- 
Thanks,
- Caesar



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] ARM: rockchip: fix the CPU soft reset
  2015-06-05  9:00     ` Caesar Wang
@ 2015-06-05  9:46       ` Heiko Stübner
  0 siblings, 0 replies; 9+ messages in thread
From: Heiko Stübner @ 2015-06-05  9:46 UTC (permalink / raw)
  To: Caesar Wang
  Cc: dianders, Dmitry Torokhov, linux-rockchip, Russell King,
	linux-arm-kernel, linux-kernel

Am Freitag, 5. Juni 2015, 17:00:02 schrieb Caesar Wang:
> Heiko,
> 
> 在 2015年06月05日 16:45, Heiko Stübner 写道:
> > Hi Caesar,
> > 
> > 
> > thanks for investigating this.
> > 
> > Am Freitag, 5. Juni 2015, 12:47:55 schrieb Caesar Wang:
> >> In general, the correct flow is:
> >> 
> >> cpu off:
> >>      reset_control_assert
> >>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> >> 
> >> cpu on:
> >>      regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >>      reset_control_deassert
> >> 
> >> You can repro it with bringing CPU up and down.
> >> Says:(test scripts)
> >> 
> >>      cd /sys/devices/system/cpu/
> >>      for i in $(seq 1000); do
> >>      echo "================= $i ============"
> >>      
> >>          for j in $(seq 100); do
> >>          
> >>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> >> 
> >> cpu3/online)" != "000" ]]; do echo 0 > cpu1/online
> >> 
> >>                  echo 0 > cpu2/online
> >>                  echo 0 > cpu3/online
> >>              
> >>              done
> >>              while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat
> >> 
> >> cpu3/online)" != "111" ]]; do echo 1 > cpu1/online
> >> 
> >>                  echo 1 > cpu2/online
> >>                  echo 1 > cpu3/online
> >>              
> >>              done
> >>          
> >>          done
> >>      
> >>      done
> >> 
> >> The following is reproducile log:
> >> [34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
> >> [34466.186824] Disabling non-boot CPUs ...
> >> [34466.187509] CPU1: shutdown
> >> [34466.188672] CPU2: shutdown
> >> [34473.736627] Kernel panic - not syncing: Watchdog detected hard LOCKUP
> >> on
> >> cpu 0 [34473.736646] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0 #1
> >> [34473.736687] [<c010dfb0>] (unwind_backtrace) from [<c010a460>]
> >> (show_stack+0x20/0x24) [34473.736711] [<c010a460>] (show_stack) from
> >> [<c0630670>] (dump_stack+0x70/0x8c) [34473.736731] [<c0630670>]
> >> (dump_stack) from [<c062fa0c>] (panic+0xa8/0x1fc) [34473.736754]
> >> [<c062fa0c>] (panic) from [<c0191efc>] (watchdog_timer_fn+0x234/0x26c)
> >> [34473.736777] [<c0191efc>] (watchdog_timer_fn) from [<c014103c>]
> >> (__run_hrtimer+0x118/0x1e0) [34473.736797] [<c014103c>] (__run_hrtimer)
> >> from [<c0141c04>] (hrtimer_interrupt+0x148/0x2a0) [34473.736820]
> >> [<c0141c04>] (hrtimer_interrupt) from [<c04d9994>]
> >> (arch_timer_handler_phys+0x38/0x48) [34473.736844] [<c04d9994>]
> >> (arch_timer_handler_phys) from [<c016acc4>]
> >> (handle_percpu_devid_irq+0xb8/0x124) [34473.736867] [<c016acc4>]
> >> (handle_percpu_devid_irq) from [<c0166a30>]
> >> (generic_handle_irq+0x30/0x40)
> >> [34473.736887] [<c0166a30>] (generic_handle_irq) from [<c0166d28>]
> >> (__handle_domain_irq+0x8c/0xb0) [34473.736905] [<c0166d28>]
> >> (__handle_domain_irq) from [<c01003a0>] (gic_handle_irq+0x48/0x6c)
> >> [34473.736922] [<c01003a0>] (gic_handle_irq) from [<c010b040>]
> >> (__irq_svc+0x40/0x50) [34473.736936] Exception stack(0xee127f70 to
> >> 0xee127fb8)
> >> [34473.736948] 7f60:                                     ffffffed
> >> 00000000
> >> 2dd6d000 00000000 [34473.736964] 7f80: ee126000 00000015 c0b46bac
> >> c0b46bac
> >> 0000406a 410fc0d1 00000000 ee127fc4 [34473.736979] 7fa0: ee127fb8
> >> ee127fb8
> >> c0107038 c010703c 600f0013 ffffffff [34473.736995] [<c010b040>]
> >> (__irq_svc)
> >> from [<c010703c>] (arch_cpu_idle+0x40/0x48) [34473.737013] [<c010703c>]
> >> (arch_cpu_idle) from [<c0166978>] (cpu_startup_entry+0x170/0x1d0)
> >> [34473.737031] [<c0166978>] (cpu_startup_entry) from [<c010c254>]
> >> (secondary_start_kernel+0x138/0x160) [34473.737059] [<c010c254>]
> >> (secondary_start_kernel) from [<00100464>] (0x100464) [34474.903740] SMP:
> >> failed to stop secondary CPUs
> >> [34476.099964] SMP: failed to stop secondary CPUs
> >> ...
> >> 
> >> Signed-off-by: Caesar Wang <wxt@rock-chips.com>
> >> ---
> >> 
> >>   arch/arm/mach-rockchip/platsmp.c | 13 +++++--------
> >>   1 file changed, 5 insertions(+), 8 deletions(-)
> >> 
> >> diff --git a/arch/arm/mach-rockchip/platsmp.c
> >> b/arch/arm/mach-rockchip/platsmp.c index 5b4ca3c..1230d3d 100644
> >> --- a/arch/arm/mach-rockchip/platsmp.c
> >> +++ b/arch/arm/mach-rockchip/platsmp.c
> >> @@ -88,20 +88,17 @@ static int pmu_set_power_domain(int pd, bool on)
> >> 
> >>   			return PTR_ERR(rstc);
> >>   		
> >>   		}
> >> 
> >> -		if (on)
> >> +		if (on) {
> >> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> 
> >>   			reset_control_deassert(rstc);
> >> 
> >> -		else
> >> +		} else {
> >> 
> >>   			reset_control_assert(rstc);
> >> 
> >> +			regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> +		}
> > 
> > you're loosing the return value of regmap_update_bits here, I guess it
> > should look like below?
> > 
> > 		if (on) {
> > 		
> > 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> 
> I search the kernel, Nobody do that return value.

counter-example:
http://lxr.free-electrons.com/source/drivers/phy/phy-exynos5250-sata.c#L76
;-)

> Need we are that?
> I will fix it on next patch if it's need.

yes, while not very probable, it is nevertheless still good practice to handle 
it. But when chaning the logic as we observed below, the regmap update part 
can already stay how it is now, as you can simply do something like:

if (!on)
	reset_control_assert(rstc);

regmap_update_bits
wait_for_pd

if (on)
	reset_control_deassert(rstc);


Heiko

> > 			if (ret < 0) {
> > 			
> > 				pr_err("%s: could not update power domain\n", __func__);
> > 				
> > 		  		reset_control_put(rstc);
> > 				
> > 				return ret;
> > 			
> > 			}
> > 			
> > 			reset_control_deassert(rstc);
> > 		
> > 		} else {
> > 		
> > 			reset_control_assert(rstc);
> > 			
> > 			ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> > 			if (ret < 0) {
> > 			
> > 				pr_err("%s: could not update power domain\n", __func__);
> > 				
> > 		  		reset_control_put(rstc);
> > 				
> > 				return ret;
> > 			
> > 			}
> > 		
> > 		}
> > 		
> >>   		reset_control_put(rstc);
> >>   	
> >>   	}
> >> 
> >> -	ret = regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), val);
> >> -	if (ret < 0) {
> >> -		pr_err("%s: could not update power domain\n", __func__);
> >> -		return ret;
> >> -	}
> >> -
> >> 
> >>   	ret = -1;
> >>   	while (ret != on) {
> >>   	
> >>   		ret = pmu_power_domain_is_on(pd);
> > 
> > second question - with this patch, what happens actually is
> > 
> >   cpu off:
> >       reset_control_assert
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
> >       wait_for_power_domain_to_turn_off
> >   
> >   cpu on:
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >       reset_control_deassert
> >       wait_for_power_domain_to_turn_on
> > 
> > So shouldn't the deassertion of the reset happen after the powerdomain
> > sucessfull turned on? Like
> > 
> >   cpu on:
> >       regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
> >       wait_for_power_domain_to_turn_on
> >       reset_control_deassert
> 
> You are right.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-06-05  9:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-05  4:47 [PATCH 0/3] ARM: rockchip: fix the SMP Caesar Wang
2015-06-05  4:47 ` [PATCH 1/3] ARM: rockchip: fix the CPU soft reset Caesar Wang
2015-06-05  8:45   ` Heiko Stübner
2015-06-05  9:00     ` Caesar Wang
2015-06-05  9:46       ` Heiko Stübner
2015-06-05  4:47 ` [PATCH 2/3] ARM: rockchip: ensure CPU to enter WIF state Caesar Wang
2015-06-05  6:32   ` Kever Yang
2015-06-05  6:58     ` Caesar Wang
2015-06-05  4:47 ` [PATCH 3/3] ARM: rockchip: fix the SMP code style Caesar Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).