* [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
@ 2018-04-24  4:41 Shilpasri G Bhat
  2018-04-24  5:10   ` Stewart Smith
  2018-04-24  6:00 ` Nicholas Piggin
  0 siblings, 2 replies; 8+ messages in thread
From: Shilpasri G Bhat @ 2018-04-24  4:41 UTC (permalink / raw)
  To: rjw, viresh.kumar
  Cc: npiggin, benh, mpe, linux-pm, linuxppc-dev, linux-kernel,
	ppaidipe, svaidy, Shilpasri G Bhat

gpstate_timer_handler() uses synchronous smp_call to set the pstate
on the requested core. This causes the below hard lockup:

[c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
[c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
[c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
[c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
[c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
[c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
[c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
[c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
[c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
[c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
--- interrupt: 901 at doorbell_global_ipi+0x34/0x50
LR = arch_send_call_function_ipi_mask+0x120/0x130
[c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
[c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
[c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
[c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
[c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
[c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
[c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
[c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c

Fix this by using the asynchronous smp_call in the timer interrupt handler.
We don't have to wait in this handler until the pstates are changed on
the core. This change will not have any impact on the global pstate
ramp-down algorithm.

Reported-by: Nicholas Piggin <npiggin@gmail.com>
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 0591874..7e0c752 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
 	spin_unlock(&gpstates->gpstate_lock);
 
 	/* Timer may get migrated to a different cpu on cpu hot unplug */
-	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
+	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
 }
 
 /*
-- 
1.8.3.1
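One plausible reading of the trace above (an interpretation for readers following along, not text from the patch itself): the CPU was already inside a synchronous cross-call when the decrementer interrupt fired, and the timer softirq then issued a second synchronous cross-call, creating a circular wait.

```
CPU A (the trace above)                    CPU B
---------------------------------------    --------------------------------
mprotect()
  pmdp_invalidate()
    smp_call_function_many(..., wait)
      sends IPIs, then spins until all
      target CPUs have run the callback
  <decrementer interrupt on CPU A>
    irq_exit() -> run_timer_softirq()
      gpstate_timer_handler()
        smp_call_function_any(..., wait=1)
          spins until CPU B runs           itself stuck in a synchronous
          set_pstate()                     smp_call path, never services
                                           CPU A's request

Neither CPU can make progress: hard lockup.
```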


* Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
  2018-04-24  4:41 [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt Shilpasri G Bhat
@ 2018-04-24  5:10   ` Stewart Smith
  2018-04-24  6:00 ` Nicholas Piggin
  1 sibling, 0 replies; 8+ messages in thread
From: Stewart Smith @ 2018-04-24  5:10 UTC (permalink / raw)
  To: Shilpasri G Bhat, rjw, viresh.kumar
  Cc: npiggin, benh, mpe, linux-pm, linuxppc-dev, linux-kernel,
	ppaidipe, svaidy, Shilpasri G Bhat

Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> writes:
> gpstate_timer_handler() uses synchronous smp_call to set the pstate
> on the requested core. This causes the below hard lockup:
>
> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
> LR = arch_send_call_function_ipi_mask+0x120/0x130
> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
>
> Fix this by using the asynchronous smp_call in the timer interrupt handler.
> We don't have to wait in this handler until the pstates are changed on
> the core. This change will not have any impact on the global pstate
> ramp-down algorithm.
>
> Reported-by: Nicholas Piggin <npiggin@gmail.com>
> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 0591874..7e0c752 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
>  	spin_unlock(&gpstates->gpstate_lock);
>
>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
>  }

Should this have:
Fixes: eaa2c3aeef83f
and CC stable v4.7+ ?

-- 
Stewart Smith
OPAL Architect, IBM.



* Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
  2018-04-24  5:10   ` Stewart Smith
@ 2018-04-24  5:24   ` Shilpasri G Bhat
  -1 siblings, 0 replies; 8+ messages in thread
From: Shilpasri G Bhat @ 2018-04-24  5:24 UTC (permalink / raw)
  To: Stewart Smith, rjw, viresh.kumar
  Cc: npiggin, benh, mpe, linux-pm, linuxppc-dev, linux-kernel,
	ppaidipe, svaidy

Hi,

On 04/24/2018 10:40 AM, Stewart Smith wrote:
> Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> writes:
>> gpstate_timer_handler() uses synchronous smp_call to set the pstate
>> on the requested core. This causes the below hard lockup:
>>
>> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
>> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
>> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
>> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
>> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
>> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
>> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
>> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
>> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
>> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
>> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
>> LR = arch_send_call_function_ipi_mask+0x120/0x130
>> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
>> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
>> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
>> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
>> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
>> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
>> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
>> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
>>
>> Fix this by using the asynchronous smp_call in the timer interrupt handler.
>> We don't have to wait in this handler until the pstates are changed on
>> the core. This change will not have any impact on the global pstate
>> ramp-down algorithm.
>>
>> Reported-by: Nicholas Piggin <npiggin@gmail.com>
>> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
>> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
>> ---
>>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>> index 0591874..7e0c752 100644
>> --- a/drivers/cpufreq/powernv-cpufreq.c
>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
>>  	spin_unlock(&gpstates->gpstate_lock);
>>
>>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
>> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
>> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
>>  }
> 
> Should this have:
> Fixes: eaa2c3aeef83f
> and CC stable v4.7+ ?
> 

Yeah this is required.

Fixes: eaa2c3aeef83 ("cpufreq: powernv: Ramp-down global pstate slower than local-pstate")

Thanks and Regards,
Shilpa


* Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
  2018-04-24  4:41 [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt Shilpasri G Bhat
  2018-04-24  5:10   ` Stewart Smith
@ 2018-04-24  6:00 ` Nicholas Piggin
  2018-04-24  7:17   ` Shilpasri G Bhat
  1 sibling, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2018-04-24  6:00 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: rjw, viresh.kumar, benh, mpe, linux-pm, linuxppc-dev,
	linux-kernel, ppaidipe, svaidy

On Tue, 24 Apr 2018 10:11:46 +0530
Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> wrote:

> gpstate_timer_handler() uses synchronous smp_call to set the pstate
> on the requested core. This causes the below hard lockup:
> 
> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
> LR = arch_send_call_function_ipi_mask+0x120/0x130
> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
> 
> Fix this by using the asynchronous smp_call in the timer interrupt handler.
> We don't have to wait in this handler until the pstates are changed on
> the core. This change will not have any impact on the global pstate
> ramp-down algorithm.
> 
> Reported-by: Nicholas Piggin <npiggin@gmail.com>
> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 0591874..7e0c752 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
>  	spin_unlock(&gpstates->gpstate_lock);
>  
>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
>  }
>  
>  /*

This can still deadlock because the !wait case still ends up having to wait
if another !wait smp_call_function caller had previously used the
call-single data for this cpu.

If you go this way you would have to use smp_call_function_async, which
is more work.

As a rule it would be better to avoid smp_call_function entirely if
possible. Can you ensure the timer is running on the right CPU? Use
add_timer_on and try again if the timer is on the wrong CPU, perhaps?
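A rough pseudocode sketch of that suggestion (hypothetical and untested; field and function names follow the existing driver, the 1 ms retry interval is an arbitrary placeholder):

```
void gpstate_timer_handler(struct timer_list *t)
{
	...
	spin_unlock(&gpstates->gpstate_lock);

	/* If hot-unplug migrated the timer off the policy's core,
	 * bounce the timer back there instead of sending an IPI. */
	if (!cpumask_test_cpu(raw_smp_processor_id(), policy->cpus)) {
		gpstates->timer.expires = jiffies + msecs_to_jiffies(1);
		add_timer_on(&gpstates->timer, cpumask_any(policy->cpus));
		return;
	}

	/* Otherwise we are already on the right core: no IPI needed. */
	set_pstate(&freq_data);
}
```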

Thanks,
Nick


* Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
  2018-04-24  6:00 ` Nicholas Piggin
@ 2018-04-24  7:17   ` Shilpasri G Bhat
  2018-04-24  7:31     ` Nicholas Piggin
  0 siblings, 1 reply; 8+ messages in thread
From: Shilpasri G Bhat @ 2018-04-24  7:17 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: rjw, viresh.kumar, benh, mpe, linux-pm, linuxppc-dev,
	linux-kernel, ppaidipe, svaidy

Hi,

On 04/24/2018 11:30 AM, Nicholas Piggin wrote:
> On Tue, 24 Apr 2018 10:11:46 +0530
> Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> wrote:
> 
>> gpstate_timer_handler() uses synchronous smp_call to set the pstate
>> on the requested core. This causes the below hard lockup:
>>
>> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
>> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
>> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
>> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
>> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
>> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
>> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
>> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
>> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
>> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
>> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
>> LR = arch_send_call_function_ipi_mask+0x120/0x130
>> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
>> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
>> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
>> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
>> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
>> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
>> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
>> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
>>
>> Fix this by using the asynchronous smp_call in the timer interrupt handler.
>> We don't have to wait in this handler until the pstates are changed on
>> the core. This change will not have any impact on the global pstate
>> ramp-down algorithm.
>>
>> Reported-by: Nicholas Piggin <npiggin@gmail.com>
>> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
>> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
>> ---
>>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>> index 0591874..7e0c752 100644
>> --- a/drivers/cpufreq/powernv-cpufreq.c
>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
>>  	spin_unlock(&gpstates->gpstate_lock);
>>  
>>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
>> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
>> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
>>  }
>>  
>>  /*
> 
> This can still deadlock because !wait case still ends up having to wait
> if another !wait smp_call_function caller had previously used the
> call single data for this cpu.
> 
> If you go this way you would have to use smp_call_function_async, which
> is more work.
> 
> As a rule it would be better to avoid smp_call_function entirely if
> possible. Can you ensure the timer is running on the right CPU? Use
> add_timer_on and try again if the timer is on the wrong CPU, perhaps?
> 

Yeah, that is doable; we can check for the cpu and re-queue it. We will only
ramp down slower in that case, which is no harm.

(If the targeted core turns out to be offline then we will not queue the timer
again as we would have already set the pstate to min in the cpu-down path.)

Thanks and Regards,
Shilpa

> Thanks,
> Nick
> 


* Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
  2018-04-24  7:17   ` Shilpasri G Bhat
@ 2018-04-24  7:31     ` Nicholas Piggin
  2018-04-24 10:47       ` Shilpasri G Bhat
  0 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2018-04-24  7:31 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: rjw, viresh.kumar, benh, mpe, linux-pm, linuxppc-dev,
	linux-kernel, ppaidipe, svaidy

On Tue, 24 Apr 2018 12:47:32 +0530
Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> wrote:

> Hi,
> 
> On 04/24/2018 11:30 AM, Nicholas Piggin wrote:
> > On Tue, 24 Apr 2018 10:11:46 +0530
> > Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> wrote:
> >   
> >> gpstate_timer_handler() uses synchronous smp_call to set the pstate
> >> on the requested core. This causes the below hard lockup:
> >>
> >> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
> >> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
> >> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
> >> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
> >> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
> >> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
> >> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
> >> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
> >> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
> >> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
> >> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
> >> LR = arch_send_call_function_ipi_mask+0x120/0x130
> >> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
> >> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
> >> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
> >> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
> >> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
> >> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
> >> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
> >> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
> >>
> >> Fix this by using the asynchronous smp_call in the timer interrupt handler.
> >> We don't have to wait in this handler until the pstates are changed on
> >> the core. This change will not have any impact on the global pstate
> >> ramp-down algorithm.
> >>
> >> Reported-by: Nicholas Piggin <npiggin@gmail.com>
> >> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
> >> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> >> ---
> >>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> >> index 0591874..7e0c752 100644
> >> --- a/drivers/cpufreq/powernv-cpufreq.c
> >> +++ b/drivers/cpufreq/powernv-cpufreq.c
> >> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
> >>  	spin_unlock(&gpstates->gpstate_lock);
> >>  
> >>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
> >> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
> >> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
> >>  }
> >>  
> >>  /*  
> > 
> > This can still deadlock because !wait case still ends up having to wait
> > if another !wait smp_call_function caller had previously used the
> > call single data for this cpu.
> > 
> > If you go this way you would have to use smp_call_function_async, which
> > is more work.
> > 
> > As a rule it would be better to avoid smp_call_function entirely if
> > possible. Can you ensure the timer is running on the right CPU? Use
> > add_timer_on and try again if the timer is on the wrong CPU, perhaps?
> >   
> 
> Yeah that is doable we can check for the cpu and re-queue it. We will only
> ramp-down slower in that case which is no harm.

Great, I'd be much happier avoiding that IPI. I guess it should happen
quite rarely that we have to queue on a different CPU. I would say just
do add_timer unless we have migrated to the wrong CPU, then do add_timer_on
in that case (it's a bit slower).

> (If the targeted core turns out to be offline then we will not queue the timer
> again as we would have already set the pstate to min in the cpu-down path.)

Something I noticed is that if we can not get the lock (trylock fails),
then the timer does not get queued again. Should it?
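For context, the handler opens with a trylock and simply returns on failure; paraphrased from the driver as it stood at the time of this thread:

```
void gpstate_timer_handler(struct timer_list *t)
{
	struct global_pstate_info *gpstates = from_timer(gpstates, t, timer);
	...
	if (!spin_trylock(&gpstates->gpstate_lock))
		return;		/* lock busy: the timer is NOT re-queued */
	...
}
```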

Thanks,
Nick


* Re: [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt
  2018-04-24  7:31     ` Nicholas Piggin
@ 2018-04-24 10:47       ` Shilpasri G Bhat
  0 siblings, 0 replies; 8+ messages in thread
From: Shilpasri G Bhat @ 2018-04-24 10:47 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: rjw, viresh.kumar, benh, mpe, linux-pm, linuxppc-dev,
	linux-kernel, ppaidipe, svaidy



On 04/24/2018 01:01 PM, Nicholas Piggin wrote:
> On Tue, 24 Apr 2018 12:47:32 +0530
> Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> wrote:
> 
>> Hi,
>>
>> On 04/24/2018 11:30 AM, Nicholas Piggin wrote:
>>> On Tue, 24 Apr 2018 10:11:46 +0530
>>> Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> wrote:
>>>   
>>>> gpstate_timer_handler() uses synchronous smp_call to set the pstate
>>>> on the requested core. This causes the below hard lockup:
>>>>
>>>> [c000003fe566b320] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
>>>> [c000003fe566b390] [c0000000001d55e0] smp_call_function_any+0x180/0x250
>>>> [c000003fe566b3f0] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
>>>> [c000003fe566b4a0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
>>>> [c000003fe566b520] [c0000000001b4958] expire_timers+0x138/0x1f0
>>>> [c000003fe566b590] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
>>>> [c000003fe566b630] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
>>>> [c000003fe566b710] [c000000000114be8] irq_exit+0xe8/0x120
>>>> [c000003fe566b730] [c000000000024d0c] timer_interrupt+0x9c/0xe0
>>>> [c000003fe566b760] [c000000000009014] decrementer_common+0x114/0x120
>>>> --- interrupt: 901 at doorbell_global_ipi+0x34/0x50
>>>> LR = arch_send_call_function_ipi_mask+0x120/0x130
>>>> [c000003fe566ba50] [c00000000004876c] arch_send_call_function_ipi_mask+0x4c/0x130 (unreliable)
>>>> [c000003fe566ba90] [c0000000001d59f0] smp_call_function_many+0x340/0x450
>>>> [c000003fe566bb00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
>>>> [c000003fe566bb30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
>>>> [c000003fe566bba0] [c000000000349278] change_protection_range+0xb88/0xe40
>>>> [c000003fe566bcf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
>>>> [c000003fe566bdb0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
>>>> [c000003fe566be30] [c00000000000b184] system_call+0x58/0x6c
>>>>
>>>> Fix this by using the asynchronous smp_call in the timer interrupt handler.
>>>> We don't have to wait in this handler until the pstates are changed on
>>>> the core. This change will not have any impact on the global pstate
>>>> ramp-down algorithm.
>>>>
>>>> Reported-by: Nicholas Piggin <npiggin@gmail.com>
>>>> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
>>>> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
>>>> ---
>>>>  drivers/cpufreq/powernv-cpufreq.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>>>> index 0591874..7e0c752 100644
>>>> --- a/drivers/cpufreq/powernv-cpufreq.c
>>>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>>>> @@ -721,7 +721,7 @@ void gpstate_timer_handler(struct timer_list *t)
>>>>  	spin_unlock(&gpstates->gpstate_lock);
>>>>  
>>>>  	/* Timer may get migrated to a different cpu on cpu hot unplug */
>>>> -	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1);
>>>> +	smp_call_function_any(policy->cpus, set_pstate, &freq_data, 0);
>>>>  }
>>>>  
>>>>  /*  
>>>
>>> This can still deadlock because !wait case still ends up having to wait
>>> if another !wait smp_call_function caller had previously used the
>>> call single data for this cpu.
>>>
>>> If you go this way you would have to use smp_call_function_async, which
>>> is more work.
>>>
>>> As a rule it would be better to avoid smp_call_function entirely if
>>> possible. Can you ensure the timer is running on the right CPU? Use
>>> add_timer_on and try again if the timer is on the wrong CPU, perhaps?
>>>   
>>
>> Yeah that is doable we can check for the cpu and re-queue it. We will only
>> ramp-down slower in that case which is no harm.
> 
> Great, I'd be much happier avoiding that IPI. I guess it should happen
> quite rarely that we have to queue on a different CPU. I would say just
> do add_timer unless we have migrated to the wrong CPU, then do add_timer_on
> in that case (it's a bit slower).

(The gpstates->timer is initialized with TIMER_PINNED, and there is one timer
per cpufreq policy, i.e. per core.)

We are currently using mod_timer() and this gets triggered in the code-path of
the cpufreq's governor timer which is per-policy (i.e per core in our case).
This ensures the timer is always fired on one of the policy->cpus as the
deferred kworker is also scheduled on one of the policy->cpus.

We were good until commit 7bc54b652f13119f64e87dd96bb792efbfc5a786, after
which we could leave a migrated timer, and subsequent re-queues from the
timer context, on the wrong cpu. For this I agree we need an add_timer_on()
to correct it.

> 
>> (If the targeted core turns out to be offline then we will not queue the timer
>> again as we would have already set the pstate to min in the cpu-down path.)
> 
> Something I noticed is that if we can not get the lock (trylock fails),
> then the timer does not get queued again. Should it?
> 

Since the gpstates->timer is per-core, I am assuming that the trylock should
not fail. (Two expiries racing on the same timer sounds like an unlikely case
to me.)

> Thanks,
> Nick
> 


end of thread, other threads:[~2018-04-24 10:47 UTC | newest]

Thread overview: 8+ messages
2018-04-24  4:41 [PATCH] cpufreq: powernv: Fix the hardlockup by synchronous smp_call in timer interrupt Shilpasri G Bhat
2018-04-24  5:10 ` Stewart Smith
2018-04-24  5:24   ` Shilpasri G Bhat
2018-04-24  6:00 ` Nicholas Piggin
2018-04-24  7:17   ` Shilpasri G Bhat
2018-04-24  7:31     ` Nicholas Piggin
2018-04-24 10:47       ` Shilpasri G Bhat
