linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] clocksource/arm_arch_timer: fix a lockdep warning
@ 2018-12-03 19:33 Qian Cai
  2018-12-03 20:07 ` Waiman Long
  0 siblings, 1 reply; 4+ messages in thread
From: Qian Cai @ 2018-12-03 19:33 UTC (permalink / raw)
  To: mark.rutland, marc.zyngier
  Cc: daniel.lezcano, tglx, peterz, mingo, longman, linux-arm-kernel,
	linux-kernel, Qian Cai

Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
warning.

[    0.000000]  lockdep_assert_cpus_held+0x50/0x60
[    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
[    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
[    0.000000]  arch_timer_acpi_init+0x274/0x6ac
[    0.000000]  acpi_table_parse+0x1ac/0x218
[    0.000000]  __acpi_probe_device_table+0x164/0x1ec
[    0.000000]  timer_probe+0x1bc/0x254
[    0.000000]  time_init+0x44/0x98
[    0.000000]  start_kernel+0x4ec/0x7d4

This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
the hotplug lock for _cpuslocked() operations"). Therefore, it will check
if it is really in the CPU hotplug path or not, and work around this
problem by using cpus_read_trylock(). The chance of not getting the read
lock is very small. If that happens, it will report a lockdep warning at
most.

Signed-off-by: Qian Cai <cai@gmx.us>
---
 drivers/clocksource/arm_arch_timer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 9a7d4dc..5c9acbd 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -497,11 +497,20 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
 			per_cpu(timer_unstable_counter_workaround, i) = wa;
 	}
 
+#ifdef CONFIG_HOTPLUG_CPU
+	i = 0;
+
 	/*
 	 * Use the locked version, as we're called from the CPU
 	 * hotplug framework. Otherwise, we end-up in deadlock-land.
 	 */
+	i = cpus_read_trylock();
 	static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
+	if (i)
+		cpus_read_unlock();
+#else
+	static_branch_enable(&arch_timer_read_ool_enabled);
+#endif
 
 	/*
 	 * Don't use the vdso fastpath if errata require using the
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] clocksource/arm_arch_timer: fix a lockdep warning
  2018-12-03 19:33 [PATCH] clocksource/arm_arch_timer: fix a lockdep warning Qian Cai
@ 2018-12-03 20:07 ` Waiman Long
  2018-12-03 20:31   ` Qian Cai
  0 siblings, 1 reply; 4+ messages in thread
From: Waiman Long @ 2018-12-03 20:07 UTC (permalink / raw)
  To: Qian Cai, mark.rutland, marc.zyngier
  Cc: daniel.lezcano, tglx, peterz, mingo, linux-arm-kernel, linux-kernel

On 12/03/2018 02:33 PM, Qian Cai wrote:
> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
> warning.
>
> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
> [    0.000000]  acpi_table_parse+0x1ac/0x218
> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
> [    0.000000]  timer_probe+0x1bc/0x254
> [    0.000000]  time_init+0x44/0x98
> [    0.000000]  start_kernel+0x4ec/0x7d4
>
> This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
> the hotplug lock for _cpuslocked() operations"). Therefore, it will check
> if it is really in the CPU hotplug path or not, and work around this
> problem by using cpus_read_trylock(). The chance of not getting the read
> lock is very small. If that happens, it will report a lockdep warning at
> most.
>
> Signed-off-by: Qian Cai <cai@gmx.us>
> ---
>  drivers/clocksource/arm_arch_timer.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 9a7d4dc..5c9acbd 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -497,11 +497,20 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
>  			per_cpu(timer_unstable_counter_workaround, i) = wa;
>  	}
>  
> +#ifdef CONFIG_HOTPLUG_CPU

If HOTPLUG_CPU isn't defined, all the cpus_lock() and related functions
are just no-op. You don't need to use conditional compilation directive
here.

> +	i = 0;
> +
>  	/*
>  	 * Use the locked version, as we're called from the CPU
>  	 * hotplug framework. Otherwise, we end-up in deadlock-land.
>  	 */

I think the main problem is the above comment may not be true anymore or
is only occasionally true. We need to audit the code to find the root cause.

> +	i = cpus_read_trylock();
>  	static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
> +	if (i)
> +		cpus_read_unlock();

This is not the right way of fixing the lockdep splash.

> +#else
> +	static_branch_enable(&arch_timer_read_ool_enabled);
> +#endif
>  
>  	/*
>  	 * Don't use the vdso fastpath if errata require using the

Cheers,
Longman



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] clocksource/arm_arch_timer: fix a lockdep warning
  2018-12-03 20:07 ` Waiman Long
@ 2018-12-03 20:31   ` Qian Cai
  2018-12-03 21:05     ` Waiman Long
  0 siblings, 1 reply; 4+ messages in thread
From: Qian Cai @ 2018-12-03 20:31 UTC (permalink / raw)
  To: Waiman Long, mark.rutland, marc.zyngier
  Cc: daniel.lezcano, tglx, peterz, mingo, linux-arm-kernel, linux-kernel

On Mon, 2018-12-03 at 15:07 -0500, Waiman Long wrote:
> On 12/03/2018 02:33 PM, Qian Cai wrote:
> > Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
> > warning.
> > 
> > [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
> > [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
> > [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
> > [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
> > [    0.000000]  acpi_table_parse+0x1ac/0x218
> > [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
> > [    0.000000]  timer_probe+0x1bc/0x254
> > [    0.000000]  time_init+0x44/0x98
> > [    0.000000]  start_kernel+0x4ec/0x7d4
> > 
> > This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
> > the hotplug lock for _cpuslocked() operations"). Therefore, it will check
> > if it is really in the CPU hotplug path or not, and work around this
> > problem by using cpus_read_trylock(). The chance of not getting the read
> > lock is very small. If that happens, it will report a lockdep warning at
> > most.
> > 
> > Signed-off-by: Qian Cai <cai@gmx.us>
> > ---
> >  drivers/clocksource/arm_arch_timer.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/clocksource/arm_arch_timer.c
> > b/drivers/clocksource/arm_arch_timer.c
> > index 9a7d4dc..5c9acbd 100644
> > --- a/drivers/clocksource/arm_arch_timer.c
> > +++ b/drivers/clocksource/arm_arch_timer.c
> > @@ -497,11 +497,20 @@ void arch_timer_enable_workaround(const struct
> > arch_timer_erratum_workaround *wa
> >  			per_cpu(timer_unstable_counter_workaround, i) = wa;
> >  	}
> >  
> > +#ifdef CONFIG_HOTPLUG_CPU
> 
> If HOTPLUG_CPU isn't defined, all the cpus_lock() and related functions
> are just no-op. You don't need to use conditional compilation directive
> here.

Make sense.

> 
> > +	i = 0;
> > +
> >  	/*
> >  	 * Use the locked version, as we're called from the CPU
> >  	 * hotplug framework. Otherwise, we end-up in deadlock-land.
> >  	 */
> 
> I think the main problem is the above comment may not be true anymore or
> is only occasionally true. We need to audit the code to find the root cause.

This was a commit introduced in Aug. 2017, 450f9689f294
(clocksource/arm_arch_timer: Use static_branch_enable_cpuslocked()) which
basically drop the cpus_read_lock(). May I ask what changes made you think the
above comment incorrect now?

> 
> > +	i = cpus_read_trylock();
> >  	static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
> > +	if (i)
> > +		cpus_read_unlock();
> 
> This is not the right way of fixing the lockdep splash.
> 

I should had said it is a workaround. I am all-ears for a proper way to fix
this. When the above commit 450f9689f294 was merged, there was no cb538267ea1e
so no lockdep warning.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] clocksource/arm_arch_timer: fix a lockdep warning
  2018-12-03 20:31   ` Qian Cai
@ 2018-12-03 21:05     ` Waiman Long
  0 siblings, 0 replies; 4+ messages in thread
From: Waiman Long @ 2018-12-03 21:05 UTC (permalink / raw)
  To: Qian Cai, mark.rutland, marc.zyngier
  Cc: daniel.lezcano, tglx, peterz, mingo, linux-arm-kernel, linux-kernel

On 12/03/2018 03:31 PM, Qian Cai wrote:
> On Mon, 2018-12-03 at 15:07 -0500, Waiman Long wrote:
>> On 12/03/2018 02:33 PM, Qian Cai wrote:
>>> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
>>> warning.
>>>
>>> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
>>> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
>>> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
>>> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
>>> [    0.000000]  acpi_table_parse+0x1ac/0x218
>>> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
>>> [    0.000000]  timer_probe+0x1bc/0x254
>>> [    0.000000]  time_init+0x44/0x98
>>> [    0.000000]  start_kernel+0x4ec/0x7d4
>>>
>>> This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
>>> the hotplug lock for _cpuslocked() operations"). Therefore, it will check
>>> if it is really in the CPU hotplug path or not, and work around this
>>> problem by using cpus_read_trylock(). The chance of not getting the read
>>> lock is very small. If that happens, it will report a lockdep warning at
>>> most.
>>>
>>> Signed-off-by: Qian Cai <cai@gmx.us>
>>> ---
>>>  drivers/clocksource/arm_arch_timer.c | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/clocksource/arm_arch_timer.c
>>> b/drivers/clocksource/arm_arch_timer.c
>>> index 9a7d4dc..5c9acbd 100644
>>> --- a/drivers/clocksource/arm_arch_timer.c
>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>> @@ -497,11 +497,20 @@ void arch_timer_enable_workaround(const struct
>>> arch_timer_erratum_workaround *wa
>>>  			per_cpu(timer_unstable_counter_workaround, i) = wa;
>>>  	}
>>>  
>>> +#ifdef CONFIG_HOTPLUG_CPU
>> If HOTPLUG_CPU isn't defined, all the cpus_lock() and related functions
>> are just no-op. You don't need to use conditional compilation directive
>> here.
> Make sense.
>
>>> +	i = 0;
>>> +
>>>  	/*
>>>  	 * Use the locked version, as we're called from the CPU
>>>  	 * hotplug framework. Otherwise, we end-up in deadlock-land.
>>>  	 */
>> I think the main problem is the above comment may not be true anymore or
>> is only occasionally true. We need to audit the code to find the root cause.
> This was a commit introduced in Aug. 2017, 450f9689f294
> (clocksource/arm_arch_timer: Use static_branch_enable_cpuslocked()) which
> basically drop the cpus_read_lock(). May I ask what changes made you think the
> above comment incorrect now?
>

If the above comment is true, you won't have the lockdep splat in the
first place. I think the most likely case is that the  function can be
called from both a hotplug path and a non-hotplug path. So the comment
needs to be updated to reflect that.

>>> +	i = cpus_read_trylock();
>>>  	static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
>>> +	if (i)
>>> +		cpus_read_unlock();
>> This is not the right way of fixing the lockdep splash.
>>
> I should had said it is a workaround. I am all-ears for a proper way to fix
> this. When the above commit 450f9689f294 was merged, there was no cb538267ea1e
> so no lockdep warning.

As a workaround, you should better document that in the code. Thinking
about it some more, your workaround seems to be valid. Recursive
cpus_read_lock() is allowed. If it is called from a hotplug path,
cpus_write_lock() should have been taken and cpus_read_trylock() will
failed. Of course, we will have to assume that the current CPU is the
one that has taken the write lock in this case.

Cheers,
Longman




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-12-03 21:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-03 19:33 [PATCH] clocksource/arm_arch_timer: fix a lockdep warning Qian Cai
2018-12-03 20:07 ` Waiman Long
2018-12-03 20:31   ` Qian Cai
2018-12-03 21:05     ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).