linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] validate the delta of cycle_now and cycle_last on arm64
@ 2015-10-27 13:21 Yang Yingliang
  2015-10-27 13:21 ` [PATCH 1/2] clocksource: replace cycle_last validation with an equal way Yang Yingliang
  2015-10-27 13:21 ` [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last Yang Yingliang
  0 siblings, 2 replies; 12+ messages in thread
From: Yang Yingliang @ 2015-10-27 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Yang Yingliang, Thomas Gleixner

On a multi-core system, if the clocks are not perfectly in sync, the
cycle_last recorded by CPU-A can be slightly larger than the cycle_now
read by CPU-B. The resulting negative delta makes hrtimer_update_base()
return a huge, wrong time, and the CPU cannot leave the while loop in
hrtimer_interrupt() until the real time returned by ktime_get() catches
up with the wrong time on the CLOCK_MONOTONIC base.

Fix it by selecting CLOCKSOURCE_VALIDATE_LAST_CYCLE.
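
Without that option, clocksource_delta() is simply (now - last) & mask,
so a read that is one cycle behind cycle_last wraps into an enormous
positive delta. A hedged sketch with made-up cycle values (the 56-bit
mask matches the arm64 architected timer; CLOCKSOURCE_MASK() and
clocksource_delta() are the existing kernel helpers):

	cycle_t mask = CLOCKSOURCE_MASK(56);  /* 0x00ffffffffffffff       */
	cycle_t last = 0x1000;                /* cycle_last from CPU-A    */
	cycle_t now  = 0x0fff;                /* CPU-B reads 1 cycle less */

	cycle_t delta = (now - last) & mask;
	/*
	 * 0x0fff - 0x1000 wraps to 0xffffffffffffffff; masking gives
	 * 0x00ffffffffffffff, i.e. ~2^56 cycles - decades of apparent
	 * elapsed time at a 50 MHz counter frequency.
	 */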

Cc: Thomas Gleixner <tglx@linutronix.de>

Yang Yingliang (2):
  clocksource: replace cycle_last validation with an equal way
  arm64: validate the cycle_last to prevent time going backwards

 arch/arm64/Kconfig                 | 1 +
 kernel/time/timekeeping_internal.h | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

-- 
2.5.0



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] clocksource: replace cycle_last validation with an equal way
  2015-10-27 13:21 [PATCH 0/2] validate the delta of cycle_now and cycle_last on arm64 Yang Yingliang
@ 2015-10-27 13:21 ` Yang Yingliang
  2015-10-30 14:56   ` Thomas Gleixner
  2015-10-27 13:21 ` [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last Yang Yingliang
  1 sibling, 1 reply; 12+ messages in thread
From: Yang Yingliang @ 2015-10-27 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Yang Yingliang, Thomas Gleixner

Mask the cycle values before the subtraction so that this validation
can also be used when the clocksource mask is narrower than 64 bits.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timekeeping_internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index 4ea005a..984f02e 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -15,7 +15,7 @@ extern void tk_debug_account_sleep_time(struct timespec64 *t);
 #ifdef CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE
 static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
 {
-	cycle_t ret = (now - last) & mask;
+	cycle_t ret = (now & mask) - (last & mask);
 
 	return (s64) ret > 0 ? ret : 0;
 }
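
For context, a hedged sketch (illustrative values only, 56-bit mask as
on the arm64 architected timer) of why the existing check cannot catch
the race this series is about, while the masked subtraction can:

	cycle_t mask = CLOCKSOURCE_MASK(56);        /* 0x00ffffffffffffff */
	cycle_t last = 0x1000, now = 0x0fff;        /* now 1 cycle behind */

	cycle_t old_ret = (now - last) & mask;      /* 0x00ffffffffffffff */
	/* (s64) old_ret is a huge positive value: the masking has already
	 * hidden the sign, so "(s64) ret > 0" does not clamp it.         */

	cycle_t new_ret = (now & mask) - (last & mask);
	/* new_ret = 0xffffffffffffffff, (s64) new_ret is -1, so the
	 * patched check clamps the delta to 0 instead.                   */

(As Thomas points out further down the thread, this form in turn breaks
for genuine wraparounds of counters narrower than 64 bits.)
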
-- 
2.5.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last
  2015-10-27 13:21 [PATCH 0/2] validate the delta of cycle_now and cycle_last on arm64 Yang Yingliang
  2015-10-27 13:21 ` [PATCH 1/2] clocksource: replace cycle_last validation with an equal way Yang Yingliang
@ 2015-10-27 13:21 ` Yang Yingliang
  2015-10-27 14:03   ` Mark Rutland
  1 sibling, 1 reply; 12+ messages in thread
From: Yang Yingliang @ 2015-10-27 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel; +Cc: Yang Yingliang, Thomas Gleixner

On a multi-core system, if the clocks are not perfectly in sync, the
cycle_last recorded by CPU-A can be slightly larger than the cycle_now
read by CPU-B. The resulting negative delta makes hrtimer_update_base()
return a huge, wrong time, and the CPU cannot leave the while loop in
hrtimer_interrupt() until the real time returned by ktime_get() catches
up with the wrong time on the CLOCK_MONOTONIC base.

I was able to reproduce the problem by calling clock_settime and
clock_adjtime repeatedly on each CPU with random parameters. Here is
the call trace:

Jan 01 00:02:29 Linux kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Jan 01 00:02:29 Linux kernel:         0: (2 GPs behind) idle=913/1/0 softirq=59289/59291 fqs=488
Jan 01 00:02:29 Linux kernel:         (detected by 20, t=5252 jiffies, g=35769, c=35768, q=567)
Jan 01 00:02:29 Linux kernel: Task dump for CPU 0:
Jan 01 00:02:29 Linux kernel: swapper/0       R  running task        0   0      0 0x00000002
Jan 01 00:02:29 Linux kernel: Call trace:
Jan 01 00:02:29 Linux kernel: [<ffffffc000086c5c>] __switch_to+0x74/0x8c
Jan 01 00:02:29 Linux kernel: rcu_sched kthread starved for 4764 jiffies!
Jan 01 00:03:32 Linux kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
Jan 01 00:03:32 Linux kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.6+ #184
Jan 01 00:03:32 Linux kernel: task: ffffffc00091cdf0 ti: ffffffc000910000 task.ti: ffffffc000910000
Jan 01 00:03:32 Linux kernel: PC is at arch_cpu_idle+0x10/0x18
Jan 01 00:03:32 Linux kernel: LR is at arch_cpu_idle+0xc/0x18
Jan 01 00:03:32 Linux kernel: pc : [<ffffffc00008686c>] lr : [<ffffffc000086868>] pstate: 60000145
Jan 01 00:03:32 Linux kernel: sp : ffffffc000913f20
Jan 01 00:03:32 Linux kernel: x29: ffffffc000913f20 x28: 000000003f4bbab0
Jan 01 00:03:32 Linux kernel: x27: ffffffc00091669c x26: ffffffc00096e000
Jan 01 00:03:32 Linux kernel: x25: ffffffc000804000 x24: ffffffc000913f30
Jan 01 00:03:32 Linux kernel: x23: 0000000000000001 x22: ffffffc0006817f8
Jan 01 00:03:32 Linux kernel: x21: ffffffc0008fdb00 x20: ffffffc000916618
Jan 01 00:03:32 Linux kernel: x19: ffffffc000910000 x18: 00000000ffffffff
Jan 01 00:03:32 Linux kernel: x17: 0000007f9d6f682c x16: ffffffc0001e19d0
Jan 01 00:03:32 Linux kernel: x15: 0000000000000061 x14: 0000000000000072
Jan 01 00:03:32 Linux kernel: x13: 0000000000000067 x12: ffffffc000682528
Jan 01 00:03:32 Linux kernel: x11: 0000000000000005 x10: 00000001000faf9a
Jan 01 00:03:32 Linux kernel: x9 : ffffffc000913e60 x8 : ffffffc00091d350
Jan 01 00:03:32 Linux kernel: x7 : 0000000000000000 x6 : 002b24c4f00026aa
Jan 01 00:03:32 Linux kernel: x5 : 0000001ffd5c6000 x4 : ffffffc000913ea0
Jan 01 00:03:32 Linux kernel: x3 : ffffffdffdec3b44 x2 : ffffffdffdec3b44
Jan 01 00:03:32 Linux kernel: x1 : 0000000000000000 x0 : 0000000000000000

CPU-A updates cycle_last in do_settimeofday64() under the timekeeper
lock; CPU-B, whose counter is slightly behind CPU-A's, reads the current
cycles after the lock is dropped and subtracts cycle_last. The negative
result, after masking, becomes an extremely large value and leads to the
"hang" in hrtimer_interrupt().

Multi-core x86 systems have already hit this problem, and Thomas
introduced a fix in commit 47001d603375 ("x86: tsc prevent time going
backwards"). The fix was later moved into the timekeeping core code in
commit 09ec54429c6d ("clocksource: Move cycle_last validation to core
code"), where the validation can be enabled via the config option
CLOCKSOURCE_VALIDATE_LAST_CYCLE. I think we can fix the problem on arm64
by selecting that option. It has no side effect on systems whose
counters run properly.
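
For reference, selecting that option switches clocksource_delta() in
kernel/time/timekeeping_internal.h to the validating variant that
patch 1 of this series touches; as it stands in the tree today it clamps
an apparently negative delta to zero instead of letting it wrap:

	static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
	{
		cycle_t ret = (now - last) & mask;

		/* A negative (s64) delta means "now" is behind "last":
		 * treat it as no time having elapsed.                  */
		return (s64) ret > 0 ? ret : 0;
	}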

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 07d1811..6a53926 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -30,6 +30,7 @@ config ARM64
 	select GENERIC_ALLOCATOR
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_CLOCKEVENTS_BROADCAST
+	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
 	select GENERIC_CPU_AUTOPROBE
 	select GENERIC_EARLY_IOREMAP
 	select GENERIC_IDLE_POLL_SETUP
-- 
2.5.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last
  2015-10-27 13:21 ` [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last Yang Yingliang
@ 2015-10-27 14:03   ` Mark Rutland
  2015-10-28  1:33     ` Ding Tianhong
  2015-10-29  7:36     ` Yang Yingliang
  0 siblings, 2 replies; 12+ messages in thread
From: Mark Rutland @ 2015-10-27 14:03 UTC (permalink / raw)
  To: Yang Yingliang
  Cc: linux-arm-kernel, linux-kernel, Thomas Gleixner, marc.zyngier,
	will.deacon, catalin.marinas

On Tue, Oct 27, 2015 at 09:21:13PM +0800, Yang Yingliang wrote:
> On a multi-core system, if the clocks are not perfectly in sync, the
> cycle_last recorded by CPU-A can be slightly larger than the cycle_now
> read by CPU-B.

If that is happening, that sounds like a hardware and/or firmware bug.

The ARM ARM states the following (where a CPU is a device):

	The system counter must provide a uniform view of system time. More
	precisely, it must be impossible for the following sequence of events
	to show system time going backwards:

	1. Device A reads the time from the system counter.

	2. Device A communicates with another agent in the system, Device B.

	3. After recognizing the communication from Device A, Device B reads
	   the time from the system counter.

Per [1] it seems like the TSC is not architected to provide this guarantee for
x86.

Are you certain that the CPUs have clocks which are not in sync?

> The resulting negative delta makes hrtimer_update_base()
> return a huge, wrong time, and the CPU cannot leave the while loop in
> hrtimer_interrupt() until the real time returned by ktime_get() catches
> up with the wrong time on the CLOCK_MONOTONIC base.
> 
> I was able to reproduce the problem by calling clock_settime and
> clock_adjtime repeatedly on each CPU with random parameters.

Could you share your reproducer?

How long does it take to trigger the issue?

> Here is the calltrace:
>
> Jan 01 00:02:29 Linux kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
> Jan 01 00:02:29 Linux kernel:         0: (2 GPs behind) idle=913/1/0 softirq=59289/59291 fqs=488
> Jan 01 00:02:29 Linux kernel:         (detected by 20, t=5252 jiffies, g=35769, c=35768, q=567)
> Jan 01 00:02:29 Linux kernel: Task dump for CPU 0:
> Jan 01 00:02:29 Linux kernel: swapper/0       R  running task        0   0      0 0x00000002
> Jan 01 00:02:29 Linux kernel: Call trace:
> Jan 01 00:02:29 Linux kernel: [<ffffffc000086c5c>] __switch_to+0x74/0x8c
> Jan 01 00:02:29 Linux kernel: rcu_sched kthread starved for 4764 jiffies!
> Jan 01 00:03:32 Linux kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
> Jan 01 00:03:32 Linux kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.6+ #184
> Jan 01 00:03:32 Linux kernel: task: ffffffc00091cdf0 ti: ffffffc000910000 task.ti: ffffffc000910000
> Jan 01 00:03:32 Linux kernel: PC is at arch_cpu_idle+0x10/0x18
> Jan 01 00:03:32 Linux kernel: LR is at arch_cpu_idle+0xc/0x18
> Jan 01 00:03:32 Linux kernel: pc : [<ffffffc00008686c>] lr : [<ffffffc000086868>] pstate: 60000145
> Jan 01 00:03:32 Linux kernel: sp : ffffffc000913f20
> Jan 01 00:03:32 Linux kernel: x29: ffffffc000913f20 x28: 000000003f4bbab0
> Jan 01 00:03:32 Linux kernel: x27: ffffffc00091669c x26: ffffffc00096e000
> Jan 01 00:03:32 Linux kernel: x25: ffffffc000804000 x24: ffffffc000913f30
> Jan 01 00:03:32 Linux kernel: x23: 0000000000000001 x22: ffffffc0006817f8
> Jan 01 00:03:32 Linux kernel: x21: ffffffc0008fdb00 x20: ffffffc000916618
> Jan 01 00:03:32 Linux kernel: x19: ffffffc000910000 x18: 00000000ffffffff
> Jan 01 00:03:32 Linux kernel: x17: 0000007f9d6f682c x16: ffffffc0001e19d0
> Jan 01 00:03:32 Linux kernel: x15: 0000000000000061 x14: 0000000000000072
> Jan 01 00:03:32 Linux kernel: x13: 0000000000000067 x12: ffffffc000682528
> Jan 01 00:03:32 Linux kernel: x11: 0000000000000005 x10: 00000001000faf9a
> Jan 01 00:03:32 Linux kernel: x9 : ffffffc000913e60 x8 : ffffffc00091d350
> Jan 01 00:03:32 Linux kernel: x7 : 0000000000000000 x6 : 002b24c4f00026aa
> Jan 01 00:03:32 Linux kernel: x5 : 0000001ffd5c6000 x4 : ffffffc000913ea0
> Jan 01 00:03:32 Linux kernel: x3 : ffffffdffdec3b44 x2 : ffffffdffdec3b44
> Jan 01 00:03:32 Linux kernel: x1 : 0000000000000000 x0 : 0000000000000000

In this case was CNTVOFF uniform on all CPUs?

Was the kernel booted at EL2 or EL1N? Was it booted under a hypervisor?

> CPU-A updates cycle_last in do_settimeofday64() under the timekeeper
> lock; CPU-B, whose counter is slightly behind CPU-A's, reads the current
> cycles after the lock is dropped and subtracts cycle_last. The negative
> result, after masking, becomes an extremely large value and leads to the
> "hang" in hrtimer_interrupt().

It's possible that we have missing ISBs or DSBs somewhere, and we're
encountering a re-ordering that we did not expect.

It would be very helpful to be able to analyse this with your reproducer.
I have a kernel test in a local branch with which I've never managed to
trigger a failure other than on systems with a horrifically skewed CNTVOFF.

> Multi-core x86 systems have already hit this problem, and Thomas
> introduced a fix in commit 47001d603375 ("x86: tsc prevent time going
> backwards"). The fix was later moved into the timekeeping core code in
> commit 09ec54429c6d ("clocksource: Move cycle_last validation to core
> code"), where the validation can be enabled via the config option
> CLOCKSOURCE_VALIDATE_LAST_CYCLE. I think we can fix the problem on arm64
> by selecting that option. It has no side effect on systems whose
> counters run properly.

As above, per [1], I'm not sure that the same rationale applies, and
it's somewhat worrying to mask the issue in this manner.

Thanks,
Mark.

[1] https://lkml.org/lkml/2007/8/23/96

> 
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/arm64/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 07d1811..6a53926 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -30,6 +30,7 @@ config ARM64
>  	select GENERIC_ALLOCATOR
>  	select GENERIC_CLOCKEVENTS
>  	select GENERIC_CLOCKEVENTS_BROADCAST
> +	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
>  	select GENERIC_CPU_AUTOPROBE
>  	select GENERIC_EARLY_IOREMAP
>  	select GENERIC_IDLE_POLL_SETUP
> -- 
> 2.5.0
> 
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last
  2015-10-27 14:03   ` Mark Rutland
@ 2015-10-28  1:33     ` Ding Tianhong
  2015-10-29  7:36     ` Yang Yingliang
  1 sibling, 0 replies; 12+ messages in thread
From: Ding Tianhong @ 2015-10-28  1:33 UTC (permalink / raw)
  To: Mark Rutland, Yang Yingliang
  Cc: marc.zyngier, catalin.marinas, will.deacon, linux-kernel,
	Thomas Gleixner, linux-arm-kernel

On 2015/10/27 22:03, Mark Rutland wrote:
> On Tue, Oct 27, 2015 at 09:21:13PM +0800, Yang Yingliang wrote:
>> On a multi-core system, if the clocks are not perfectly in sync, the
>> cycle_last recorded by CPU-A can be slightly larger than the cycle_now
>> read by CPU-B.
> 
> If that is happening, that sounds like a hardware and/or firmware bug.
> 
> The ARM ARM states the following (where a CPU is a device):
> 
> 	The system counter must provide a uniform view of system time. More
> 	precisely, it must be impossible for the following sequence of events
> 	to show system time going backwards:
> 
> 	1. Device A reads the time from the system counter.
> 
> 	2. Device A communicates with another agent in the system, Device B.
> 
> 	3. After recognizing the communication from Device A, Device B reads
> 	   the time from the system counter.
> 
> Per [1] it seems like the TSC is not architected to provide this guarantee for
> x86.
> 
> Are you certain that the CPUs have clocks which are not in sync?
> 
Hi Mark:

We have checked this with the chip developers; it is a hardware design
problem, because there are so many cores in the chip, and we cannot be
sure that chips from every vendor avoid it completely. x86 still has
this problem too, so I think this patch is needed for arm64 to stay
compatible with the different chips.

Ding

>> The resulting negative delta makes hrtimer_update_base()
>> return a huge, wrong time, and the CPU cannot leave the while loop in
>> hrtimer_interrupt() until the real time returned by ktime_get() catches
>> up with the wrong time on the CLOCK_MONOTONIC base.
>>
>> I was able to reproduce the problem by calling clock_settime and
>> clock_adjtime repeatedly on each CPU with random parameters.
> 
> Could you share your reproducer?
> 
> How long does it take to trigger the issue?
> 
>> Here is the calltrace:
>>
>> Jan 01 00:02:29 Linux kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
>> Jan 01 00:02:29 Linux kernel:         0: (2 GPs behind) idle=913/1/0 softirq=59289/59291 fqs=488
>> Jan 01 00:02:29 Linux kernel:         (detected by 20, t=5252 jiffies, g=35769, c=35768, q=567)
>> Jan 01 00:02:29 Linux kernel: Task dump for CPU 0:
>> Jan 01 00:02:29 Linux kernel: swapper/0       R  running task        0   0      0 0x00000002
>> Jan 01 00:02:29 Linux kernel: Call trace:
>> Jan 01 00:02:29 Linux kernel: [<ffffffc000086c5c>] __switch_to+0x74/0x8c
>> Jan 01 00:02:29 Linux kernel: rcu_sched kthread starved for 4764 jiffies!
>> Jan 01 00:03:32 Linux kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
>> Jan 01 00:03:32 Linux kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.6+ #184
>> Jan 01 00:03:32 Linux kernel: task: ffffffc00091cdf0 ti: ffffffc000910000 task.ti: ffffffc000910000
>> Jan 01 00:03:32 Linux kernel: PC is at arch_cpu_idle+0x10/0x18
>> Jan 01 00:03:32 Linux kernel: LR is at arch_cpu_idle+0xc/0x18
>> Jan 01 00:03:32 Linux kernel: pc : [<ffffffc00008686c>] lr : [<ffffffc000086868>] pstate: 60000145
>> Jan 01 00:03:32 Linux kernel: sp : ffffffc000913f20
>> Jan 01 00:03:32 Linux kernel: x29: ffffffc000913f20 x28: 000000003f4bbab0
>> Jan 01 00:03:32 Linux kernel: x27: ffffffc00091669c x26: ffffffc00096e000
>> Jan 01 00:03:32 Linux kernel: x25: ffffffc000804000 x24: ffffffc000913f30
>> Jan 01 00:03:32 Linux kernel: x23: 0000000000000001 x22: ffffffc0006817f8
>> Jan 01 00:03:32 Linux kernel: x21: ffffffc0008fdb00 x20: ffffffc000916618
>> Jan 01 00:03:32 Linux kernel: x19: ffffffc000910000 x18: 00000000ffffffff
>> Jan 01 00:03:32 Linux kernel: x17: 0000007f9d6f682c x16: ffffffc0001e19d0
>> Jan 01 00:03:32 Linux kernel: x15: 0000000000000061 x14: 0000000000000072
>> Jan 01 00:03:32 Linux kernel: x13: 0000000000000067 x12: ffffffc000682528
>> Jan 01 00:03:32 Linux kernel: x11: 0000000000000005 x10: 00000001000faf9a
>> Jan 01 00:03:32 Linux kernel: x9 : ffffffc000913e60 x8 : ffffffc00091d350
>> Jan 01 00:03:32 Linux kernel: x7 : 0000000000000000 x6 : 002b24c4f00026aa
>> Jan 01 00:03:32 Linux kernel: x5 : 0000001ffd5c6000 x4 : ffffffc000913ea0
>> Jan 01 00:03:32 Linux kernel: x3 : ffffffdffdec3b44 x2 : ffffffdffdec3b44
>> Jan 01 00:03:32 Linux kernel: x1 : 0000000000000000 x0 : 0000000000000000
> 
> In this case was CNTVOFF uniform on all CPUs?
> 
> Was the kernel booted at EL2 or EL1N? Was it booted under a hypervisor?
> 
>> CPU-A updates cycle_last in do_settimeofday64() under the timekeeper
>> lock; CPU-B, whose counter is slightly behind CPU-A's, reads the current
>> cycles after the lock is dropped and subtracts cycle_last. The negative
>> result, after masking, becomes an extremely large value and leads to the
>> "hang" in hrtimer_interrupt().
> 
> It's possible that we have missing ISBs or DSBs somewhere, and we're
> encountering a re-ordering that we did not expect.
> 
> It would be very helpful to be able to analyse this with your reproducer.
> I have a kernel test in a local branch with which I've never managed to
> trigger a failure other than on systems with a horrifically skewed CNTVOFF.
> 
>> Multi-core x86 systems have already hit this problem, and Thomas
>> introduced a fix in commit 47001d603375 ("x86: tsc prevent time going
>> backwards"). The fix was later moved into the timekeeping core code in
>> commit 09ec54429c6d ("clocksource: Move cycle_last validation to core
>> code"), where the validation can be enabled via the config option
>> CLOCKSOURCE_VALIDATE_LAST_CYCLE. I think we can fix the problem on arm64
>> by selecting that option. It has no side effect on systems whose
>> counters run properly.
> 
> As above, per [1], I'm not sure that the same rationale applies, and
> it's somewhat worrying to mask the issue in this manner.
> 
> Thanks,
> Mark.
> 
> [1] https://lkml.org/lkml/2007/8/23/96
> 
>>
>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  arch/arm64/Kconfig | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 07d1811..6a53926 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -30,6 +30,7 @@ config ARM64
>>  	select GENERIC_ALLOCATOR
>>  	select GENERIC_CLOCKEVENTS
>>  	select GENERIC_CLOCKEVENTS_BROADCAST
>> +	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
>>  	select GENERIC_CPU_AUTOPROBE
>>  	select GENERIC_EARLY_IOREMAP
>>  	select GENERIC_IDLE_POLL_SETUP
>> -- 
>> 2.5.0
>>
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> .
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last
  2015-10-27 14:03   ` Mark Rutland
  2015-10-28  1:33     ` Ding Tianhong
@ 2015-10-29  7:36     ` Yang Yingliang
  2015-10-29  8:31       ` Yang Yingliang
  1 sibling, 1 reply; 12+ messages in thread
From: Yang Yingliang @ 2015-10-29  7:36 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, Thomas Gleixner, marc.zyngier,
	will.deacon, catalin.marinas, Hanjun Guo


On 2015/10/27 22:03, Mark Rutland wrote:
> On Tue, Oct 27, 2015 at 09:21:13PM +0800, Yang Yingliang wrote:
>> On a multi-core system, if the clocks are not perfectly in sync, the
>> cycle_last recorded by CPU-A can be slightly larger than the cycle_now
>> read by CPU-B.
>
> If that is happening, that sounds like a hardware and/or firmware bug.
>
> The ARM ARM states the following (where a CPU is a device):
>
> 	The system counter must provide a uniform view of system time. More
> 	precisely, it must be impossible for the following sequence of events
> 	to show system time going backwards:
>
> 	1. Device A reads the time from the system counter.
>
> 	2. Device A communicates with another agent in the system, Device B.
>
> 	3. After recognizing the communication from Device A, Device B reads
> 	   the time from the system counter.
>
> Per [1] it seems like the TSC is not architected to provide this guarantee for
> x86.
>
> Are you certain that the CPUs have clocks which are not in sync?
>
>> The resulting negative delta makes hrtimer_update_base()
>> return a huge, wrong time, and the CPU cannot leave the while loop in
>> hrtimer_interrupt() until the real time returned by ktime_get() catches
>> up with the wrong time on the CLOCK_MONOTONIC base.
>>
>> I was able to reproduce the problem by calling clock_settime and
>> clock_adjtime repeatedly on each CPU with random parameters.
>
> Could you share your reproducer?

https://github.com/kernelslacker/trinity

I used this test suite and made it call clock_settime and
clock_adjtime alternately with random parameters on each core.
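
Not the actual trinity configuration, but a minimal sketch of the kind
of stress loop meant here (one instance pinned per CPU, run as root; all
parameter ranges are made up, and it may need _GNU_SOURCE / -lrt
depending on the libc):

	#include <sched.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/timex.h>
	#include <time.h>

	int main(int argc, char **argv)
	{
		int cpu = argc > 1 ? atoi(argv[1]) : 0;
		cpu_set_t set;
		struct timespec ts;
		struct timex tx;

		/* Pin this instance to one CPU; start one per core. */
		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		sched_setaffinity(0, sizeof(set), &set);

		for (;;) {
			/* Jump the wall clock to a random time ... */
			ts.tv_sec = rand() % 100000;
			ts.tv_nsec = rand() % 1000000000;
			clock_settime(CLOCK_REALTIME, &ts);

			/* ... and apply a random adjtimex-style offset. */
			memset(&tx, 0, sizeof(tx));
			tx.modes = ADJ_OFFSET;
			tx.offset = rand() % 500000 - 250000;
			clock_adjtime(CLOCK_REALTIME, &tx);
		}
	}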

>
> How long does it take to trigger the issue?

It's not certain. It can take half an hour, several hours, or even
longer.

>
>> Here is the calltrace:
>>
>> Jan 01 00:02:29 Linux kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
>> Jan 01 00:02:29 Linux kernel:         0: (2 GPs behind) idle=913/1/0 softirq=59289/59291 fqs=488
>> Jan 01 00:02:29 Linux kernel:         (detected by 20, t=5252 jiffies, g=35769, c=35768, q=567)
>> Jan 01 00:02:29 Linux kernel: Task dump for CPU 0:
>> Jan 01 00:02:29 Linux kernel: swapper/0       R  running task        0   0      0 0x00000002
>> Jan 01 00:02:29 Linux kernel: Call trace:
>> Jan 01 00:02:29 Linux kernel: [<ffffffc000086c5c>] __switch_to+0x74/0x8c
>> Jan 01 00:02:29 Linux kernel: rcu_sched kthread starved for 4764 jiffies!
>> Jan 01 00:03:32 Linux kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
>> Jan 01 00:03:32 Linux kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.6+ #184
>> Jan 01 00:03:32 Linux kernel: task: ffffffc00091cdf0 ti: ffffffc000910000 task.ti: ffffffc000910000
>> Jan 01 00:03:32 Linux kernel: PC is at arch_cpu_idle+0x10/0x18
>> Jan 01 00:03:32 Linux kernel: LR is at arch_cpu_idle+0xc/0x18
>> Jan 01 00:03:32 Linux kernel: pc : [<ffffffc00008686c>] lr : [<ffffffc000086868>] pstate: 60000145
>> Jan 01 00:03:32 Linux kernel: sp : ffffffc000913f20
>> Jan 01 00:03:32 Linux kernel: x29: ffffffc000913f20 x28: 000000003f4bbab0
>> Jan 01 00:03:32 Linux kernel: x27: ffffffc00091669c x26: ffffffc00096e000
>> Jan 01 00:03:32 Linux kernel: x25: ffffffc000804000 x24: ffffffc000913f30
>> Jan 01 00:03:32 Linux kernel: x23: 0000000000000001 x22: ffffffc0006817f8
>> Jan 01 00:03:32 Linux kernel: x21: ffffffc0008fdb00 x20: ffffffc000916618
>> Jan 01 00:03:32 Linux kernel: x19: ffffffc000910000 x18: 00000000ffffffff
>> Jan 01 00:03:32 Linux kernel: x17: 0000007f9d6f682c x16: ffffffc0001e19d0
>> Jan 01 00:03:32 Linux kernel: x15: 0000000000000061 x14: 0000000000000072
>> Jan 01 00:03:32 Linux kernel: x13: 0000000000000067 x12: ffffffc000682528
>> Jan 01 00:03:32 Linux kernel: x11: 0000000000000005 x10: 00000001000faf9a
>> Jan 01 00:03:32 Linux kernel: x9 : ffffffc000913e60 x8 : ffffffc00091d350
>> Jan 01 00:03:32 Linux kernel: x7 : 0000000000000000 x6 : 002b24c4f00026aa
>> Jan 01 00:03:32 Linux kernel: x5 : 0000001ffd5c6000 x4 : ffffffc000913ea0
>> Jan 01 00:03:32 Linux kernel: x3 : ffffffdffdec3b44 x2 : ffffffdffdec3b44
>> Jan 01 00:03:32 Linux kernel: x1 : 0000000000000000 x0 : 0000000000000000
>
> In this case was CNTVOFF uniform on all CPUs?
>
> Was the kernel booted at EL2 or EL1N? Was it booted under a hypervisor?

At EL1N without a hypervisor.

>
>> CPU-A updates cycle_last in do_settimeofday64() under the timekeeper
>> lock; CPU-B, whose counter is slightly behind CPU-A's, reads the current
>> cycles after the lock is dropped and subtracts cycle_last. The negative
>> result, after masking, becomes an extremely large value and leads to the
>> "hang" in hrtimer_interrupt().
>
> It's possible that we have missing ISBs or DSBs somewhere, and we're
> encountering a re-ordering that we did not expect.
>
> It would be very helpful to be able to analyse this with your reproducer.
> I have a kernel test in a local branch with which I've never managed to
> trigger a failure other than on systems with a horrifically skewed CNTVOFF.
>
>> Multi-core x86 systems have already hit this problem, and Thomas
>> introduced a fix in commit 47001d603375 ("x86: tsc prevent time going
>> backwards"). The fix was later moved into the timekeeping core code in
>> commit 09ec54429c6d ("clocksource: Move cycle_last validation to core
>> code"), where the validation can be enabled via the config option
>> CLOCKSOURCE_VALIDATE_LAST_CYCLE. I think we can fix the problem on arm64
>> by selecting that option. It has no side effect on systems whose
>> counters run properly.
>
> As above, per [1], I'm not sure that the same rationale applies, and
> it's somewhat worrying to mask the issue in this manner.
>
> Thanks,
> Mark.
>
> [1] https://lkml.org/lkml/2007/8/23/96
>
>>
>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>   arch/arm64/Kconfig | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 07d1811..6a53926 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -30,6 +30,7 @@ config ARM64
>>   	select GENERIC_ALLOCATOR
>>   	select GENERIC_CLOCKEVENTS
>>   	select GENERIC_CLOCKEVENTS_BROADCAST
>> +	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
>>   	select GENERIC_CPU_AUTOPROBE
>>   	select GENERIC_EARLY_IOREMAP
>>   	select GENERIC_IDLE_POLL_SETUP
>> --
>> 2.5.0
>>
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>
> .
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last
  2015-10-29  7:36     ` Yang Yingliang
@ 2015-10-29  8:31       ` Yang Yingliang
  0 siblings, 0 replies; 12+ messages in thread
From: Yang Yingliang @ 2015-10-29  8:31 UTC (permalink / raw)
  To: Mark Rutland
  Cc: marc.zyngier, catalin.marinas, will.deacon, linux-kernel,
	Hanjun Guo, Thomas Gleixner, linux-arm-kernel


On 2015/10/29 15:36, Yang Yingliang wrote:
>
> On 2015/10/27 22:03, Mark Rutland wrote:
>> On Tue, Oct 27, 2015 at 09:21:13PM +0800, Yang Yingliang wrote:
>>> On a multi-core system, if the clocks are not perfectly in sync, the
>>> cycle_last recorded by CPU-A can be slightly larger than the cycle_now
>>> read by CPU-B.
>>
>> If that is happening, that sounds like a hardware and/or firmware bug.
>>
>> The ARM ARM states the following (where a CPU is a device):
>>
>>     The system counter must provide a uniform view of system time. More
>>     precisely, it must be impossible for the following sequence of events
>>     to show system time going backwards:
>>
>>     1. Device A reads the time from the system counter.
>>
>>     2. Device A communicates with another agent in the system, Device B.
>>
>>     3. After recognizing the communication from Device A, Device B reads
>>        the time from the system counter.
>>
>> Per [1] it seems like the TSC is not architected to provide this
>> guarantee for
>> x86.
>>
>> Are you certain that the CPUs have clocks which are not in sync?
>>
>>> The resulting negative delta makes hrtimer_update_base()
>>> return a huge, wrong time, and the CPU cannot leave the while loop in
>>> hrtimer_interrupt() until the real time returned by ktime_get() catches
>>> up with the wrong time on the CLOCK_MONOTONIC base.
>>>
>>> I was able to reproduce the problem by calling clock_settime and
>>> clock_adjtime repeatedly on each CPU with random parameters.
>>
>> Could you share your reproducer?
>
> https://github.com/kernelslacker/trinity
>
> I use this testsuite and I make it call clock_settime and
> clock_adjtime alternately with random params on each core.
>
>>
>> How long does it take to trigger the issue?
>
> It's not certain. It would take half an hour or several hours or more
> longer.
>
>>
>>> Here is the calltrace:
>>>
>>> Jan 01 00:02:29 Linux kernel: INFO: rcu_sched detected stalls on
>>> CPUs/tasks:
>>> Jan 01 00:02:29 Linux kernel:         0: (2 GPs behind) idle=913/1/0
>>> softirq=59289/59291 fqs=488
>>> Jan 01 00:02:29 Linux kernel:         (detected by 20, t=5252
>>> jiffies, g=35769, c=35768, q=567)
>>> Jan 01 00:02:29 Linux kernel: Task dump for CPU 0:
>>> Jan 01 00:02:29 Linux kernel: swapper/0       R  running task
>>> 0   0      0 0x00000002
>>> Jan 01 00:02:29 Linux kernel: Call trace:
>>> Jan 01 00:02:29 Linux kernel: [<ffffffc000086c5c>] __switch_to+0x74/0x8c
>>> Jan 01 00:02:29 Linux kernel: rcu_sched kthread starved for 4764
>>> jiffies!
>>> Jan 01 00:03:32 Linux kernel: NMI watchdog: BUG: soft lockup - CPU#0
>>> stuck for 23s! [swapper/0:0]
>>> Jan 01 00:03:32 Linux kernel: CPU: 0 PID: 0 Comm: swapper/0 Not
>>> tainted 4.1.6+ #184
>>> Jan 01 00:03:32 Linux kernel: task: ffffffc00091cdf0 ti:
>>> ffffffc000910000 task.ti: ffffffc000910000
>>> Jan 01 00:03:32 Linux kernel: PC is at arch_cpu_idle+0x10/0x18
>>> Jan 01 00:03:32 Linux kernel: LR is at arch_cpu_idle+0xc/0x18
>>> Jan 01 00:03:32 Linux kernel: pc : [<ffffffc00008686c>] lr :
>>> [<ffffffc000086868>] pstate: 60000145
>>> Jan 01 00:03:32 Linux kernel: sp : ffffffc000913f20
>>> Jan 01 00:03:32 Linux kernel: x29: ffffffc000913f20 x28:
>>> 000000003f4bbab0
>>> Jan 01 00:03:32 Linux kernel: x27: ffffffc00091669c x26:
>>> ffffffc00096e000
>>> Jan 01 00:03:32 Linux kernel: x25: ffffffc000804000 x24:
>>> ffffffc000913f30
>>> Jan 01 00:03:32 Linux kernel: x23: 0000000000000001 x22:
>>> ffffffc0006817f8
>>> Jan 01 00:03:32 Linux kernel: x21: ffffffc0008fdb00 x20:
>>> ffffffc000916618
>>> Jan 01 00:03:32 Linux kernel: x19: ffffffc000910000 x18:
>>> 00000000ffffffff
>>> Jan 01 00:03:32 Linux kernel: x17: 0000007f9d6f682c x16:
>>> ffffffc0001e19d0
>>> Jan 01 00:03:32 Linux kernel: x15: 0000000000000061 x14:
>>> 0000000000000072
>>> Jan 01 00:03:32 Linux kernel: x13: 0000000000000067 x12:
>>> ffffffc000682528
>>> Jan 01 00:03:32 Linux kernel: x11: 0000000000000005 x10:
>>> 00000001000faf9a
>>> Jan 01 00:03:32 Linux kernel: x9 : ffffffc000913e60 x8 :
>>> ffffffc00091d350
>>> Jan 01 00:03:32 Linux kernel: x7 : 0000000000000000 x6 :
>>> 002b24c4f00026aa
>>> Jan 01 00:03:32 Linux kernel: x5 : 0000001ffd5c6000 x4 :
>>> ffffffc000913ea0
>>> Jan 01 00:03:32 Linux kernel: x3 : ffffffdffdec3b44 x2 :
>>> ffffffdffdec3b44
>>> Jan 01 00:03:32 Linux kernel: x1 : 0000000000000000 x0 :
>>> 0000000000000000
>>
>> In this case was CNTVOFF uniform on all CPUs?
>>
>> Was the kernel booted at EL2 or EL1N? Was it booted under a hypervisor?
>
> At EL1N without a hypervisor.

I was wrong. It booted at EL2 without a hypervisor.

>
>>
>>> CPU-A updates cycle_last in do_settimeofday64() under the timekeeper
>>> lock; CPU-B, whose counter is slightly behind CPU-A's, reads the current
>>> cycles after the lock is dropped and subtracts cycle_last. The negative
>>> result, after masking, becomes an extremely large value and leads to the
>>> "hang" in hrtimer_interrupt().
>>
>> It's possible that we have missing ISBs or DSBs somewhere, and we're
>> encountering a re-ordering that we did not expect.
>>
>> It would be very helpful to be able to analyse this with your reproducer.
>> I have a kernel test in a local branch with which I've never managed to
>> trigger a failure other than on systems with a horrifically skewed CNTVOFF.
>>
>>> Multi-core x86 systems have already hit this problem, and Thomas
>>> introduced a fix in commit 47001d603375 ("x86: tsc prevent time going
>>> backwards"). The fix was later moved into the timekeeping core code in
>>> commit 09ec54429c6d ("clocksource: Move cycle_last validation to core
>>> code"), where the validation can be enabled via the config option
>>> CLOCKSOURCE_VALIDATE_LAST_CYCLE. I think we can fix the problem on arm64
>>> by selecting that option. It has no side effect on systems whose
>>> counters run properly.
>>
>> As above, per [1], I'm not sure that the same rationale applies, and
>> it's somewhat worrying to mask the issue in this manner.
>>
>> Thanks,
>> Mark.
>>
>> [1] https://lkml.org/lkml/2007/8/23/96
>>
>>>
>>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> ---
>>>   arch/arm64/Kconfig | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 07d1811..6a53926 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -30,6 +30,7 @@ config ARM64
>>>       select GENERIC_ALLOCATOR
>>>       select GENERIC_CLOCKEVENTS
>>>       select GENERIC_CLOCKEVENTS_BROADCAST
>>> +    select CLOCKSOURCE_VALIDATE_LAST_CYCLE
>>>       select GENERIC_CPU_AUTOPROBE
>>>       select GENERIC_EARLY_IOREMAP
>>>       select GENERIC_IDLE_POLL_SETUP
>>> --
>>> 2.5.0
>>>
>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>
>> .
>>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> .
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] clocksource: replace cycle_last validation with an equal way
  2015-10-27 13:21 ` [PATCH 1/2] clocksource: replace cycle_last validation with an equal way Yang Yingliang
@ 2015-10-30 14:56   ` Thomas Gleixner
  2015-10-31 10:07     ` Yang Yingliang
  2015-10-31 10:20     ` [PATCH resend] clocksource: modify the cycle_last validation to fit for non-64bit clocksource mask Yang Yingliang
  0 siblings, 2 replies; 12+ messages in thread
From: Thomas Gleixner @ 2015-10-30 14:56 UTC (permalink / raw)
  To: Yang Yingliang; +Cc: linux-arm-kernel, linux-kernel

Yang,

On Tue, 27 Oct 2015, Yang Yingliang wrote:

> Mask the cycle values before the subtraction so that this validation
> can also be used when the clocksource mask is narrower than 64 bits.
> 
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/time/timekeeping_internal.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
> index 4ea005a..984f02e 100644
> --- a/kernel/time/timekeeping_internal.h
> +++ b/kernel/time/timekeeping_internal.h
> @@ -15,7 +15,7 @@ extern void tk_debug_account_sleep_time(struct timespec64 *t);
>  #ifdef CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE
>  static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
>  {
> -	cycle_t ret = (now - last) & mask;
> +	cycle_t ret = (now & mask) - (last & mask);

I agree the original code is broken for all masks which are !=
CLOCKSOURCE_MASK(64).

But your change does not work for actual wraparounds. You probably
cannot trigger it for the 56bits of the arm architected timer, but
that does not make it more correct.

Assume a CLOCKSOURCE_MASK(32) and that the timer wrapped around since
we last read it.

last = 0xffffffff
now = 0x01

So:

	ret = (0x01 & 0xffffffff) - (0xffffffff & 0xffffffff);
-->	ret = 0x01 - 0xffffffff;
-->	ret = ffffffff00000002;

-->	(s64) ret is < 0 !!!

This is wrong as the clocksource legitimately wrapped around since we
accessed it last.

The correct solution to this is:

    	ret = (now - last) & mask;
	
	negative = ret & ~(mask >> 1);

	return negative ? 0 : ret;

So in the above case this will be:

-->	ret = (0x01 - 0xffffffff) & 0xffffffff;
-->	ret = 0x02;

-->	negative = 0x02 & ~(0x7fffffff);
-->	negative = 0;

But for 

last = 0x1000
now = 0x01

-->	ret = (0x01 - 0x1000) & 0xffffffff;
-->	ret = 0xfffff001;

-->	negative = 0xfffff001 & ~(0x7fffffff);
-->	negative = 0x80000000;

And this is the case which we actually need to catch.
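
(A standalone sketch, for anyone who wants to check both 32-bit-mask
cases above; the values are the ones from this mail, nothing more is
implied:)

	#include <stdint.h>
	#include <stdio.h>

	typedef uint64_t cycle_t;

	static cycle_t delta(cycle_t now, cycle_t last, cycle_t mask)
	{
		cycle_t ret = (now - last) & mask;

		/* MSB of the mask set in the result => treat as negative. */
		return (ret & ~(mask >> 1)) ? 0 : ret;
	}

	int main(void)
	{
		cycle_t mask = 0xffffffffULL;

		/* Genuine wraparound: 0xffffffff -> 0x01, delta stays 0x02. */
		printf("%llx\n", (unsigned long long)delta(0x01, 0xffffffff, mask));
		/* Backwards read: 0x1000 -> 0x01, delta is clamped to 0.    */
		printf("%llx\n", (unsigned long long)delta(0x01, 0x1000, mask));
		return 0;
	}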

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] clocksource: replace cycle_last validation with an equal way
  2015-10-30 14:56   ` Thomas Gleixner
@ 2015-10-31 10:07     ` Yang Yingliang
  2015-10-31 10:20     ` [PATCH resend] clocksource: modify the cycle_last validation to fit for non-64bit clocksource mask Yang Yingliang
  1 sibling, 0 replies; 12+ messages in thread
From: Yang Yingliang @ 2015-10-31 10:07 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-arm-kernel, linux-kernel



On 2015/10/30 22:56, Thomas Gleixner wrote:
> Yang,
>
> On Tue, 27 Oct 2015, Yang Yingliang wrote:
>
>> Mask the cycle values before the subtraction so that this validation
>> can also be used when the clocksource mask is narrower than 64 bits.
>>
>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>   kernel/time/timekeeping_internal.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
>> index 4ea005a..984f02e 100644
>> --- a/kernel/time/timekeeping_internal.h
>> +++ b/kernel/time/timekeeping_internal.h
>> @@ -15,7 +15,7 @@ extern void tk_debug_account_sleep_time(struct timespec64 *t);
>>   #ifdef CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE
>>   static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
>>   {
>> -	cycle_t ret = (now - last) & mask;
>> +	cycle_t ret = (now & mask) - (last & mask);
>
> I agree the original code is broken for all masks which are !=
> CLOCKSOURCE_MASK(64).
>
> But your change does not work for actual wraparounds. You probably
> cannot trigger it for the 56bits of the arm architected timer, but
> that does not make it more correct.
>
> Assume a CLOCKSOURCE_MASK(32) and that the timer wrapped around since
> we last read it.
>
> last = 0xffffffff
> now = 0x01
>
> So:
>
> 	ret = (0x01 & 0xffffffff) - (0xffffffff & 0xffffffff);
> -->	ret = 0x01 - 0xffffffff;
> -->	ret = ffffffff00000002;
>
> -->	(s64) ret is < 0 !!!
>
> This is wrong as the clocksource legitimately wrapped around since we
> accessed it last.
>
> The correct solution to this is:
>
>      	ret = (now - last) & mask;
> 	
> 	negative = ret & ~(mask >> 1);
>
> 	return negative ? 0 : ret;

Thanks for your advice.
I will resend this patch.

Regards,
Yang




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH resend] clocksource: modify the cycle_last validation to fit for non-64bit clocksource mask
  2015-10-30 14:56   ` Thomas Gleixner
  2015-10-31 10:07     ` Yang Yingliang
@ 2015-10-31 10:20     ` Yang Yingliang
  2015-12-19 15:03       ` [tip:timers/core] clocksource: Make clocksource validation work for all clocksources tip-bot for Yang Yingliang
  1 sibling, 1 reply; 12+ messages in thread
From: Yang Yingliang @ 2015-10-31 10:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-arm-kernel, linux-kernel

From: Yang Yingliang <yangyingliang@huawei.com>

Check the delta of now and last to make sure it is not negative, even
when the clocksource mask is narrower than 64 bits.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
  kernel/time/timekeeping_internal.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index 4ea005a..cbfcd2d 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -16,8 +16,9 @@ extern void tk_debug_account_sleep_time(struct timespec64 *t);
 static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
 {
 	cycle_t ret = (now - last) & mask;
+	cycle_t negative = ret & ~(mask >> 1);
 
-	return (s64) ret > 0 ? ret : 0;
+	return negative ? 0 : ret;
 }
 #else
 static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [tip:timers/core] clocksource: Make clocksource validation work for all clocksources
  2015-10-31 10:20     ` [PATCH resend] clocksource: modify the cycle_last validation to fit for non-64bit clocksource mask Yang Yingliang
@ 2015-12-19 15:03       ` tip-bot for Yang Yingliang
  2016-01-04 17:13         ` John Stultz
  0 siblings, 1 reply; 12+ messages in thread
From: tip-bot for Yang Yingliang @ 2015-12-19 15:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, mingo, yangyingliang, linux-arm-kernel, linux-kernel, hpa

Commit-ID:  1f45f1f33c8c8b96722dbc5e6b7acf74eaa721f7
Gitweb:     http://git.kernel.org/tip/1f45f1f33c8c8b96722dbc5e6b7acf74eaa721f7
Author:     Yang Yingliang <yangyingliang@huawei.com>
AuthorDate: Sat, 31 Oct 2015 18:20:55 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 19 Dec 2015 15:59:57 +0100

clocksource: Make clocksource validation work for all clocksources

The clocksource validation which makes sure that the newly read value
is not smaller than the last value only works if the clocksource mask
is 64bit, i.e. the counter is 64bit wide. But we want to use that
mechanism also for clocksources which are less than 64bit wide.

So instead of checking whether bit 63 is set, we check whether the
most significant bit of the clocksource mask is set in the delta
result. If it is set, we return 0.

[ tglx: Simplified the implementation, added a comment and massaged
  	the commit message ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Link: http://lkml.kernel.org/r/56349607.6070708@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timekeeping_internal.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index e20466f..5be7627 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -17,7 +17,11 @@ static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
 {
 	cycle_t ret = (now - last) & mask;
 
-	return (s64) ret > 0 ? ret : 0;
+	/*
+	 * Prevent time going backwards by checking the MSB of mask in
+	 * the result. If set, return 0.
+	 */
+	return ret & ~(mask >> 1) ? 0 : ret;
 }
 #else
 static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [tip:timers/core] clocksource: Make clocksource validation work for all clocksources
  2015-12-19 15:03       ` [tip:timers/core] clocksource: Make clocksource validation work for all clocksources tip-bot for Yang Yingliang
@ 2016-01-04 17:13         ` John Stultz
  0 siblings, 0 replies; 12+ messages in thread
From: John Stultz @ 2016-01-04 17:13 UTC (permalink / raw)
  To: yangyingliang, linux-arm-kernel, Thomas Gleixner, Ingo Molnar,
	Linux Kernel Mailing List, H. Peter Anvin
  Cc: linux-tip-commits

On Sat, Dec 19, 2015 at 7:03 AM, tip-bot for Yang Yingliang
<tipbot@zytor.com> wrote:
> Commit-ID:  1f45f1f33c8c8b96722dbc5e6b7acf74eaa721f7
> Gitweb:     http://git.kernel.org/tip/1f45f1f33c8c8b96722dbc5e6b7acf74eaa721f7
> Author:     Yang Yingliang <yangyingliang@huawei.com>
> AuthorDate: Sat, 31 Oct 2015 18:20:55 +0800
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Sat, 19 Dec 2015 15:59:57 +0100
>
> clocksource: Make clocksource validation work for all clocksources
>
> The clocksource validation which makes sure that the newly read value
> is not smaller than the last value only works if the clocksource mask
> is 64bit, i.e. the counter is 64bit wide. But we want to use that
> mechanism also for clocksources which are less than 64bit wide.
>
> So instead of checking whether bit 63 is set, we check whether the
> most significant bit of the clocksource mask is set in the delta
> result. If it is set, we return 0.
>
> [ tglx: Simplified the implementation, added a comment and massaged
>         the commit message ]
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> Cc: <linux-arm-kernel@lists.infradead.org>
> Link: http://lkml.kernel.org/r/56349607.6070708@huawei.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/time/timekeeping_internal.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
> index e20466f..5be7627 100644
> --- a/kernel/time/timekeeping_internal.h
> +++ b/kernel/time/timekeeping_internal.h
> @@ -17,7 +17,11 @@ static inline cycle_t clocksource_delta(cycle_t now, cycle_t last, cycle_t mask)
>  {
>         cycle_t ret = (now - last) & mask;
>
> -       return (s64) ret > 0 ? ret : 0;
> +       /*
> +        * Prevent time going backwards by checking the MSB of mask in
> +        * the result. If set, return 0.
> +        */
> +       return ret & ~(mask >> 1) ? 0 : ret;

Hey Thomas, Yang,
   I just noticed this patch, and I'm a little worried that it's not
totally safe.  When testing 64-bit values, if the MSB is set, we know
the delta is negative because no system is going to get through more
than 63 bits of cycles between updates.

However, for systems with much shorter masks (the ACPI PM timer, for
example, is only 24 bits), the MSB of the mask being set in the delta
doesn't necessarily indicate an invalid negative interval. This patch
could therefore cause time inconsistencies (time going backwards) on
those systems.
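
For example (a hedged sketch with made-up register values, 24-bit mask):

	mask = 0xffffff
	last = 0x100000, now = 0xf00000       (no wrap, ~0xe00000 cycles elapsed)

	ret      = (0xf00000 - 0x100000) & 0xffffff  = 0xe00000
	negative = 0xe00000 & ~(0xffffff >> 1)       = 0x800000  (non-zero)

so clocksource_delta() returns 0 and that perfectly valid interval is
thrown away.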

So if something like this is added, we probably need some additional
smarts to ensure that the fix for negative intervals on unsynced
clocksources doesn't cause problems for quick-wrapping clocksources.

That said, back when we added the TIMEKEEPING_DEBUGGING work, we did
add non-terminal warnings if we saw cycle offsets larger than 50% of
the mask, so the intent of this patch isn't totally off base. But it
leans heavily on an assumption that isn't quite a hard design rule of
the subsystem (with a warning we can continue working fine, whereas
with this patch we might introduce inconsistencies).

One potential hack that could be tried: if the unsynced clocksource has
a long enough wrap interval that we can be sure the MSB being set always
means an invalid negative interval, you could have the clocksource shift
the read value up by however many bits the mask length is short of 64
(adjusting the frequency up proportionally), and then set its mask to
all 64 bits. That way negative reads would be caught by the pre-existing
generic logic.

Sorry I didn't see this earlier, I wasn't on the cc list.

thanks
-john

^ permalink raw reply	[flat|nested] 12+ messages in thread

Thread overview: 12+ messages
2015-10-27 13:21 [PATCH 0/2] validate the delta of cycle_now and cycle_last on arm64 Yang Yingliang
2015-10-27 13:21 ` [PATCH 1/2] clocksource: replace cycle_last validation with an equal way Yang Yingliang
2015-10-30 14:56   ` Thomas Gleixner
2015-10-31 10:07     ` Yang Yingliang
2015-10-31 10:20     ` [PATCH resend] clocksource: modify the cycle_last validation to fit for non-64bit clocksource mask Yang Yingliang
2015-12-19 15:03       ` [tip:timers/core] clocksource: Make clocksource validation work for all clocksources tip-bot for Yang Yingliang
2016-01-04 17:13         ` John Stultz
2015-10-27 13:21 ` [PATCH 2/2] arm64: validate the delta of cycle_now and cycle_last Yang Yingliang
2015-10-27 14:03   ` Mark Rutland
2015-10-28  1:33     ` Ding Tianhong
2015-10-29  7:36     ` Yang Yingliang
2015-10-29  8:31       ` Yang Yingliang
