* rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
@ 2016-06-27 20:06 Borislav Petkov
  2016-06-29 17:16 ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2016-06-27 20:06 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Peter Zijlstra, x86-ml, lkml

Hey Thomas,

Just started seeing this now during testing of Rafael's s/r fix:

[   24.973955] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
[   24.987744] clocksource:                       'acpi_pm' wd_now: 2df835 wd_last: a6bb64 mask: ffffff
[   24.999587] clocksource:                       'tsc' cs_now: 2bbe793d1e cs_last: 296e28763c mask: ffffffffffffffff
[   25.013400] clocksource: Switched to clocksource acpi_pm

In the previous boot it was CPU1.

And kernel is rc5+tip/master.

Any suggestions are always welcome.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-27 20:06 rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large: Borislav Petkov
@ 2016-06-29 17:16 ` Borislav Petkov
  2016-06-30  7:57   ` Mike Galbraith
  2016-06-30  8:40   ` Thomas Gleixner
  0 siblings, 2 replies; 17+ messages in thread
From: Borislav Petkov @ 2016-06-29 17:16 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Peter Zijlstra, x86-ml, lkml

On Mon, Jun 27, 2016 at 10:06:31PM +0200, Borislav Petkov wrote:
> Hey Thomas,
> 
> Just started seeing this now during testing of Rafael's s/r fix:
> 
> [   24.973955] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
> [   24.987744] clocksource:                       'acpi_pm' wd_now: 2df835 wd_last: a6bb64 mask: ffffff
> [   24.999587] clocksource:                       'tsc' cs_now: 2bbe793d1e cs_last: 296e28763c mask: ffffffffffffffff
> [   25.013400] clocksource: Switched to clocksource acpi_pm
> 
> In the previous boot it was CPU1.

And here it is again:

...
[   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
[   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
[   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
[   32.922293] clocksource: Switched to clocksource acpi_pm

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-29 17:16 ` Borislav Petkov
@ 2016-06-30  7:57   ` Mike Galbraith
  2016-06-30  8:41     ` Thomas Gleixner
  2016-06-30  8:40   ` Thomas Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2016-06-30  7:57 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner; +Cc: Peter Zijlstra, x86-ml, lkml

On Wed, 2016-06-29 at 19:16 +0200, Borislav Petkov wrote:

> And here it is again:
> 
> ...
> [   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
> [   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
> [   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
> [   32.922293] clocksource: Switched to clocksource acpi_pm

I met that too when testing, but only on my 8 socket box for whatever
reason.  Putting tip back on the box and poking a bit while I beat up
rt kernels elsewhere, cs_nsec: 167001172874 wd_nsec: 0.  (poke++)

	-Mike


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-29 17:16 ` Borislav Petkov
  2016-06-30  7:57   ` Mike Galbraith
@ 2016-06-30  8:40   ` Thomas Gleixner
  2016-06-30  9:49     ` Borislav Petkov
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2016-06-30  8:40 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Peter Zijlstra, x86-ml, lkml

On Wed, 29 Jun 2016, Borislav Petkov wrote:
> On Mon, Jun 27, 2016 at 10:06:31PM +0200, Borislav Petkov wrote:
> > Hey Thomas,
> > 
> > Just started seeing this now during testing of Rafael's s/r fix:
> > 
> > [   24.973955] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
> > [   24.987744] clocksource:                       'acpi_pm' wd_now: 2df835 wd_last: a6bb64 mask: ffffff
> > [   24.999587] clocksource:                       'tsc' cs_now: 2bbe793d1e cs_last: 296e28763c mask: ffffffffffffffff
> > [   25.013400] clocksource: Switched to clocksource acpi_pm
> > 
> > In the previous boot it was CPU1.
> 
> And here it is again:
> 
> ...
> [   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
> [   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
> [   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
> [   32.922293] clocksource: Switched to clocksource acpi_pm

What's the tsc frequency on your machine?


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  7:57   ` Mike Galbraith
@ 2016-06-30  8:41     ` Thomas Gleixner
  2016-06-30  9:15       ` Mike Galbraith
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2016-06-30  8:41 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 30 Jun 2016, Mike Galbraith wrote:

> On Wed, 2016-06-29 at 19:16 +0200, Borislav Petkov wrote:
> 
> > And here it is again:
> > 
> > ...
> > [   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> > [   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
> > [   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
> > [   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
> > [   32.922293] clocksource: Switched to clocksource acpi_pm
> 
> I met that too when testing, but only on my 8 socket box for whatever
> reason.  Putting tip back on the box and poking a bit while I beat up
> rt kernels elsewhere, cs_nsec: 167001172874 wd_nsec: 0.  (poke++)

Can I get the full data please, including tsc frequency?


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  8:41     ` Thomas Gleixner
@ 2016-06-30  9:15       ` Mike Galbraith
  2016-06-30  9:42         ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2016-06-30  9:15 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 2016-06-30 at 10:41 +0200, Thomas Gleixner wrote:
> On Thu, 30 Jun 2016, Mike Galbraith wrote:
> 
> > On Wed, 2016-06-29 at 19:16 +0200, Borislav Petkov wrote:
> > 
> > > And here it is again:
> > > 
> > > ...
> > > [   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> > > [   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
> > > [   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
> > > [   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
> > > [   32.922293] clocksource: Switched to clocksource acpi_pm
> > 
> > I met that too when testing, but only on my 8 socket box for whatever
> > reason.  Putting tip back on the box and poking a bit while I beat up
> > rt kernels elsewhere, cs_nsec: 167001172874 wd_nsec: 0.  (poke++)
> 
> Can I get the full data please, including tsc frequency?

[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2260.928 MHz processor
[   13.205453] tsc: Refined TSC clocksource calibration: 2260.999 MHz
[   13.205457] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x20974a4d8bb, max_idle_ns: 440795246623 ns
[   14.592866] clocksource: Switched to clocksource tsc
[  575.781329] clocksource: timekeeping watchdog on CPU23: Marking clocksource 'tsc' as unstable because the skew is too large:
[  575.998835] clocksource:                       'tsc' cs_now: 1c1effead80 cs_last: 139efdabfa5 mask: ffffffffffffffff

trace_printk() leading to splat (hm, mask).
          <idle>-0     [023] ..s.    25.233237: clocksource_watchdog: cs_nsec: 511311920 wd_nsec: 511311599 delta: 321 wdnow: 362266206  cs->wd_last: 354945155 watchdog->mask: ffffffff
          <idle>-0     [023] ..s.    57.234242: clocksource_watchdog: cs_nsec: 511999944 wd_nsec: 511999396 delta: 548 wdnow: 820447848  cs->wd_last: 813116949 watchdog->mask: ffffffff
          <idle>-0     [023] ..s.   575.781321: clocksource_watchdog: cs_nsec: 258345057292 wd_nsec: 0 delta: 258345057292 wdnow: 3949895484  cs->wd_last: 250865676 watchdog->mask: ffffffff


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  9:15       ` Mike Galbraith
@ 2016-06-30  9:42         ` Thomas Gleixner
  2016-06-30  9:49           ` Mike Galbraith
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2016-06-30  9:42 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 30 Jun 2016, Mike Galbraith wrote:
> On Thu, 2016-06-30 at 10:41 +0200, Thomas Gleixner wrote:
> > On Thu, 30 Jun 2016, Mike Galbraith wrote:
> > 
> > > On Wed, 2016-06-29 at 19:16 +0200, Borislav Petkov wrote:
> > > 
> > > > And here it is again:
> > > > 
> > > > ...
> > > > [   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> > > > [   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
> > > > [   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
> > > > [   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
> > > > [   32.922293] clocksource: Switched to clocksource acpi_pm
> > > 
> > > I met that too when testing, but only on my 8 socket box for whatever
> > > reason.  Putting tip back on the box and poking a bit while I beat up
> > > rt kernels elsewhere, cs_nsec: 167001172874 wd_nsec: 0.  (poke++)
> > 
> > > Can I get the full data please, including tsc frequency?
> 
> [    0.000000] tsc: Fast TSC calibration using PIT
> [    0.000000] tsc: Detected 2260.928 MHz processor
> [   13.205453] tsc: Refined TSC clocksource calibration: 2260.999 MHz
> [   13.205457] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x20974a4d8bb, max_idle_ns: 440795246623 ns
> [   14.592866] clocksource: Switched to clocksource tsc
> [  575.781329] clocksource: timekeeping watchdog on CPU23: Marking clocksource 'tsc' as unstable because the skew is too large:
> [  575.998835] clocksource:                       'tsc' cs_now: 1c1effead80 cs_last: 139efdabfa5 mask: ffffffffffffffff
> 
> trace_printk() leading to splat (hm, mask).
>           <idle>-0     [023] ..s.    25.233237: clocksource_watchdog: cs_nsec: 511311920 wd_nsec: 511311599 delta: 321 wdnow: 362266206  cs->wd_last: 354945155 watchdog->mask: ffffffff
>           <idle>-0     [023] ..s.    57.234242: clocksource_watchdog: cs_nsec: 511999944 wd_nsec: 511999396 delta: 548 wdnow: 820447848  cs->wd_last: 813116949 watchdog->mask: ffffffff
>           <idle>-0     [023] ..s.   575.781321: clocksource_watchdog: cs_nsec: 258345057292 wd_nsec: 0 delta: 258345057292 wdnow: 3949895484  cs->wd_last: 250865676 watchdog->mask: ffffffff
> 

So there was no watchdog firing for 525 seconds ????


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  8:40   ` Thomas Gleixner
@ 2016-06-30  9:49     ` Borislav Petkov
  0 siblings, 0 replies; 17+ messages in thread
From: Borislav Petkov @ 2016-06-30  9:49 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Peter Zijlstra, x86-ml, lkml

On Thu, Jun 30, 2016 at 10:40:47AM +0200, Thomas Gleixner wrote:
> On Wed, 29 Jun 2016, Borislav Petkov wrote:
> > On Mon, Jun 27, 2016 at 10:06:31PM +0200, Borislav Petkov wrote:
> > > Hey Thomas,
> > > 
> > > Just started seeing this now during testing of Rafael's s/r fix:
> > > 
> > > [   24.973955] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
> > > [   24.987744] clocksource:                       'acpi_pm' wd_now: 2df835 wd_last: a6bb64 mask: ffffff
> > > [   24.999587] clocksource:                       'tsc' cs_now: 2bbe793d1e cs_last: 296e28763c mask: ffffffffffffffff
> > > [   25.013400] clocksource: Switched to clocksource acpi_pm
> > > 
> > > In the previous boot it was CPU1.
> > 
> > And here it is again:
> > 
> > ...
> > [   15.720833] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> > [   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
> > [   32.896986] clocksource:                       'acpi_pm' wd_now: 8e147 wd_last: 531b43 mask: ffffff
> > [   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff
> > [   32.922293] clocksource: Switched to clocksource acpi_pm
> 
> What's the tsc frequency on your machine?

$ dmesg | grep -i tsc
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 4013.406 MHz processor
[    5.034357] tsc: Refined TSC clocksource calibration: 4013.511 MHz
[    5.034359] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x39da38233dd, max_idle_ns: 440795253361 ns
[    6.068153] clocksource: Switched to clocksource tsc
[   32.883077] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
[   32.908834] clocksource:                       'tsc' cs_now: 385f4b6d1e cs_last: 354328bcea mask: ffffffffffffffff

P0 is 4GHz and the TSC is ticking at P0, so 4GHz too. 4-ish, it seems :)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  9:42         ` Thomas Gleixner
@ 2016-06-30  9:49           ` Mike Galbraith
  2016-06-30 10:24             ` Thomas Gleixner
  2016-06-30 10:29             ` Mike Galbraith
  0 siblings, 2 replies; 17+ messages in thread
From: Mike Galbraith @ 2016-06-30  9:49 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 2016-06-30 at 11:42 +0200, Thomas Gleixner wrote:

> So there was no watchdog firing for 525 seconds ????

Yup, seems so.

	-Mike


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  9:49           ` Mike Galbraith
@ 2016-06-30 10:24             ` Thomas Gleixner
  2016-06-30 13:39               ` Mike Galbraith
  2016-06-30 10:29             ` Mike Galbraith
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2016-06-30 10:24 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 30 Jun 2016, Mike Galbraith wrote:
> On Thu, 2016-06-30 at 11:42 +0200, Thomas Gleixner wrote:
> 
> > So there was no watchdog firing for 525 seconds ????
> 
> Yup, seems so.

if that's fully reproducible can you please enable the timer and softirq
tracepoints along with your trace printks and stop the trace when that
watchdog triggers?

extra trace printk below.

thanks

	tglx

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 3e0c4e60bf6a..418c45ab5ad5 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1438,6 +1438,7 @@ static unsigned long __next_timer_interrupt(struct timer_base *base)
 		clk >>= LVL_CLK_SHIFT;
 		clk += adj;
 	}
+	trace_printk("next %lu\n", next);
 	return next;
 }
 


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30  9:49           ` Mike Galbraith
  2016-06-30 10:24             ` Thomas Gleixner
@ 2016-06-30 10:29             ` Mike Galbraith
  2016-06-30 10:31               ` Thomas Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2016-06-30 10:29 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 2016-06-30 at 11:49 +0200, Mike Galbraith wrote:
> On Thu, 2016-06-30 at 11:42 +0200, Thomas Gleixner wrote:
> 
> > So there was no watchdog firing for 525 seconds ????
> 
> Yup, seems so.

"fired" at the top, the other above if (atomic_read(&watchdog_reset...

vogelweide:/sys/kernel/debug/tracing/:[0]# tail trace
          <idle>-0     [055] ..s.   324.428369: clocksource_watchdog: fired
          <idle>-0     [055] ..s.   324.428374: clocksource_watchdog: cs_nsec: 16032486 wd_nsec: 16032765 delta: 279 wdnow: 351147182  wdlast: 350917622 watchdog->mask: ffffffff
             nis-5876  [056] ..s.   324.428403: clocksource_watchdog: fired
             nis-5876  [056] ..s.   324.428412: clocksource_watchdog: cs_nsec: 37227 wd_nsec: 37574 delta: 347 wdnow: 351147720  wdlast: 351147182 watchdog->mask: ffffffff
          <idle>-0     [057] ..s.   324.428428: clocksource_watchdog: fired
          <idle>-0     [057] ..s.   324.428433: clocksource_watchdog: cs_nsec: 21938 wd_nsec: 21301 delta: 637 wdnow: 351148025  wdlast: 351147720 watchdog->mask: ffffffff
       automount-5406  [059] ..s.   394.446498: clocksource_watchdog: fired
       automount-5406  [059] ..s.   394.446503: clocksource_watchdog: cs_nsec: 26670906785 wd_nsec: 26670888107 delta: 18678 wdnow: 1353664854  wdlast: 971786304 watchdog->mask: ffffffff
          ypbind-5385  [060] ..s.   561.325932: clocksource_watchdog: fired
          ypbind-5385  [060] ..s.   561.325937: clocksource_watchdog: cs_nsec: 166877142176 wd_nsec: 0 delta: 166877142176 wdnow: 3743040056  wdlast: 1353664854 watchdog->mask: ffffffff
vogelweide:/sys/kernel/debug/tracing/:[0]#


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30 10:29             ` Mike Galbraith
@ 2016-06-30 10:31               ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2016-06-30 10:31 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 30 Jun 2016, Mike Galbraith wrote:
>        automount-5406  [059] ..s.   394.446498: clocksource_watchdog: fired
>        automount-5406  [059] ..s.   394.446503: clocksource_watchdog: cs_nsec: 26670906785 wd_nsec: 26670888107 delta: 18678 wdnow: 1353664854  wdlast: 971786304 watchdog->mask: ffffffff
>           ypbind-5385  [060] ..s.   561.325932: clocksource_watchdog: fired
>           ypbind-5385  [060] ..s.   561.325937: clocksource_watchdog: cs_nsec: 166877142176 wd_nsec: 0 delta: 166877142176 wdnow: 3743040056  wdlast: 1353664854 watchdog->mask: ffffffff

So this is a remote enqueue on cpu60. /me goes digging
 


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30 10:24             ` Thomas Gleixner
@ 2016-06-30 13:39               ` Mike Galbraith
  2016-06-30 13:43                 ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Galbraith @ 2016-06-30 13:39 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 2016-06-30 at 12:24 +0200, Thomas Gleixner wrote:
> On Thu, 30 Jun 2016, Mike Galbraith wrote:
> > On Thu, 2016-06-30 at 11:42 +0200, Thomas Gleixner wrote:
> > 
> > > So there was no watchdog firing for 525 seconds ????
> > 
> > Yup, seems so.
> 
> if that's fully reproducible can you please enable the timer and softirq
> tracepoints along with your trace printks and stop the trace when that
> watchdog triggers?

Lunch rendered me (yawn) comatose, so I'm a bit late, but sent.

	-Mike


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30 13:39               ` Mike Galbraith
@ 2016-06-30 13:43                 ` Thomas Gleixner
  2016-06-30 14:27                   ` Borislav Petkov
                                     ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Thomas Gleixner @ 2016-06-30 13:43 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 30 Jun 2016, Mike Galbraith wrote:
> 
> Lunch rendered me (yawn) comatose, so I'm a bit late, but sent.

Patch below should fix the issue - the timer one, not yours :)

Thanks,

	tglx

8<------------

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 3e0c4e60bf6a..2b72eda3dc79 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -531,28 +531,37 @@ __internal_add_timer(struct timer_base *base, struct timer_list *timer)
 static void
 trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
 {
+	if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+		return;
+
+	/*
+	 * This wants some optimizing similar to the below, but we do that
+	 * when we switch from push to pull for deferrable timers.
+	 */
+	if (timer->flags & TIMER_DEFERRABLE) {
+		if (tick_nohz_full_cpu(base->cpu))
+			wake_up_nohz_cpu(base->cpu);
+		return;
+	}
+
 	/*
 	 * We might have to IPI the remote CPU if the base is idle and the
 	 * timer is not deferrable. If the other cpu is on the way to idle
 	 * then it can't set base->is_idle as we hold base lock.
 	 */
-	if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->is_idle ||
-	    (timer->flags & TIMER_DEFERRABLE))
+	if (!base->is_idle)
 		return;
 
 	/* Check whether this is the new first expiring timer */
 	if (time_after_eq(timer->expires, base->next_expiry))
 		return;
-	base->next_expiry = timer->expires;
 
 	/*
-	 * Check whether the other CPU is in dynticks mode and needs to be
-	 * triggered to reevaluate the timer wheel.  We are protected against
-	 * the other CPU fiddling with the timer by holding the timer base
-	 * lock.
+	 * Set the next expiry time and kick the cpu so it can reevaluate the
+	 * wheel
 	 */
-	if (tick_nohz_full_cpu(base->cpu))
-		wake_up_nohz_cpu(base->cpu);
+	base->next_expiry = timer->expires;
+	wake_up_nohz_cpu(base->cpu);
 }
 
 static void


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30 13:43                 ` Thomas Gleixner
@ 2016-06-30 14:27                   ` Borislav Petkov
  2016-06-30 14:29                   ` Mike Galbraith
  2016-06-30 20:11                   ` Richard Cochran
  2 siblings, 0 replies; 17+ messages in thread
From: Borislav Petkov @ 2016-06-30 14:27 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Mike Galbraith, Peter Zijlstra, x86-ml, lkml

On Thu, Jun 30, 2016 at 03:43:34PM +0200, Thomas Gleixner wrote:
> On Thu, 30 Jun 2016, Mike Galbraith wrote:
> > 
> > Lunch rendered me (yawn) comatose, so I'm a bit late, but sent.
> 
> Patch below should fix the issue - the timer one, not yours :)

Reported-and-tested-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30 13:43                 ` Thomas Gleixner
  2016-06-30 14:27                   ` Borislav Petkov
@ 2016-06-30 14:29                   ` Mike Galbraith
  2016-06-30 20:11                   ` Richard Cochran
  2 siblings, 0 replies; 17+ messages in thread
From: Mike Galbraith @ 2016-06-30 14:29 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Borislav Petkov, Peter Zijlstra, x86-ml, lkml

On Thu, 2016-06-30 at 15:43 +0200, Thomas Gleixner wrote:
> On Thu, 30 Jun 2016, Mike Galbraith wrote:
> > 
> > Lunch rendered me (yawn) comatose, so I'm a bit late, but sent.
> 
> Patch below should fix the issue - the timer one, not yours :)

<twiddle>

Box should have griped by now.. yup, all better.

	-Mike

(mine isn't a wart, it's a.. somewhat noisy [90 dB] beauty mark)


* Re: rc5+tip/master: Marking clocksource 'tsc' as unstable because the skew is too large:
  2016-06-30 13:43                 ` Thomas Gleixner
  2016-06-30 14:27                   ` Borislav Petkov
  2016-06-30 14:29                   ` Mike Galbraith
@ 2016-06-30 20:11                   ` Richard Cochran
  2 siblings, 0 replies; 17+ messages in thread
From: Richard Cochran @ 2016-06-30 20:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mike Galbraith, Borislav Petkov, Peter Zijlstra, x86-ml, lkml

Thomas,

On Thu, Jun 30, 2016 at 03:43:34PM +0200, Thomas Gleixner wrote:
> Patch below should fix the issue - the timer one, not yours :)

This patch also fixes the occasionally delayed timers I reported on
the rt@linutronix.de list.

Thanks,
Richard


