* [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
@ 2012-07-01 18:29 John Stultz
  2012-07-01 18:30 ` [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic John Stultz
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: John Stultz @ 2012-07-01 18:29 UTC (permalink / raw)
  To: Linux Kernel
  Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner, John Stultz

From: John Stultz <john.stultz@linaro.org>

Here's round two on this one. 

As widely reported on the internet, some Linux systems have been
experiencing futex-related load spikes since the leapsecond was
inserted (usually connected to MySQL, Firefox, Thunderbird, Java, etc).

An apparent workaround for this issue is running:
$ date -s "`date`"

Credit: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix
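
For anyone who wants to apply the same workaround from a program rather
than a shell, here is a minimal sketch. It assumes (as explained in
patch 2/2) that the workaround helps because settimeofday() forces a
clock_was_set() call. The helper name reset_wall_clock() is made up for
illustration, and the call needs CAP_SYS_TIME (normally root).

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical helper: read the wall clock and immediately set it to
 * the value just read. Returns 0 on success, or -1 with errno set
 * (EPERM when the caller lacks CAP_SYS_TIME).
 */
static int reset_wall_clock(void)
{
	struct timeval tv;

	if (gettimeofday(&tv, NULL))
		return -1;
	return settimeofday(&tv, NULL);
}
```

Unlike the date command above, this keeps the microseconds part of the
time that was read, so it should disturb the clock slightly less.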


To address this issue we do two things:
1) Fix the clock_was_set() call to remove the limitation that kept
us from calling it from update_wall_time().

2) Call clock_was_set() when we add/remove a leapsecond.

I've been able to reproduce the load spike using Thunderbird
when triggering a leap second and with this patch the issue
did not crop up.

NOTE: Some reports have been of a hard hang right at or before
the leapsecond. I've not been able to reproduce or diagnose
this, so this fix likely does not address the reported hard
hangs (unless they end up being connected to the futex/hrtimer
issue).


TODOs:
* Chase down the futex/hrtimer interaction to see if this could
be triggered in any other way.
* Get Tglx's input/ack
* Generate a backport for pre-v3.4 kernels


v2:
* Address the issue w/ calling clock_was_set from atomic context, pointed
out by Prarit and Ben.
* Rework fix so it's simpler.


CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>

John Stultz (2):
  [RFC] Fix clock_was_set so it is safe to call from atomic
  [RFC] Fix leapsecond triggered hrtimer/futex load spike issue

 kernel/hrtimer.c          |   16 +++++++++++++++-
 kernel/time/timekeeping.c |    4 ++++
 2 files changed, 19 insertions(+), 1 deletion(-)

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic
  2012-07-01 18:29 [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
@ 2012-07-01 18:30 ` John Stultz
  2012-07-02  8:53   ` Thomas Gleixner
  2012-07-01 18:30 ` [PATCH 2/2] [RFC] Fix leapsecond triggered hrtimer/futex load spike issue John Stultz
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: John Stultz @ 2012-07-01 18:30 UTC (permalink / raw)
  To: Linux Kernel; +Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner

NOTE: This is a prerequisite patch that's required to
address the widely observed leap-second related futex/hrtimer
issues.

Currently clock_was_set() is unsafe to be called from atomic
context, as it calls on_each_cpu(). This causes problems when
we need to adjust the time from update_wall_time().

To fix this, introduce a work_struct so if we're in_atomic,
we can schedule work to do the necessary update after we're
out of the atomic section.

CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
 kernel/hrtimer.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ae34bf5..ee7a98d 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -746,7 +746,7 @@ static inline void retrigger_next_event(void *arg) { }
  * resolution timer interrupts. On UP we just disable interrupts and
  * call the high resolution interrupt code.
  */
-void clock_was_set(void)
+static void do_clock_was_set(struct work_struct *work)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
 	/* Retrigger the CPU local events everywhere */
@@ -754,6 +754,20 @@ void clock_was_set(void)
 #endif
 	timerfd_clock_was_set();
 }
+static DECLARE_WORK(clock_was_set_work, do_clock_was_set);
+
+void clock_was_set(void)
+{
+	/*
+	 * We can't call on_each_cpu() from atomic context,
+	 * so if we're in_atomic, schedule the clock_was_set
+	 * for later.
+	 */
+	if (in_atomic())
+		schedule_work(&clock_was_set_work);
+	else
+		do_clock_was_set(NULL);
+}
 
 /*
  * During resume we might have to reprogram the high resolution timer
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] [RFC] Fix leapsecond triggered hrtimer/futex load spike issue
  2012-07-01 18:29 [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
  2012-07-01 18:30 ` [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic John Stultz
@ 2012-07-01 18:30 ` John Stultz
  2012-07-01 18:34 ` [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: John Stultz @ 2012-07-01 18:30 UTC (permalink / raw)
  To: Linux Kernel; +Cc: John Stultz, Prarit Bhargava, stable, Thomas Gleixner

As widely reported on the internet, some Linux systems have been
experiencing futex-related load spikes since the leapsecond was
inserted (usually connected to MySQL, Firefox, Thunderbird, Java, etc).

An apparent workaround for this issue is running:
$ date -s "`date`"

Credit: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

I believe this issue is due to the leapsecond being added without
calling clock_was_set() to notify the hrtimer subsystem of the
change. (Although I've not yet chased all the way down to the
hrtimer code to validate exactly what's going on there).

The workaround works because it forces a clock_was_set()
call from settimeofday().

This fix adds the required clock_was_set() calls to where
we adjust for leapseconds.

NOTE: This fix *depends* on the previous fix, which allows
clock_was_set to be called from atomic context. Do not try
to apply just this patch.

CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
 kernel/time/timekeeping.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6f46a00..cc2991d 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -963,6 +963,8 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 		leap = second_overflow(timekeeper.xtime.tv_sec);
 		timekeeper.xtime.tv_sec += leap;
 		timekeeper.wall_to_monotonic.tv_sec -= leap;
+		if (leap)
+			clock_was_set();
 	}
 
 	/* Accumulate raw time */
@@ -1079,6 +1081,8 @@ static void update_wall_time(void)
 		leap = second_overflow(timekeeper.xtime.tv_sec);
 		timekeeper.xtime.tv_sec += leap;
 		timekeeper.wall_to_monotonic.tv_sec -= leap;
+		if (leap)
+			clock_was_set();
 	}
 
 	timekeeping_update(false);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-01 18:29 [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
  2012-07-01 18:30 ` [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic John Stultz
  2012-07-01 18:30 ` [PATCH 2/2] [RFC] Fix leapsecond triggered hrtimer/futex load spike issue John Stultz
@ 2012-07-01 18:34 ` John Stultz
  2012-07-01 18:47 ` Jan Engelhardt
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: John Stultz @ 2012-07-01 18:34 UTC (permalink / raw)
  To: John Stultz; +Cc: Linux Kernel, Prarit Bhargava, stable, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 341 bytes --]

On 07/01/2012 11:29 AM, John Stultz wrote:
> From: John Stultz <john.stultz@linaro.org>
>
> Here's round two on this one.
And again, attached is the test case I've been using to trigger 
leapseconds, should anyone else want to help with testing the patches or 
reproducing problems.

To build:
gcc leaptest.c -o leaptest -lrt

thanks
-john


[-- Attachment #2: leaptest.c --]
[-- Type: text/x-csrc, Size: 2076 bytes --]

/* Leap second test
 *              by: john stultz (johnstul@us.ibm.com)
 *              (C) Copyright IBM 2012
 *              Licensed under the GPL
 */


#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <sys/timex.h>


#define CALLS_PER_LOOP 64
#define NSEC_PER_SEC 1000000000ULL

/* returns 1 if a <= b, 0 otherwise */
static inline int in_order(struct timespec a, struct timespec b)
{
	if(a.tv_sec < b.tv_sec)
		return 1;
	if(a.tv_sec > b.tv_sec)
		return 0;
	if(a.tv_nsec > b.tv_nsec)
		return 0;
	return 1;
}


int  main(void)
{
	struct timeval tv;
	struct timex tx;
	struct timespec list[CALLS_PER_LOOP];
	int i, inconsistent;
	int clock_type = CLOCK_REALTIME;
	long now, then;

	/* Get the current time */
	gettimeofday(&tv, NULL);

	/* Calculate the next leap second */
	tv.tv_sec += 86400 - tv.tv_sec % 86400;

	/* Set the time to be 10 seconds from that time */
	tv.tv_sec -= 10;
	settimeofday(&tv, NULL);

	/* Set the leap second insert flag */
	tx.modes = ADJ_STATUS;
	tx.status = STA_INS;
	adjtimex(&tx);

	clock_gettime(clock_type, &list[0]);
	now = then = list[0].tv_sec;
	while(now - then < 30){
		inconsistent = -1;

		/* Fill list */
		for(i=0; i < CALLS_PER_LOOP; i++)
			clock_gettime(clock_type, &list[i]);

		/* Check for inconsistencies; -1 means none were found */
		for(i=0; i < CALLS_PER_LOOP-1; i++)
			if(!in_order(list[i],list[i+1]))
				inconsistent = i;

		/* display inconsistency */
		if(inconsistent >= 0){
			unsigned long long delta;
			for(i=0; i < CALLS_PER_LOOP; i++){
				if(i == inconsistent)
					printf("--------------------\n");
				printf("%ld:%ld\n",(long)list[i].tv_sec,
							list[i].tv_nsec);
				if(i == inconsistent + 1 )
					printf("--------------------\n");
			}
			delta = list[inconsistent].tv_sec*NSEC_PER_SEC;
			delta += list[inconsistent].tv_nsec;
			delta -= list[inconsistent+1].tv_sec*NSEC_PER_SEC;
			delta -= list[inconsistent+1].tv_nsec;
			printf("Delta: %llu ns\n", delta);
			fflush(0);
			break;
		}
		now = list[0].tv_sec;
	}

	/* clear TIME_WAIT */
	tx.modes = ADJ_STATUS;
	tx.status = 0;
	adjtimex(&tx);

	return 0;
}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-01 18:29 [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
                   ` (2 preceding siblings ...)
  2012-07-01 18:34 ` [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
@ 2012-07-01 18:47 ` Jan Engelhardt
  2012-07-01 22:05 ` John Stultz
  2012-07-02  4:12 ` John Stultz
  5 siblings, 0 replies; 12+ messages in thread
From: Jan Engelhardt @ 2012-07-01 18:47 UTC (permalink / raw)
  To: John Stultz
  Cc: Linux Kernel, John Stultz, Prarit Bhargava, stable, Thomas Gleixner


On Sunday 2012-07-01 20:29, John Stultz wrote:
>
>An apparent  workaround for this issue is running:
>$ date -s "`date`"
>
>Credit: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix

One can run the date call even while ntpd keeps running.

>TODOs:
>* Chase down the futex/hrtimer interaction to see if this could
>be triggered in any other way.
>* Get Tglx's input/ack
>* Generate a backport for pre-v3.4 kernels

It looks like 2.6.37.6 is not affected (on a system where ntpd is 
running). That is to say, at least the mysqld instance on such a system 
did not jump to high CPU usage. I did not try your testcase.c there, 
since that would likely freak out the webapps - or the management.

Other reports on IRC include that 2.6.32-71.el6 (RHEL) is affected. 
(Backporting gone wrong, *snicker* ;)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-01 18:29 [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
                   ` (3 preceding siblings ...)
  2012-07-01 18:47 ` Jan Engelhardt
@ 2012-07-01 22:05 ` John Stultz
  2012-07-02  4:12 ` John Stultz
  5 siblings, 0 replies; 12+ messages in thread
From: John Stultz @ 2012-07-01 22:05 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Prarit Bhargava, stable, Thomas Gleixner, Jan Engelhardt

[-- Attachment #1: Type: text/plain, Size: 2223 bytes --]

On 07/01/2012 11:29 AM, John Stultz wrote:
> TODOs:
> * Chase down the futex/hrtimer interaction to see if this could
> be triggered in any other way.

Ok, I've figured out a somewhat more detailed diagnosis of what is going on:

* Leap second occurs, CLOCK_REALTIME is set back one second.

* As clock_was_set() is not called, the hrtimer base.offset value for 
CLOCK_REALTIME is not updated, thus its sense of wall time is one second 
ahead of the timekeeping core's.

* At interrupt time (T), the hrtimer code expires all CLOCK_REALTIME 
based timers set for T+1s and before, causing early expirations for 
timers between T and T+1s since the hrtimer code's sense of time is one 
second ahead.

* This causes all TIMER_ABSTIME CLOCK_REALTIME timers to expire one 
second early.

* More problematically, all sub-second TIMER_ABSTIME CLOCK_REALTIME 
timers will return immediately.  If any such timer calls are done in a 
loop (as commonly done with futex_wait or other timeouts), this will 
cause load spikes in those applications.

* This state persists until clock_was_set() is called (most easily done 
via settimeofday())
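
To make the arithmetic in the steps above concrete, here is a small
user-space model. It is only a sketch: struct clock_view and the helper
names are invented for illustration and are not kernel code. Each view
derives wall time as monotonic time plus an offset, and the leap second
moves only the timekeeping core's offset unless the hrtimer view is told
about it.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical model: wall time = monotonic time + offset_ns */
struct clock_view {
	int64_t offset_ns;
};

static int64_t wall_time(const struct clock_view *v, int64_t mono_ns)
{
	return mono_ns + v->offset_ns;
}

/* A timer counts as expired once this view's wall time reaches the target */
static bool timer_expired(const struct clock_view *v, int64_t mono_ns,
			  int64_t target_wall_ns)
{
	return wall_time(v, mono_ns) >= target_wall_ns;
}

/*
 * Insert a leap second: the timekeeping core steps wall time back by
 * one second. With notify set, the hrtimer view is resynced, which is
 * the effect clock_was_set() is meant to have.
 */
static void insert_leap_second(struct clock_view *core,
			       struct clock_view *hrtimer, bool notify)
{
	core->offset_ns -= NSEC_PER_SEC;
	if (notify)
		hrtimer->offset_ns = core->offset_ns;
}
```

With notify false, a timer armed for "core wall time + 0.5s" is already
expired from the hrtimer view's perspective, since that view is a full
second ahead. With notify true, both views agree and the timer waits.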


I've used the attached test case to demonstrate triggering a leap-second 
and its effect on CLOCK_REALTIME hrtimers.

The test sets a leapsecond to trigger in 10 seconds, then in a loop 
sleeps for half a second via clock_nanosleep, printing out the current 
time, and the delta from the target wakeup time for 30 seconds.
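
The "trigger in 10 seconds" setup relies on rounding the current time up
to the next UTC day boundary, which is where (as far as I understand it)
the kernel applies a pending STA_INS leap second. A minimal sketch of
that arithmetic follows; next_utc_midnight() and leap_test_start() are
illustrative names, not part of the attached test.

```c
#include <time.h>

#define SECS_PER_DAY 86400

/* Round a UTC time in seconds up to the next day boundary */
static time_t next_utc_midnight(time_t now)
{
	return now + (SECS_PER_DAY - now % SECS_PER_DAY);
}

/* Time to set the clock to so the boundary is lead_secs seconds away */
static time_t leap_test_start(time_t now, time_t lead_secs)
{
	return next_utc_midnight(now) - lead_secs;
}
```

Note that a time already sitting exactly on a boundary is moved to the
following one, which matches what the expression in the test does.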

When the leap second triggers, on affected machines you'll see the 
output stream by quickly, with negative diff values, as clock_nanosleep 
is returning immediately.

To build:
gcc leaptest-timer.c -o leaptest-timer -lrt


I've reproduced this behaviour in kernel versions:
     v3.5-rc4
     v2.6.37
     v2.6.32.59
(And quite likely all in-between).

I haven't been able to build or boot anything earlier with the distro on 
my current test boxes, but I'm working to get an older distro installed 
so I can do further testing.

The issue has likely been around since 
746976a301ac9c9aa10d7d42454f8d6cdad8ff2b in v2.6.22, as Ben Blum 
and Jan Ceuleers already noted.

With my fix to call clock_was_set when we apply a leapsecond, I no 
longer see the issue.

thanks
-john


[-- Attachment #2: leaptest-timer.c --]
[-- Type: text/x-csrc, Size: 2181 bytes --]

/* Leap second timer test
 *              by: john stultz (johnstul@us.ibm.com)
 *              (C) Copyright IBM 2012
 *              Licensed under the GPL
 */


#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <sys/timex.h>
#include <unistd.h>	/* sleep() */


#define CALLS_PER_LOOP 64
#define NSEC_PER_SEC 1000000000ULL

struct timespec timespec_add(struct timespec ts, unsigned long long ns)
{
	ts.tv_nsec += ns;
	while(ts.tv_nsec >= NSEC_PER_SEC) {
		ts.tv_nsec -= NSEC_PER_SEC;
		ts.tv_sec++;
	}
	return ts;
}

struct timespec timespec_diff(struct timespec a, struct timespec b)
{
	long long ns;
	int neg = 0;

	ns = a.tv_sec *NSEC_PER_SEC + a.tv_nsec;
	ns -= b.tv_sec *NSEC_PER_SEC + b.tv_nsec;

	if (ns < 0) {
		neg = 1;
		ns = -ns;
	}
	a.tv_sec = ns/NSEC_PER_SEC;
	a.tv_nsec = ns%NSEC_PER_SEC;

	if (neg) {
		a.tv_sec = -a.tv_sec;
		a.tv_nsec = -a.tv_nsec;
	}

	return a;
}


int  main(void)
{
	struct timeval tv;
	struct timex tx;
	long now, then;
	struct timespec ts;

	int clock_type 		= CLOCK_REALTIME;
	int flag 		= TIMER_ABSTIME;
	long long sleeptime	= NSEC_PER_SEC/2;


	/* clear TIME_WAIT */
	tx.modes = ADJ_STATUS;
	tx.status = 0;
	adjtimex(&tx);

	sleep(2);

	/* Get the current time */
	gettimeofday(&tv, NULL);

	/* Calculate the next leap second */
	tv.tv_sec += 86400 - tv.tv_sec % 86400;

	/* Set the time to be 10 seconds from that time */
	tv.tv_sec -= 10;
	settimeofday(&tv, NULL);


	/* Set the leap second insert flag */
	tx.modes = ADJ_STATUS;
	tx.status = STA_INS;
	adjtimex(&tx);

	clock_gettime(clock_type, &ts);
	now = then = ts.tv_sec;
	while(now - then < 30){
		struct timespec target, diff, rem;
		rem.tv_sec = 0;
		rem.tv_nsec = 0;

		if (flag == TIMER_ABSTIME)
			target = timespec_add(ts, sleeptime);
		else
			target = timespec_add(rem, sleeptime);

		clock_nanosleep(clock_type, flag, &target, &rem);
		clock_gettime(clock_type, &ts);

		diff = timespec_diff(ts, target);
		printf("now: %ld:%ld  diff: %ld:%ld rem: %ld:%ld\n",
				ts.tv_sec, ts.tv_nsec,
				diff.tv_sec, diff.tv_nsec,
				rem.tv_sec, rem.tv_nsec);
		now = ts.tv_sec;
	}

	/* clear TIME_WAIT */
	tx.modes = ADJ_STATUS;
	tx.status = 0;
	adjtimex(&tx);

	return 0;
}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-01 18:29 [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2) John Stultz
                   ` (4 preceding siblings ...)
  2012-07-01 22:05 ` John Stultz
@ 2012-07-02  4:12 ` John Stultz
  2012-07-02 13:53   ` Prarit Bhargava
  2012-07-02 18:51   ` Dave Jones
  5 siblings, 2 replies; 12+ messages in thread
From: John Stultz @ 2012-07-02  4:12 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Prarit Bhargava, stable, Thomas Gleixner

On 07/01/2012 11:29 AM, John Stultz wrote:
> TODOs:
> * Chase down the futex/hrtimer interaction to see if this could
> be triggered in any other way.
> * Get Tglx's input/ack
> * Generate a backport for pre-v3.4 kernels
So while still waiting for feedback on the clock_was_set() change, I 
went ahead and generated backports for most of the stable kernels on 
kernel.org.

Clearly these shouldn't go anywhere until the fix is upstream, but since 
I assume there's a number of distro developers who are likely under 
pressure to have a fix soon, I wanted to make them available so no one 
is duplicating work.

You can find them here:
http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=summary

I did boot and test each of those kernels with my leaptest-timer.c test 
successfully.

thanks
-john



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic
  2012-07-01 18:30 ` [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic John Stultz
@ 2012-07-02  8:53   ` Thomas Gleixner
  2012-07-02 22:11     ` John Stultz
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2012-07-02  8:53 UTC (permalink / raw)
  To: John Stultz; +Cc: Linux Kernel, Prarit Bhargava, stable

On Sun, 1 Jul 2012, John Stultz wrote:

> NOTE:This is a prerequisite patch that's required to
> address the widely observed leap-second related futex/hrtimer
> issues.
> 
> Currently clock_was_set() is unsafe to be called from atomic
> context, as it calls on_each_cpu(). This causes problems when
> we need to adjust the time from update_wall_time().
> 
> To fix this, introduce a work_struct so if we're in_atomic,
> we can schedule work to do the necessary update after we're
> out of the atomic section.

Shouldn't we queue a timer_list timer with expiry time jiffies + 0
instead? We can call on_each_cpu() from softirq context. And that
ensures that the update happens right away, while a scheduled work
might be delayed arbitrarily long.
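
For readers following along, the deferral pattern under discussion can
be modelled in user space roughly as below. This is only a sketch with
made-up names: none of the model_* symbols are kernel APIs, and it does
not capture the timing difference between a workqueue item and a timer,
only the "run now or mark pending" shape of the patched clock_was_set().

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs */
static bool model_in_atomic;	/* stands in for in_atomic() */
static bool update_pending;	/* stands in for the queued work or timer */
static int updates_done;	/* counts how often the real update ran */

/* Stands in for the part that must not run in atomic context */
static void model_do_clock_was_set(void)
{
	updates_done++;
}

/* Mirrors the shape of the patched clock_was_set() */
static void model_clock_was_set(void)
{
	if (model_in_atomic)
		update_pending = true;
	else
		model_do_clock_was_set();
}

/* Called once the atomic section has ended */
static void model_run_deferred(void)
{
	if (update_pending) {
		update_pending = false;
		model_do_clock_was_set();
	}
}
```

The open question is how soon the equivalent of model_run_deferred()
gets to run after the atomic section ends.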

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-02  4:12 ` John Stultz
@ 2012-07-02 13:53   ` Prarit Bhargava
  2012-07-02 18:51   ` Dave Jones
  1 sibling, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2012-07-02 13:53 UTC (permalink / raw)
  To: John Stultz; +Cc: Linux Kernel, stable, Thomas Gleixner



On 07/02/2012 12:12 AM, John Stultz wrote:
> On 07/01/2012 11:29 AM, John Stultz wrote:
>> TODOs:
>> * Chase down the futex/hrtimer interaction to see if this could
>> be triggered in any other way.
>> * Get Tglx's input/ack
>> * Generate a backport for pre-v3.4 kernels
> So while still waiting for feedback on the clock_was_set() change, I went ahead
> and generated backports for most of the stable kernels on kernel.org.
> 

I've tested on a well-known enterprise distro ;), as well as its more public
variant with the latest top-of-tree kernel + this second patchset across a
fairly wide selection of systems [AMD and Intel, large and small] and don't see
any issues.

(I haven't taken tglx's comments into account yet though ... this is just an
indication that the direction of the patch seems correct)

P.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-02  4:12 ` John Stultz
  2012-07-02 13:53   ` Prarit Bhargava
@ 2012-07-02 18:51   ` Dave Jones
  2012-07-02 19:08     ` John Stultz
  1 sibling, 1 reply; 12+ messages in thread
From: Dave Jones @ 2012-07-02 18:51 UTC (permalink / raw)
  To: John Stultz; +Cc: Linux Kernel, Prarit Bhargava, stable, Thomas Gleixner

On Sun, Jul 01, 2012 at 09:12:59PM -0700, John Stultz wrote:
 > On 07/01/2012 11:29 AM, John Stultz wrote:
 > > TODOs:
 > > * Chase down the futex/hrtimer interaction to see if this could
 > > be triggered in any other way.
 > > * Get Tglx's input/ack
 > > * Generate a backport for pre-v3.4 kernels
 > So while still waiting for feedback on the clock_was_set() change, I 
 > went ahead and generated backports for most of the stable kernels on 
 > kernel.org.
 > 
 > Clearly these shouldn't go anywhere until the fix is upstream, but since 
 > I assume there's a number of distro developers who are likely under 
 > pressure to have a fix soon, I wanted to make them available so no one 
 > is duplicating work.
 > 
 > You can find them here:
 > http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=summary
 > 
 > I did boot and test each of those kernels with my leaptest-timer.c test 
 > successfully.

I'm curious how the test that I did with the kernel patch,
or Richard Cochran's userspace test program didn't trigger this bug
when we tested last week.

Any ideas what we missed?

	Dave
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2][RFC] Potential fix for leapsecond caused futex issue (v2)
  2012-07-02 18:51   ` Dave Jones
@ 2012-07-02 19:08     ` John Stultz
  0 siblings, 0 replies; 12+ messages in thread
From: John Stultz @ 2012-07-02 19:08 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, Prarit Bhargava, stable, Thomas Gleixner

On 07/02/2012 11:51 AM, Dave Jones wrote:
> On Sun, Jul 01, 2012 at 09:12:59PM -0700, John Stultz wrote:
>   > On 07/01/2012 11:29 AM, John Stultz wrote:
>   > > TODOs:
>   > > * Chase down the futex/hrtimer interaction to see if this could
>   > > be triggered in any other way.
>   > > * Get Tglx's input/ack
>   > > * Generate a backport for pre-v3.4 kernels
>   > So while still waiting for feedback on the clock_was_set() change, I
>   > went ahead and generated backports for most of the stable kernels on
>   > kernel.org.
>   >
>   > Clearly these shouldn't go anywhere until the fix is upstream, but since
>   > I assume there's a number of distro developers who are likely under
>   > pressure to have a fix soon, I wanted to make them available so no one
>   > is duplicating work.
>   >
>   > You can find them here:
>   > http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=summary
>   >
>   > I did boot and test each of those kernels with my leaptest-timer.c test
>   > successfully.
>
> I'm curious how the test that I did with the kernel patch,
> or Richard Cochran's userspace test program didn't trigger this bug
> when we tested last week.

It likely did trigger the issue.

> any ideas what we missed ?

In order to observe this issue, you need to notice that CLOCK_REALTIME 
timers are firing one second early.  The issue does not affect 
CLOCK_MONOTONIC timers.  It's most visible with applications that 
make sub-second CLOCK_REALTIME timeouts in a loop (most reported cases 
are connected with futexes). So if such an application wasn't running, it 
would be easy to overlook.
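
Since CLOCK_MONOTONIC timers are not affected, an application that only
needs a relative sub-second delay could sleep against that clock
instead. A minimal sketch, assuming a glibc that provides
clock_nanosleep(); the helper names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Return ts advanced by ns nanoseconds (ns >= 0), normalized */
static struct timespec timespec_add_ns(struct timespec ts, long long ns)
{
	ts.tv_sec += ns / NSEC_PER_SEC;
	ts.tv_nsec += ns % NSEC_PER_SEC;
	if (ts.tv_nsec >= NSEC_PER_SEC) {
		ts.tv_nsec -= NSEC_PER_SEC;
		ts.tv_sec++;
	}
	return ts;
}

/* Sleep for ns nanoseconds against CLOCK_MONOTONIC; returns 0 or an errno value */
static int monotonic_sleep_ns(long long ns)
{
	struct timespec deadline;
	int ret;

	if (clock_gettime(CLOCK_MONOTONIC, &deadline))
		return errno;
	deadline = timespec_add_ns(deadline, ns);

	/* Retry on EINTR; the absolute deadline does not move */
	do {
		ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				      &deadline, NULL);
	} while (ret == EINTR);
	return ret;
}
```

This does not help code that genuinely needs wall-clock deadlines, so
it is a way to sidestep the symptom, not a substitute for the fix.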

thanks
-john


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] [RFC] Fix clock_was_set so it is safe to call from atomic
  2012-07-02  8:53   ` Thomas Gleixner
@ 2012-07-02 22:11     ` John Stultz
  0 siblings, 0 replies; 12+ messages in thread
From: John Stultz @ 2012-07-02 22:11 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Linux Kernel, Prarit Bhargava, stable

On 07/02/2012 01:53 AM, Thomas Gleixner wrote:
> On Sun, 1 Jul 2012, John Stultz wrote:
>
>> NOTE:This is a prerequisite patch that's required to
>> address the widely observed leap-second related futex/hrtimer
>> issues.
>>
>> Currently clock_was_set() is unsafe to be called from atomic
>> context, as it calls on_each_cpu(). This causes problems when
>> we need to adjust the time from update_wall_time().
>>
>> To fix this, introduce a work_struct so if we're in_atomic,
>> we can schedule work to do the necessary update after we're
>> out of the atomic section.
> Shouldn't we queue a timer_list timer with expiry time jiffies + 0
> instead. We can call on_each_cpu() from softirq context. And that
> ensures that the update happens right away, while a scheduled work
> might be delayed arbitrary long.
Thanks for the feedback.
I've implemented this, but before I send it out, I'm trying to see if 
there's a way to change hrtimers so they don't keep their own per-cpu 
sense of time.  If I don't sort that out shortly, I'll go ahead and send 
your suggestion out for inclusion so the fix is committed and I can try 
to further improve it afterwards.

thanks
-john


^ permalink raw reply	[flat|nested] 12+ messages in thread
