linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] timer: Improve itimers scalability
@ 2015-08-26  3:17 Jason Low
  2015-08-26  3:17 ` [PATCH 1/3] timer: Optimize fastpath_timer_check() Jason Low
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Jason Low @ 2015-08-26  3:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney
  Cc: linux-kernel, Frederic Weisbecker, Linus Torvalds,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton, Jason Low

When running a database workload on a 16 socket machine, there were
scalability issues related to itimers.

Commit 1018016c706f addressed the issue with the thread_group_cputimer
spinlock taking up a significant portion of total run time.

This patch series addresses the other issue where a lot of time is spent
trying to acquire the sighand lock. It was found in some cases that
200+ threads were simultaneously contending for the same sighand lock,
reducing throughput by more than 30%.

Jason Low (3):
  timer: Optimize fastpath_timer_check()
  timer: Check thread timers only when there are active thread timers
  timer: Reduce unnecessary sighand lock contention

 include/linux/init_task.h      |    1 +
 include/linux/sched.h          |    3 ++
 kernel/time/posix-cpu-timers.c |   44 +++++++++++++++++++++++++++------------
 3 files changed, 34 insertions(+), 14 deletions(-)

-- 
1.7.2.5


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/3] timer: Optimize fastpath_timer_check()
  2015-08-26  3:17 [PATCH 0/3] timer: Improve itimers scalability Jason Low
@ 2015-08-26  3:17 ` Jason Low
  2015-08-26 21:57   ` Frederic Weisbecker
  2015-08-31 15:15   ` Davidlohr Bueso
  2015-08-26  3:17 ` [PATCH 2/3] timer: Check thread timers only when there are active thread timers Jason Low
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 34+ messages in thread
From: Jason Low @ 2015-08-26  3:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney
  Cc: linux-kernel, Frederic Weisbecker, Linus Torvalds,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton, Jason Low

In fastpath_timer_check(), the task_cputime() function is always
called to compute the utime and stime values. However, this is not
necessary if there are no per-thread timers to check for. This patch
modifies the code such that we compute the task_cputime values only
when there are per-thread timers set.

Signed-off-by: Jason Low <jason.low2@hp.com>
---
 kernel/time/posix-cpu-timers.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 892e3da..02596ff 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1117,16 +1117,15 @@ static inline int task_cputime_expired(const struct task_cputime *sample,
 static inline int fastpath_timer_check(struct task_struct *tsk)
 {
 	struct signal_struct *sig;
-	cputime_t utime, stime;
-
-	task_cputime(tsk, &utime, &stime);
 
 	if (!task_cputime_zero(&tsk->cputime_expires)) {
-		struct task_cputime task_sample = {
-			.utime = utime,
-			.stime = stime,
-			.sum_exec_runtime = tsk->se.sum_exec_runtime
-		};
+		struct task_cputime task_sample;
+		cputime_t utime, stime;
+
+		task_cputime(tsk, &utime, &stime);
+		task_sample.utime = utime;
+		task_sample.stime = stime;
+		task_sample.sum_exec_runtime = tsk->se.sum_exec_runtime;
 
 		if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
 			return 1;
-- 
1.7.2.5



* [PATCH 2/3] timer: Check thread timers only when there are active thread timers
  2015-08-26  3:17 [PATCH 0/3] timer: Improve itimers scalability Jason Low
  2015-08-26  3:17 ` [PATCH 1/3] timer: Optimize fastpath_timer_check() Jason Low
@ 2015-08-26  3:17 ` Jason Low
  2015-08-26  3:17 ` [PATCH 3/3] timer: Reduce unnecessary sighand lock contention Jason Low
  2015-08-26  3:27 ` [PATCH 0/3] timer: Improve itimers scalability Andrew Morton
  3 siblings, 0 replies; 34+ messages in thread
From: Jason Low @ 2015-08-26  3:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney
  Cc: linux-kernel, Frederic Weisbecker, Linus Torvalds,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton, Jason Low

fastpath_timer_check() uses !task_cputime_zero() to check whether any
per-thread timers are set. We can apply the same check before calling
check_thread_timers(), so that when only process-wide timers are set,
all of the per-thread timer computations are skipped.

Signed-off-by: Jason Low <jason.low2@hp.com>
---
 kernel/time/posix-cpu-timers.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 02596ff..535bef5 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1168,11 +1168,13 @@ void run_posix_cpu_timers(struct task_struct *tsk)
 	if (!lock_task_sighand(tsk, &flags))
 		return;
 	/*
-	 * Here we take off tsk->signal->cpu_timers[N] and
-	 * tsk->cpu_timers[N] all the timers that are firing, and
-	 * put them on the firing list.
+	 * If there are active per-thread timers, take off
+	 * tsk->signal->cpu_timers[N] and tsk->cpu_timers[N] all the
+	 * timers that are firing, and put them on the firing list.
 	 */
-	check_thread_timers(tsk, &firing);
+	if (!task_cputime_zero(&tsk->cputime_expires))
+		check_thread_timers(tsk, &firing);
+
 	/*
 	 * If there are any active process wide timers (POSIX 1.b, itimers,
 	 * RLIMIT_CPU) cputimer must be running.
-- 
1.7.2.5



* [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26  3:17 [PATCH 0/3] timer: Improve itimers scalability Jason Low
  2015-08-26  3:17 ` [PATCH 1/3] timer: Optimize fastpath_timer_check() Jason Low
  2015-08-26  3:17 ` [PATCH 2/3] timer: Check thread timers only when there are active thread timers Jason Low
@ 2015-08-26  3:17 ` Jason Low
  2015-08-26 17:53   ` Linus Torvalds
  2015-08-26 22:56   ` Frederic Weisbecker
  2015-08-26  3:27 ` [PATCH 0/3] timer: Improve itimers scalability Andrew Morton
  3 siblings, 2 replies; 34+ messages in thread
From: Jason Low @ 2015-08-26  3:17 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney
  Cc: linux-kernel, Frederic Weisbecker, Linus Torvalds,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton, Jason Low

It was found while running a database workload on large systems that
significant time was spent trying to acquire the sighand lock.

The issue was that whenever an itimer expired, many threads ended up
simultaneously trying to send the signal. Most of the time, nothing
happened after acquiring the sighand lock because another thread
had already sent the signal and updated the "next expire" time. The
fastpath_timer_check() didn't help much since the "next expire" time
was updated later.
 
This patch addresses this by having the thread_group_cputimer structure
maintain a boolean to signify when a thread in the group is already
checking for process wide timers, and adds extra logic in the fastpath
to check the boolean.

Signed-off-by: Jason Low <jason.low2@hp.com>
---
 include/linux/init_task.h      |    1 +
 include/linux/sched.h          |    3 +++
 kernel/time/posix-cpu-timers.c |   19 +++++++++++++++++--
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index d0b380e..3350c77 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -53,6 +53,7 @@ extern struct fs_struct init_fs;
 	.cputimer	= { 						\
 		.cputime_atomic	= INIT_CPUTIME_ATOMIC,			\
 		.running	= 0,					\
+		.checking_timer	= 0,					\
 	},								\
 	INIT_PREV_CPUTIME(sig)						\
 	.cred_guard_mutex =						\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 119823d..a6c8334 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -619,6 +619,8 @@ struct task_cputime_atomic {
  * @cputime_atomic:	atomic thread group interval timers.
  * @running:		non-zero when there are timers running and
  * 			@cputime receives updates.
+ * @checking_timer:	non-zero when a thread is in the process of
+ *			checking for thread group timers.
  *
  * This structure contains the version of task_cputime, above, that is
  * used for thread group CPU timer calculations.
@@ -626,6 +628,7 @@ struct task_cputime_atomic {
 struct thread_group_cputimer {
 	struct task_cputime_atomic cputime_atomic;
 	int running;
+	int checking_timer;
 };
 
 #include <linux/rwsem.h>
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 535bef5..f3ddf0e 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -962,6 +962,14 @@ static void check_process_timers(struct task_struct *tsk,
 	unsigned long soft;
 
 	/*
+	 * Signify that a thread is checking for process timers.
+	 * The checking_timer field is only modified in this function,
+	 * which is called with the sighand lock held. Thus, we can
+	 * just use WRITE_ONCE() without any further locking.
+	 */
+	WRITE_ONCE(sig->cputimer.checking_timer, 1);
+
+	/*
 	 * Collect the current process totals.
 	 */
 	thread_group_cputimer(tsk, &cputime);
@@ -1015,6 +1023,8 @@ static void check_process_timers(struct task_struct *tsk,
 	sig->cputime_expires.sched_exp = sched_expires;
 	if (task_cputime_zero(&sig->cputime_expires))
 		stop_process_timers(sig);
+
+	WRITE_ONCE(sig->cputimer.checking_timer, 0);
 }
 
 /*
@@ -1132,8 +1142,13 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 	}
 
 	sig = tsk->signal;
-	/* Check if cputimer is running. This is accessed without locking. */
-	if (READ_ONCE(sig->cputimer.running)) {
+	/*
+	 * Check for expired thread group timers if the cputimer is
+	 * running and no other thread in the group is already checking
+	 * for thread group cputimers.
+	 */
+	if (READ_ONCE(sig->cputimer.running) &&
+	    !READ_ONCE(sig->cputimer.checking_timer)) {
 		struct task_cputime group_sample;
 
 		sample_cputime_atomic(&group_sample, &sig->cputimer.cputime_atomic);
-- 
1.7.2.5



* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26  3:17 [PATCH 0/3] timer: Improve itimers scalability Jason Low
                   ` (2 preceding siblings ...)
  2015-08-26  3:17 ` [PATCH 3/3] timer: Reduce unnecessary sighand lock contention Jason Low
@ 2015-08-26  3:27 ` Andrew Morton
  2015-08-26 16:33   ` Jason Low
  3 siblings, 1 reply; 34+ messages in thread
From: Andrew Morton @ 2015-08-26  3:27 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Frederic Weisbecker,
	Linus Torvalds, Davidlohr Bueso, Steven Rostedt, Terry Rudd,
	Rik van Riel, Scott J Norton

On Tue, 25 Aug 2015 20:17:45 -0700 Jason Low <jason.low2@hp.com> wrote:

> When running a database workload on a 16 socket machine, there were
> scalability issues related to itimers.
> 
> Commit 1018016c706f addressed the issue with the thread_group_cputimer
> spinlock taking up a significant portion of total run time.
> 
> This patch series addresses the other issue where a lot of time is spent
> trying to acquire the sighand lock. It was found in some cases that
> 200+ threads were simultaneously contending for the same sighand lock,
> reducing throughput by more than 30%.

Does this imply that the patchset increased the throughput of this
workload by 30%?

And is this test case realistic?  If not, what are the benefits on a
real-world workload?



* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26  3:27 ` [PATCH 0/3] timer: Improve itimers scalability Andrew Morton
@ 2015-08-26 16:33   ` Jason Low
  2015-08-26 17:08     ` Oleg Nesterov
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Low @ 2015-08-26 16:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Frederic Weisbecker,
	Linus Torvalds, Davidlohr Bueso, Steven Rostedt, Terry Rudd,
	Rik van Riel, Scott J Norton, jason.low2

Hi Andrew,

On Tue, 2015-08-25 at 20:27 -0700, Andrew Morton wrote:
> On Tue, 25 Aug 2015 20:17:45 -0700 Jason Low <jason.low2@hp.com> wrote:
> 
> > When running a database workload on a 16 socket machine, there were
> > scalability issues related to itimers.
> > 
> > Commit 1018016c706f addressed the issue with the thread_group_cputimer
> > spinlock taking up a significant portion of total run time.
> > 
> > This patch series addresses the other issue where a lot of time is spent
> > trying to acquire the sighand lock. It was found in some cases that
> > 200+ threads were simultaneously contending for the same sighand lock,
> > reducing throughput by more than 30%.
> 
> Does this imply that the patchset increased the throughput of this
> workload by 30%?
> 
> And is this test case realistic?  If not, what are the benefits on a
> real-world workload?

Yes, the test case with the database workload is realistic. We did write
a simple micro-benchmark that just generates the contention in this code
path to quickly test experimental patches, since the database takes
longer to set up and run. However, the performance issues and numbers
mentioned here are for the database workload.

These patches should also be beneficial for other multi-threaded
applications which use process-wide timers, particularly on systems with
many cores.



* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26 16:33   ` Jason Low
@ 2015-08-26 17:08     ` Oleg Nesterov
  2015-08-26 22:07       ` Jason Low
  0 siblings, 1 reply; 34+ messages in thread
From: Oleg Nesterov @ 2015-08-26 17:08 UTC (permalink / raw)
  To: Jason Low
  Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Paul E. McKenney, linux-kernel, Frederic Weisbecker,
	Linus Torvalds, Davidlohr Bueso, Steven Rostedt, Terry Rudd,
	Rik van Riel, Scott J Norton

On 08/26, Jason Low wrote:
>
> Hi Andrew,
>
> On Tue, 2015-08-25 at 20:27 -0700, Andrew Morton wrote:
> > On Tue, 25 Aug 2015 20:17:45 -0700 Jason Low <jason.low2@hp.com> wrote:
> >
> > > When running a database workload on a 16 socket machine, there were
> > > scalability issues related to itimers.
> > >
> > > Commit 1018016c706f addressed the issue with the thread_group_cputimer
> > > spinlock taking up a significant portion of total run time.
> > >
> > > This patch series addresses the other issue where a lot of time is spent
> > > trying to acquire the sighand lock. It was found in some cases that
> > > 200+ threads were simultaneously contending for the same sighand lock,
> > > reducing throughput by more than 30%.
> >
> > Does this imply that the patchset increased the throughput of this
> > workload by 30%?
> >
> > And is this test case realistic?  If not, what are the benefits on a
> > real-world workload?
>
> Yes, the test case with the database workload is realistic.

Can't resist, sorry... to me the very idea of using process-wide posix-
cpu-timers in a performance-critical application doesn't look realistic ;)

However, I think the patches are fine.


Reviewed-by: Oleg Nesterov <oleg@redhat.com>



* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26  3:17 ` [PATCH 3/3] timer: Reduce unnecessary sighand lock contention Jason Low
@ 2015-08-26 17:53   ` Linus Torvalds
  2015-08-26 22:31     ` Frederic Weisbecker
  2015-08-26 22:56   ` Frederic Weisbecker
  1 sibling, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2015-08-26 17:53 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, Linux Kernel Mailing List, Frederic Weisbecker,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton

On Tue, Aug 25, 2015 at 8:17 PM, Jason Low <jason.low2@hp.com> wrote:
>
> This patch addresses this by having the thread_group_cputimer structure
> maintain a boolean to signify when a thread in the group is already
> checking for process wide timers, and adds extra logic in the fastpath
> to check the boolean.

It is not at all obvious why the unlocked read of that variable is
safe, and why there is no race with another thread just about to end
its check_process_timers().

I can well imagine that this is all perfectly safe and fine, but I'd
really like to see that comment about _why_ that's the case, and why a
completely unlocked access without even memory barriers is fine.

                         Linus


* Re: [PATCH 1/3] timer: Optimize fastpath_timer_check()
  2015-08-26  3:17 ` [PATCH 1/3] timer: Optimize fastpath_timer_check() Jason Low
@ 2015-08-26 21:57   ` Frederic Weisbecker
  2015-08-31 15:15   ` Davidlohr Bueso
  1 sibling, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-26 21:57 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton

On Tue, Aug 25, 2015 at 08:17:46PM -0700, Jason Low wrote:
> In fastpath_timer_check(), the task_cputime() function is always
> called to compute the utime and stime values. However, this is not
> necessary if there are no per-thread timers to check for. This patch
> modifies the code such that we compute the task_cputime values only
> when there are per-thread timers set.
> 
> Signed-off-by: Jason Low <jason.low2@hp.com>

Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>

Thanks.

> ---
>  kernel/time/posix-cpu-timers.c |   15 +++++++--------
>  1 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
> index 892e3da..02596ff 100644
> --- a/kernel/time/posix-cpu-timers.c
> +++ b/kernel/time/posix-cpu-timers.c
> @@ -1117,16 +1117,15 @@ static inline int task_cputime_expired(const struct task_cputime *sample,
>  static inline int fastpath_timer_check(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig;
> -	cputime_t utime, stime;
> -
> -	task_cputime(tsk, &utime, &stime);
>  
>  	if (!task_cputime_zero(&tsk->cputime_expires)) {
> -		struct task_cputime task_sample = {
> -			.utime = utime,
> -			.stime = stime,
> -			.sum_exec_runtime = tsk->se.sum_exec_runtime
> -		};
> +		struct task_cputime task_sample;
> +		cputime_t utime, stime;
> +
> +		task_cputime(tsk, &utime, &stime);
> +		task_sample.utime = utime;
> +		task_sample.stime = stime;
> +		task_sample.sum_exec_runtime = tsk->se.sum_exec_runtime;
>  
>  		if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
>  			return 1;
> -- 
> 1.7.2.5
> 


* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26 17:08     ` Oleg Nesterov
@ 2015-08-26 22:07       ` Jason Low
  2015-08-26 22:53         ` Hideaki Kimura
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Low @ 2015-08-26 22:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Paul E. McKenney, linux-kernel, Frederic Weisbecker,
	Linus Torvalds, Steven Rostedt, Rik van Riel, Scott J Norton,
	jason.low2, hideaki.kimura

On Wed, 2015-08-26 at 19:08 +0200, Oleg Nesterov wrote:
> On 08/26, Jason Low wrote:
> >
> > Hi Andrew,
> >
> > On Tue, 2015-08-25 at 20:27 -0700, Andrew Morton wrote:
> > > On Tue, 25 Aug 2015 20:17:45 -0700 Jason Low <jason.low2@hp.com> wrote:
> > >
> > > > When running a database workload on a 16 socket machine, there were
> > > > scalability issues related to itimers.
> > > >
> > > > Commit 1018016c706f addressed the issue with the thread_group_cputimer
> > > > spinlock taking up a significant portion of total run time.
> > > >
> > > > This patch series addresses the other issue where a lot of time is spent
> > > > trying to acquire the sighand lock. It was found in some cases that
> > > > 200+ threads were simultaneously contending for the same sighand lock,
> > > > reducing throughput by more than 30%.
> > >
> > > Does this imply that the patchset increased the throughput of this
> > > workload by 30%?
> > >
> > > And is this test case realistic?  If not, what are the benefits on a
> > > real-world workload?
> >
> > Yes, the test case with the database workload is realistic.
> 
> Can't resists, sorry... to me the very idea to use the process wide posix-
> cpu-timers on performance critical application doesn't look realistic ;)

I will let Hideaki elaborate more regarding the issue at the application
level.

> However, I thinks the patches are fine.
> 
> 
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>

Thanks for reviewing!



* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 17:53   ` Linus Torvalds
@ 2015-08-26 22:31     ` Frederic Weisbecker
  2015-08-26 22:57       ` Jason Low
  0 siblings, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-26 22:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jason Low, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Oleg Nesterov, Paul E. McKenney, Linux Kernel Mailing List,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton

On Wed, Aug 26, 2015 at 10:53:35AM -0700, Linus Torvalds wrote:
> On Tue, Aug 25, 2015 at 8:17 PM, Jason Low <jason.low2@hp.com> wrote:
> >
> > This patch addresses this by having the thread_group_cputimer structure
> > maintain a boolean to signify when a thread in the group is already
> > checking for process wide timers, and adds extra logic in the fastpath
> > to check the boolean.
> 
> It is not at all obvious why the unlocked read of that variable is
> safe, and why there is no race with another thread just about to end
> its check_process_timers().

The risk is when the next timer is going to expire soon after we relaxed
the "checking" variable due to a recent expiration. The thread which
expires the next timer may still see a stale value on the "checking"
state and therefore delay the timer firing until the new value is seen.
So the worst that can happen is that the timer firing gets delayed for
X jiffies (I guess in practice it's only 1 jiffy).

That said, posix cpu timers already suffer such race because
sig->cputimer.running itself is checked outside the sighand lock anyway.

> I can well imagine that this is all perfectly safe and fine, but I'd
> really like to see that comment about _why_ that's the case, and why a
> completely unlocked access without even memory barriers is fine.

Agreed, there should be a comment about that in the code (that is already full
of undocumented subtleties).


* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26 22:07       ` Jason Low
@ 2015-08-26 22:53         ` Hideaki Kimura
  2015-08-26 23:13           ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Hideaki Kimura @ 2015-08-26 22:53 UTC (permalink / raw)
  To: Jason Low, Oleg Nesterov, Andrew Morton
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Paul E. McKenney,
	linux-kernel, Frederic Weisbecker, Linus Torvalds,
	Steven Rostedt, Rik van Riel, Scott J Norton

Sure, let me elaborate.

Executive summary:
  Yes, enabling a process-wide timer on such a large machine is not
wise, but sometimes users/applications cannot avoid it.


The issue was observed actually not in a database itself but in a common 
library it links to; gperftools.

The database itself is optimized for many-cores/sockets, so surely it 
avoids putting a process-wide timer or other unscalable things. It just 
links to libprofiler for an optional feature to profile performance 
bottleneck only when the user turns it on. We of course avoid turning
the feature on unless we are debugging or tuning the database.

However, libprofiler sets the timer even when the client program doesn't 
invoke any of its functions: libprofiler does it when the shared library 
is loaded. We requested the developer of libprofiler to change the 
behavior, but it seems there is a reason to keep that behavior:
   https://code.google.com/p/gperftools/issues/detail?id=133

Based on this, I think there are two reasons why we should ameliorate
this issue in the kernel layer.


1. In the particular case, it's hard to prevent or even detect the issue 
in user space.

We (a team of low-level database and kernel experts) in fact spent a huge
amount of time just figuring out what the bottleneck was, because
nothing measurable happens in user space. I pulled out countless hairs.

Also, the user has to de-link the library from the application to 
prevent the itimer installation. Imagine a case where the software is 
proprietary. It won't fly.


2. This is just one example. There could be many other such 
binaries/libraries that do similar things somewhere in a complex 
software stack.

Today we haven't heard of many such cases, but people will start hitting 
it once 100s~1,000s of cores become common.


After applying this patchset, we have observed that the performance hit 
almost completely went away at least for 240 cores. So, it's quite 
beneficial in real world.

-- 
Hideaki Kimura


* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26  3:17 ` [PATCH 3/3] timer: Reduce unnecessary sighand lock contention Jason Low
  2015-08-26 17:53   ` Linus Torvalds
@ 2015-08-26 22:56   ` Frederic Weisbecker
  2015-08-26 23:32     ` Jason Low
  1 sibling, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-26 22:56 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton

On Tue, Aug 25, 2015 at 08:17:48PM -0700, Jason Low wrote:
> It was found while running a database workload on large systems that
> significant time was spent trying to acquire the sighand lock.
> 
> The issue was that whenever an itimer expired, many threads ended up
> simultaneously trying to send the signal. Most of the time, nothing
> happened after acquiring the sighand lock because another thread
> had already sent the signal and updated the "next expire" time. The
> fastpath_timer_check() didn't help much since the "next expire" time
> was updated later.
>  
> This patch addresses this by having the thread_group_cputimer structure
> maintain a boolean to signify when a thread in the group is already
> checking for process wide timers, and adds extra logic in the fastpath
> to check the boolean.
> 
> Signed-off-by: Jason Low <jason.low2@hp.com>
> ---
>  include/linux/init_task.h      |    1 +
>  include/linux/sched.h          |    3 +++
>  kernel/time/posix-cpu-timers.c |   19 +++++++++++++++++--
>  3 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> index d0b380e..3350c77 100644
> --- a/include/linux/init_task.h
> +++ b/include/linux/init_task.h
> @@ -53,6 +53,7 @@ extern struct fs_struct init_fs;
>  	.cputimer	= { 						\
>  		.cputime_atomic	= INIT_CPUTIME_ATOMIC,			\
>  		.running	= 0,					\
> +		.checking_timer	= 0,					\
>  	},								\
>  	INIT_PREV_CPUTIME(sig)						\
>  	.cred_guard_mutex =						\
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 119823d..a6c8334 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -619,6 +619,8 @@ struct task_cputime_atomic {
>   * @cputime_atomic:	atomic thread group interval timers.
>   * @running:		non-zero when there are timers running and
>   * 			@cputime receives updates.
> + * @checking_timer:	non-zero when a thread is in the process of
> + *			checking for thread group timers.
>   *
>   * This structure contains the version of task_cputime, above, that is
>   * used for thread group CPU timer calculations.
> @@ -626,6 +628,7 @@ struct task_cputime_atomic {
>  struct thread_group_cputimer {
>  	struct task_cputime_atomic cputime_atomic;
>  	int running;
> +	int checking_timer;

How about a flag in the "running" field instead?

1) Space in signal_struct is not as important as in task_struct but it
   still matters.

2) We already read the "running" field locklessly. Adding a new field like
   checking_timer gets even more complicated. Ideally there should be at
   least a paired memory barrier between both. Let's just simplify that
   with a single field.

Now concerning the solution for your problem, I'm a bit uncomfortable with
lockless magic like this. When the thread sets checking_timer to 1, there is
no guarantee that the other threads in the process will see it "fast" enough
to avoid the slow path checks. Then there is also the risk that the threads
don't see "fast" enough that checking_timer has toggled to 0, and as a result
a timer may expire late. Now the lockless access of "running" already induces
such a race. So if it really solves issues in practice, why not.


* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 22:31     ` Frederic Weisbecker
@ 2015-08-26 22:57       ` Jason Low
  0 siblings, 0 replies; 34+ messages in thread
From: Jason Low @ 2015-08-26 22:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Linus Torvalds, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Oleg Nesterov, Paul E. McKenney, Linux Kernel Mailing List,
	Davidlohr Bueso, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton, jason.low2

On Thu, 2015-08-27 at 00:31 +0200, Frederic Weisbecker wrote:
> On Wed, Aug 26, 2015 at 10:53:35AM -0700, Linus Torvalds wrote:
> > On Tue, Aug 25, 2015 at 8:17 PM, Jason Low <jason.low2@hp.com> wrote:
> > >
> > > This patch addresses this by having the thread_group_cputimer structure
> > > maintain a boolean to signify when a thread in the group is already
> > > checking for process wide timers, and adds extra logic in the fastpath
> > > to check the boolean.
> > 
> > It is not at all obvious why the unlocked read of that variable is
> > safe, and why there is no race with another thread just about to end
> > its check_process_timers().
> 
> The risk is when a next timer is going to expire soon after we relaxed
> the "checking" variable due to a recent expiration. The thread which
> expires the next timer may still see a stale value on the "checking"
> state and therefore delay the timer firing until the new value is seen.
> So the worst that can happen is that the timer firing gets delayed for
> X jiffies (I guess in practice it's only 1 jiffy).
> 
> That said, posix cpu timers already suffer such race because
> sig->cputimer.running itself is checked outside the sighand lock anyway.

Right, in the worst case, the next thread that comes along will handle
it. The current code already expects that level of granularity, so this
shouldn't change anything much. For example, there is almost always a
delay between a timer expiring and when a thread actually calls into
check_process_timers().

> > I can well imagine that this is all perfectly safe and fine, but I'd
> > really like to see that comment about _why_ that's the case, and why a
> > completely unlocked access without even memory barriers is fine.
> 
> Agreed, there should be a comment about that in the code (that is already full
> of undocumented subtleties).

Okay, I will add more comments about this.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26 22:53         ` Hideaki Kimura
@ 2015-08-26 23:13           ` Frederic Weisbecker
  2015-08-26 23:45             ` Hideaki Kimura
  0 siblings, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-26 23:13 UTC (permalink / raw)
  To: Hideaki Kimura
  Cc: Jason Low, Oleg Nesterov, Andrew Morton, Peter Zijlstra,
	Ingo Molnar, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	Linus Torvalds, Steven Rostedt, Rik van Riel, Scott J Norton

On Wed, Aug 26, 2015 at 03:53:26PM -0700, Hideaki Kimura wrote:
> Sure, let me elaborate.
> 
> Executive summary:
>  Yes, enabling a process-wide timer in such a large machine is not wise, but
> sometimes users/applications cannot avoid it.
> 
> 
> The issue was observed actually not in a database itself but in a common
> library it links to: gperftools.
> 
> The database itself is optimized for many-cores/sockets, so surely it avoids
> putting a process-wide timer or other unscalable things. It just links to
> libprofiler for an optional feature to profile performance bottleneck only
> when the user turns it on. We of course avoid turning the feature on except
> while we debug/tune the database.
> 
> However, libprofiler sets the timer even when the client program doesn't
> invoke any of its functions: libprofiler does it when the shared library is
> loaded. We requested the developer of libprofiler to change the behavior,
> but it seems like there is a reason to keep that behavior:
>   https://code.google.com/p/gperftools/issues/detail?id=133
> 
> Based on this, I think there are two reasons why we should ameliorate this
> issue in kernel layer.
> 
> 
> 1. In the particular case, it's hard to prevent or even detect the issue in
> user space.
> 
> We (a team of low-level database and kernel experts) in fact spent a huge
> amount of time just figuring out what the bottleneck was, because
> nothing measurable happens in user space. I pulled out countless hairs.
> 
> Also, the user has to de-link the library from the application to prevent
> the itimer installation. Imagine a case where the software is proprietary.
> It won't fly.
> 
> 
> 2. This is just one example. There could be many other such
> binaries/libraries that do similar things somewhere in a complex software
> stack.
> 
> Today we haven't heard of many such cases, but people will start hitting it
> once 100s~1,000s of cores become common.
> 
> 
> After applying this patchset, we have observed that the performance hit
> almost completely went away at least for 240 cores. So, it's quite
> beneficial in the real world.

I can easily imagine that a lot of code incidentally uses posix cpu timers
when it's not strictly required. But it doesn't look right to fix the kernel
for that, for this simple reason: posix cpu timers, even after your fix,
still introduce noticeable overhead. All threads of a process with a timer
enqueued account the elapsed cputime in a shared atomic variable. Add to that
the overhead of enqueuing the timer and firing it. There are a bunch of
scalability issues there.

> 
> -- 
> Hideaki Kimura

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 22:56   ` Frederic Weisbecker
@ 2015-08-26 23:32     ` Jason Low
  2015-08-27  4:52       ` Jason Low
  2015-08-27 12:53       ` Frederic Weisbecker
  0 siblings, 2 replies; 34+ messages in thread
From: Jason Low @ 2015-08-26 23:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton, jason.low2

On Thu, 2015-08-27 at 00:56 +0200, Frederic Weisbecker wrote:
> On Tue, Aug 25, 2015 at 08:17:48PM -0700, Jason Low wrote:
> > It was found while running a database workload on large systems that
> > significant time was spent trying to acquire the sighand lock.
> > 
> > The issue was that whenever an itimer expired, many threads ended up
> > simultaneously trying to send the signal. Most of the time, nothing
> > happened after acquiring the sighand lock because another thread
> > had already sent the signal and updated the "next expire" time. The
> > fastpath_timer_check() didn't help much since the "next expire" time
> > was updated later.
> >  
> > This patch addresses this by having the thread_group_cputimer structure
> > maintain a boolean to signify when a thread in the group is already
> > checking for process wide timers, and adds extra logic in the fastpath
> > to check the boolean.
> > 
> > Signed-off-by: Jason Low <jason.low2@hp.com>
> > ---
> >  include/linux/init_task.h      |    1 +
> >  include/linux/sched.h          |    3 +++
> >  kernel/time/posix-cpu-timers.c |   19 +++++++++++++++++--
> >  3 files changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> > index d0b380e..3350c77 100644
> > --- a/include/linux/init_task.h
> > +++ b/include/linux/init_task.h
> > @@ -53,6 +53,7 @@ extern struct fs_struct init_fs;
> >  	.cputimer	= { 						\
> >  		.cputime_atomic	= INIT_CPUTIME_ATOMIC,			\
> >  		.running	= 0,					\
> > +		.checking_timer	= 0,					\
> >  	},								\
> >  	INIT_PREV_CPUTIME(sig)						\
> >  	.cred_guard_mutex =						\
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 119823d..a6c8334 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -619,6 +619,8 @@ struct task_cputime_atomic {
> >   * @cputime_atomic:	atomic thread group interval timers.
> >   * @running:		non-zero when there are timers running and
> >   * 			@cputime receives updates.
> > + * @checking_timer:	non-zero when a thread is in the process of
> > + *			checking for thread group timers.
> >   *
> >   * This structure contains the version of task_cputime, above, that is
> >   * used for thread group CPU timer calculations.
> > @@ -626,6 +628,7 @@ struct task_cputime_atomic {
> >  struct thread_group_cputimer {
> >  	struct task_cputime_atomic cputime_atomic;
> >  	int running;
> > +	int checking_timer;
> 
> How about a flag in the "running" field instead?
> 
> 1) Space in signal_struct is not as important as in task_struct but it
>    still matters.

George Spelvin suggested that we convert them to booleans which would
make them take up 2 bytes.

> 2) We already read the "running" field locklessly. Adding a new field like
>    checking_timer gets even more complicated. Ideally there should be at
>    least a paired memory barrier between both. Let's just simplify that
>    with a single field.

hmmm, so having 1 "flag" where we access bits for the "running" and
"checking_timer"?

> Now concerning the solution for your problem, I'm a bit uncomfortable with
> lockless magics like this. When the thread sets checking_timer to 1, there is
> no guarantee that the other threads in the process will see it "fast" enough
> to avoid the slow path checks. Then there is also the risk that the threads
> don't see "fast" enough that checking_timers has toggled to 0 and as a result
> a timer may expire late. Now the lockless access of "running" already induces
> such race. So if it really solves issues in practice, why not.

Perhaps to be safer, we use something like load_acquire() and
store_release() for accessing both the ->running and ->checking_timer
fields?


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26 23:13           ` Frederic Weisbecker
@ 2015-08-26 23:45             ` Hideaki Kimura
  2015-08-27 13:18               ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Hideaki Kimura @ 2015-08-26 23:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Jason Low, Oleg Nesterov, Andrew Morton, Peter Zijlstra,
	Ingo Molnar, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	Linus Torvalds, Steven Rostedt, Rik van Riel, Scott J Norton



On 08/26/2015 04:13 PM, Frederic Weisbecker wrote:
> On Wed, Aug 26, 2015 at 03:53:26PM -0700, Hideaki Kimura wrote:
>> Sure, let me elaborate.
>>
>> Executive summary:
>>   Yes, enabling a process-wide timer in such a large machine is not wise, but
>> sometimes users/applications cannot avoid it.
>>
>>
>> The issue was observed actually not in a database itself but in a common
>> library it links to: gperftools.
>>
>> The database itself is optimized for many-cores/sockets, so surely it avoids
>> putting a process-wide timer or other unscalable things. It just links to
>> libprofiler for an optional feature to profile performance bottleneck only
>> when the user turns it on. We of course avoid turning the feature on except
>> while we debug/tune the database.
>>
>> However, libprofiler sets the timer even when the client program doesn't
>> invoke any of its functions: libprofiler does it when the shared library is
>> loaded. We requested the developer of libprofiler to change the behavior,
>> but it seems like there is a reason to keep that behavior:
>>    https://code.google.com/p/gperftools/issues/detail?id=133
>>
>> Based on this, I think there are two reasons why we should ameliorate this
>> issue in kernel layer.
>>
>>
>> 1. In the particular case, it's hard to prevent or even detect the issue in
>> user space.
>>
>> We (a team of low-level database and kernel experts) in fact spent a huge
>> amount of time just figuring out what the bottleneck was, because
>> nothing measurable happens in user space. I pulled out countless hairs.
>>
>> Also, the user has to de-link the library from the application to prevent
>> the itimer installation. Imagine a case where the software is proprietary.
>> It won't fly.
>>
>>
>> 2. This is just one example. There could be many other such
>> binaries/libraries that do similar things somewhere in a complex software
>> stack.
>>
>> Today we haven't heard of many such cases, but people will start hitting it
>> once 100s~1,000s of cores become common.
>>
>>
>> After applying this patchset, we have observed that the performance hit
>> almost completely went away at least for 240 cores. So, it's quite
>> beneficial in the real world.
>
> I can easily imagine that a lot of code incidentally uses posix cpu timers
> when it's not strictly required. But it doesn't look right to fix the kernel
> for that, for this simple reason: posix cpu timers, even after your fix,
> still introduce noticeable overhead. All threads of a process with a timer
> enqueued account the elapsed cputime in a shared atomic variable. Add to that
> the overhead of enqueuing the timer and firing it. There are a bunch of
> scalability issues there.

I totally agree that this is not a perfect solution. If there are 10x 
more cores and sockets, just the atomic fetch_add might be too expensive.

However, it's comparatively/realistically the best thing we can do 
without any drawbacks. We can't magically force all library developers 
to write the most scalable code always.

My point is: this is a safety net, and a very effective one.

-- 
Hideaki Kimura

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 23:32     ` Jason Low
@ 2015-08-27  4:52       ` Jason Low
  2015-08-27 12:53       ` Frederic Weisbecker
  1 sibling, 0 replies; 34+ messages in thread
From: Jason Low @ 2015-08-27  4:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Rik van Riel, Scott J Norton,
	jason.low2

On Wed, 2015-08-26 at 16:32 -0700, Jason Low wrote:

> Perhaps to be safer, we use something like load_acquire() and
> store_release() for accessing both the ->running and ->checking_timer
> fields?

Regarding using barriers, one option could be to pair them between
sig->cputime_expires and the sig->cputimer.checking_timer accesses.

fastpath_timer_check()
{
	...

        if (READ_ONCE(sig->cputimer.running)) {
                struct task_cputime group_sample;

                sample_cputime_atomic(&group_sample, &sig->cputimer.cputime_atomic);

                if (task_cputime_expired(&group_sample, &sig->cputime_expires)) {
			/*
			 * Pairs with the mb() in check_process_timers():
			 * if we see the newly published expiry times, we
			 * must also see that checking_timer was cleared.
			 */
                        mb();

                        if (!READ_ONCE(sig->cputimer.checking_timer))
                                return 1;
                }
        }
}

check_process_timers()
{
	...

        WRITE_ONCE(sig->cputimer.checking_timer, 0);

	/*
	 * Clear checking_timer before publishing the new expiry times;
	 * pairs with the mb() in fastpath_timer_check() so that a thread
	 * which sees the new expiry times also sees checking_timer == 0.
	 */
        mb();

        sig->cputime_expires.prof_exp = expires_to_cputime(prof_expires);
        sig->cputime_expires.virt_exp = expires_to_cputime(virt_expires);
        sig->cputime_expires.sched_exp = sched_expires;

	...	
}

By the time the cputime_expires fields get updated at the end of
check_process_timers(), other threads in the fastpath_timer_check()
should observe the value 0 for READ_ONCE(sig->cputimer.checking_timer).

In the case where threads in the fastpath don't observe the
WRITE_ONCE(checking_timer, 1) early enough, that's fine, since it will
just (unnecessarily) go through the slowpath which is what we also do in
the current code.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 23:32     ` Jason Low
  2015-08-27  4:52       ` Jason Low
@ 2015-08-27 12:53       ` Frederic Weisbecker
  2015-08-27 20:29         ` Jason Low
  1 sibling, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-27 12:53 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton

On Wed, Aug 26, 2015 at 04:32:34PM -0700, Jason Low wrote:
> On Thu, 2015-08-27 at 00:56 +0200, Frederic Weisbecker wrote:
> > On Tue, Aug 25, 2015 at 08:17:48PM -0700, Jason Low wrote:
> > > It was found while running a database workload on large systems that
> > > significant time was spent trying to acquire the sighand lock.
> > > 
> > > The issue was that whenever an itimer expired, many threads ended up
> > > simultaneously trying to send the signal. Most of the time, nothing
> > > happened after acquiring the sighand lock because another thread
> > > had already sent the signal and updated the "next expire" time. The
> > > fastpath_timer_check() didn't help much since the "next expire" time
> > > was updated later.
> > >  
> > > This patch addresses this by having the thread_group_cputimer structure
> > > maintain a boolean to signify when a thread in the group is already
> > > checking for process wide timers, and adds extra logic in the fastpath
> > > to check the boolean.
> > > 
> > > Signed-off-by: Jason Low <jason.low2@hp.com>
> > > ---
> > >  include/linux/init_task.h      |    1 +
> > >  include/linux/sched.h          |    3 +++
> > >  kernel/time/posix-cpu-timers.c |   19 +++++++++++++++++--
> > >  3 files changed, 21 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> > > index d0b380e..3350c77 100644
> > > --- a/include/linux/init_task.h
> > > +++ b/include/linux/init_task.h
> > > @@ -53,6 +53,7 @@ extern struct fs_struct init_fs;
> > >  	.cputimer	= { 						\
> > >  		.cputime_atomic	= INIT_CPUTIME_ATOMIC,			\
> > >  		.running	= 0,					\
> > > +		.checking_timer	= 0,					\
> > >  	},								\
> > >  	INIT_PREV_CPUTIME(sig)						\
> > >  	.cred_guard_mutex =						\
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index 119823d..a6c8334 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -619,6 +619,8 @@ struct task_cputime_atomic {
> > >   * @cputime_atomic:	atomic thread group interval timers.
> > >   * @running:		non-zero when there are timers running and
> > >   * 			@cputime receives updates.
> > > + * @checking_timer:	non-zero when a thread is in the process of
> > > + *			checking for thread group timers.
> > >   *
> > >   * This structure contains the version of task_cputime, above, that is
> > >   * used for thread group CPU timer calculations.
> > > @@ -626,6 +628,7 @@ struct task_cputime_atomic {
> > >  struct thread_group_cputimer {
> > >  	struct task_cputime_atomic cputime_atomic;
> > >  	int running;
> > > +	int checking_timer;
> > 
> > How about a flag in the "running" field instead?
> > 
> > 1) Space in signal_struct is not as important as in task_struct but it
> >    still matters.
> 
> George Spelvin suggested that we convert them to booleans which would
> make them take up 2 bytes.
> 
> > 2) We already read the "running" field locklessly. Adding a new field like
> >    checking_timer gets even more complicated. Ideally there should be at
> >    least a paired memory barrier between both. Let's just simplify that
> >    with a single field.
> 
> hmmm, so having 1 "flag" where we access bits for the "running" and
> "checking_timer"?

Sure, like:

#define CPUTIMER_RUNNING 0x1
#define CPUTIMER_CHECKING 0x2

struct thread_group_cputimer {
    struct task_cputime_atomic cputime_atomic;
    int status;
};

So from cputimer_running() you just need to check:

     if (cputimer->status & CPUTIMER_RUNNING)

And from run_posix_cpu_timer() fast-path:

     if (cputimer->status == CPUTIMER_RUNNING)

so that ignores CPUTIMER_CHECKING case.

> 
> > Now concerning the solution for your problem, I'm a bit uncomfortable with
> > lockless magics like this. When the thread sets checking_timer to 1, there is
> > no guarantee that the other threads in the process will see it "fast" enough
> > to avoid the slow path checks. Then there is also the risk that the threads
> > don't see "fast" enough that checking_timers has toggled to 0 and as a result
> > a timer may expire late. Now the lockless access of "running" already induces
> > such race. So if it really solves issues in practice, why not.
> 
> Perhaps to be safer, we use something like load_acquire() and
> store_release() for accessing both the ->running and ->checking_timer
> fields?

Well it depends against what we want to order them. If it's a single field
we don't need to order them together at least.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-26 23:45             ` Hideaki Kimura
@ 2015-08-27 13:18               ` Frederic Weisbecker
  2015-08-27 14:47                 ` Steven Rostedt
  0 siblings, 1 reply; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-27 13:18 UTC (permalink / raw)
  To: Hideaki Kimura
  Cc: Jason Low, Oleg Nesterov, Andrew Morton, Peter Zijlstra,
	Ingo Molnar, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	Linus Torvalds, Steven Rostedt, Rik van Riel, Scott J Norton

On Wed, Aug 26, 2015 at 04:45:44PM -0700, Hideaki Kimura wrote:
> I totally agree that this is not a perfect solution. If there are 10x more
> cores and sockets, just the atomic fetch_add might be too expensive.
> 
> However, it's comparatively/realistically the best thing we can do without
> any drawbacks. We can't magically force all library developers to write the
> most scalable code always.
> 
> My point is: this is a safety net, and a very effective one.

I mean, the problem here is that a library uses an unscalable profiling
feature, unconditionally, as soon as you load it, without even initializing
anything. And this library is used in production.

At first sight, fixing that in the kernel is only a hack that just reduces
the symptoms a bit.

What is the technical issue that prevents fixing that in the library itself?
Posix timers can be attached at any time.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-27 13:18               ` Frederic Weisbecker
@ 2015-08-27 14:47                 ` Steven Rostedt
  2015-08-27 15:09                   ` Thomas Gleixner
  0 siblings, 1 reply; 34+ messages in thread
From: Steven Rostedt @ 2015-08-27 14:47 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Hideaki Kimura, Jason Low, Oleg Nesterov, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Paul E. McKenney,
	linux-kernel, Linus Torvalds, Rik van Riel, Scott J Norton

On Thu, 27 Aug 2015 15:18:49 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Wed, Aug 26, 2015 at 04:45:44PM -0700, Hideaki Kimura wrote:
> > I totally agree that this is not a perfect solution. If there are 10x more
> > cores and sockets, just the atomic fetch_add might be too expensive.
> > 
> > However, it's comparatively/realistically the best thing we can do without
> > any drawbacks. We can't magically force all library developers to write the
> > most scalable code always.
> > 
> > My point is: this is a safety net, and a very effective one.
> 
> I mean the problem here is that a library uses an unscalable profiling feature,
> unconditionally as soon as you load it without even initializing anything. And
> this library is used in production.
> 
> At first sight, fixing that in the kernel is only a hack that just reduces a bit
> the symptoms.
> 
> What is the technical issue that prevents fixing that in the library itself?
> Posix timers can be attached anytime.

I'm curious as to what the downside of this patch set is. If we can fix a
problem that should be fixed in userspace, but doing so does not harm the
kernel, is that bad? (an argument for kdbus? ;-)


As Hideaki noted, this could be a problem in other locations as well
that people have yet to find.

-- Steve

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-27 14:47                 ` Steven Rostedt
@ 2015-08-27 15:09                   ` Thomas Gleixner
  2015-08-27 15:17                     ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Thomas Gleixner @ 2015-08-27 15:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Hideaki Kimura, Jason Low, Oleg Nesterov,
	Andrew Morton, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	linux-kernel, Linus Torvalds, Rik van Riel, Scott J Norton

On Thu, 27 Aug 2015, Steven Rostedt wrote:
> On Thu, 27 Aug 2015 15:18:49 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Wed, Aug 26, 2015 at 04:45:44PM -0700, Hideaki Kimura wrote:
> > > I totally agree that this is not a perfect solution. If there are 10x more
> > > cores and sockets, just the atomic fetch_add might be too expensive.
> > > 
> > > However, it's comparatively/realistically the best thing we can do without
> > > any drawbacks. We can't magically force all library developers to write the
> > > most scalable code always.
> > > 
> > > My point is: this is a safety net, and a very effective one.
> > 
> > I mean the problem here is that a library uses an unscalable profiling feature,
> > unconditionally as soon as you load it without even initializing anything. And
> > this library is used in production.
> > 
> > At first sight, fixing that in the kernel is only a hack that just reduces a bit
> > the symptoms.
> > 
> > What is the technical issue that prevents fixing that in the library itself?
> > Posix timers can be attached anytime.
> 
> I'm curious as to what the downside of this patch set is. If we can fix a
> problem that should be fixed in userspace, but doing so does not harm the
> kernel, is that bad? (an argument for kdbus? ;-)

The patches are not fixing a problem which should be fixed in user
space. They merely avoid lock contention which happens to be prominent
with that particular library. But avoiding lock contention, even for 2
threads, is a worthwhile exercise if it does not hurt otherwise. And I
can't see anything that hurts with these patches.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 0/3] timer: Improve itimers scalability
  2015-08-27 15:09                   ` Thomas Gleixner
@ 2015-08-27 15:17                     ` Frederic Weisbecker
  0 siblings, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-27 15:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, Hideaki Kimura, Jason Low, Oleg Nesterov,
	Andrew Morton, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	linux-kernel, Linus Torvalds, Rik van Riel, Scott J Norton

On Thu, Aug 27, 2015 at 05:09:21PM +0200, Thomas Gleixner wrote:
> On Thu, 27 Aug 2015, Steven Rostedt wrote:
> > On Thu, 27 Aug 2015 15:18:49 +0200
> > Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > On Wed, Aug 26, 2015 at 04:45:44PM -0700, Hideaki Kimura wrote:
> > > > I totally agree that this is not a perfect solution. If there are 10x more
> > > > cores and sockets, just the atomic fetch_add might be too expensive.
> > > > 
> > > > However, it's comparatively/realistically the best thing we can do without
> > > > any drawbacks. We can't magically force all library developers to write the
> > > > most scalable code always.
> > > > 
> > > > My point is: this is a safety net, and a very effective one.
> > > 
> > > I mean the problem here is that a library uses an unscalable profiling feature,
> > > unconditionally as soon as you load it without even initializing anything. And
> > > this library is used in production.
> > > 
> > > At first sight, fixing that in the kernel is only a hack that just reduces a bit
> > > the symptoms.
> > > 
> > > What is the technical issue that prevents fixing that in the library itself?
> > > Posix timers can be attached anytime.
> > 
> > I'm curious as to what the downside of this patch set is. If we can fix a
> > problem that should be fixed in userspace, but doing so does not harm the
> > kernel, is that bad? (an argument for kdbus? ;-)
> 
> The patches are not fixing a problem which should be fixed in user
> space. They merely avoid lock contention which happens to be prominent
> with that particular library. But avoiding lock contention even for 2
> threads is a worthwhile exercise if it does not hurt otherwise. And I
> can't see anything that hurts with these patches.

Sure, it shouldn't really hurt anyway, since the presence of elapsing timers
is itself checked locklessly.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-27 12:53       ` Frederic Weisbecker
@ 2015-08-27 20:29         ` Jason Low
  2015-08-27 21:12           ` Frederic Weisbecker
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Low @ 2015-08-27 20:29 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton, jason.low2

On Thu, 2015-08-27 at 14:53 +0200, Frederic Weisbecker wrote:
> On Wed, Aug 26, 2015 at 04:32:34PM -0700, Jason Low wrote:
> > On Thu, 2015-08-27 at 00:56 +0200, Frederic Weisbecker wrote:
> > > On Tue, Aug 25, 2015 at 08:17:48PM -0700, Jason Low wrote:
> > > > It was found while running a database workload on large systems that
> > > > significant time was spent trying to acquire the sighand lock.
> > > > 
> > > > The issue was that whenever an itimer expired, many threads ended up
> > > > simultaneously trying to send the signal. Most of the time, nothing
> > > > happened after acquiring the sighand lock because another thread
> > > > had already sent the signal and updated the "next expire" time. The
> > > > fastpath_timer_check() didn't help much since the "next expire" time
> > > > was updated later.
> > > >  
> > > > This patch addresses this by having the thread_group_cputimer structure
> > > > maintain a boolean to signify when a thread in the group is already
> > > > checking for process wide timers, and adds extra logic in the fastpath
> > > > to check the boolean.
> > > > 
> > > > Signed-off-by: Jason Low <jason.low2@hp.com>
> > > > ---
> > > >  include/linux/init_task.h      |    1 +
> > > >  include/linux/sched.h          |    3 +++
> > > >  kernel/time/posix-cpu-timers.c |   19 +++++++++++++++++--
> > > >  3 files changed, 21 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> > > > index d0b380e..3350c77 100644
> > > > --- a/include/linux/init_task.h
> > > > +++ b/include/linux/init_task.h
> > > > @@ -53,6 +53,7 @@ extern struct fs_struct init_fs;
> > > >  	.cputimer	= { 						\
> > > >  		.cputime_atomic	= INIT_CPUTIME_ATOMIC,			\
> > > >  		.running	= 0,					\
> > > > +		.checking_timer	= 0,					\
> > > >  	},								\
> > > >  	INIT_PREV_CPUTIME(sig)						\
> > > >  	.cred_guard_mutex =						\
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index 119823d..a6c8334 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -619,6 +619,8 @@ struct task_cputime_atomic {
> > > >   * @cputime_atomic:	atomic thread group interval timers.
> > > >   * @running:		non-zero when there are timers running and
> > > >   * 			@cputime receives updates.
> > > > + * @checking_timer:	non-zero when a thread is in the process of
> > > > + *			checking for thread group timers.
> > > >   *
> > > >   * This structure contains the version of task_cputime, above, that is
> > > >   * used for thread group CPU timer calculations.
> > > > @@ -626,6 +628,7 @@ struct task_cputime_atomic {
> > > >  struct thread_group_cputimer {
> > > >  	struct task_cputime_atomic cputime_atomic;
> > > >  	int running;
> > > > +	int checking_timer;
> > > 
> > > How about a flag in the "running" field instead?
> > > 
> > > 1) Space in signal_struct is not as important as in task_struct but it
> > >    still matters.
> > 
> > George Spelvin suggested that we convert them to booleans which would
> > make them take up 2 bytes.
> > 
> > > 2) We already read the "running" field locklessly. Adding a new field like
> > >    checking_timer gets even more complicated. Ideally there should be at
> > >    least a paired memory barrier between both. Let's just simplify that
> > >    with a single field.
> > 
> > hmmm, so having 1 "flag" where we access bits for the "running" and
> > "checking_timer"?
> 
> Sure, like:
> 
> #define CPUTIMER_RUNNING 0x1
> #define CPUTIMER_CHECKING 0x2
> 
> struct thread_group_cputimer {
>     struct task_cputime_atomic cputime_atomic;
>     int status;
> };
> 
> So from cputimer_running() you just need to check:
> 
>      if (cputimer->status & CPUTIMER_RUNNING)
> 
> And from run_posix_cpu_timer() fast-path:
> 
>      if (cputimer->status == CPUTIMER_RUNNING)
> 
> so that ignores CPUTIMER_CHECKING case.

Right, having just 1 "status" field can simplify things a bit. The
(cputimer->status == CPUTIMER_RUNNING) check does appear misleading
though, since we're technically not only checking whether the "cputimer
is running".

Maybe something like:

int status = cputimer->status;
if ((status & CPUTIMER_RUNNING) && !(status & CPUTIMER_CHECKING))

makes it more obvious what's going on here.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-27 20:29         ` Jason Low
@ 2015-08-27 21:12           ` Frederic Weisbecker
  0 siblings, 0 replies; 34+ messages in thread
From: Frederic Weisbecker @ 2015-08-27 21:12 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Linus Torvalds, Davidlohr Bueso,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton

On Thu, Aug 27, 2015 at 01:29:50PM -0700, Jason Low wrote:
> On Thu, 2015-08-27 at 14:53 +0200, Frederic Weisbecker wrote:
> > Sure, like:
> > 
> > #define CPUTIMER_RUNNING 0x1
> > #define CPUTIMER_CHECKING 0x2
> > 
> > struct thread_group_cputimer {
> >     struct task_cputime_atomic cputime_atomic;
> >     int status;
> > }
> > 
> > So from cputimer_running() you just need to check:
> > 
> >      if (cputimer->status & CPUTIMER_RUNNING)
> > 
> > And from run_posix_cpu_timer() fast-path:
> > 
> >      if (cputimer->status == CPUTIMER_RUNNING)
> > 
> > so that ignores CPUTIMER_CHECKING case.
> 
> Right, having just 1 "status" field can simplify things a bit. The
> (cputimer->status == CPUTIMER_RUNNING) check does appear misleading
> though, since we're technically not only checking whether the "cputimer
> is running".
> 
> Maybe something like:
> 
> int status = cputimer->status;
> if ((status & CPUTIMER_RUNNING) && !(status & CPUTIMER_CHECKING))
> 
> makes it more obvious what's going on here.

The result is the same, but it may indeed clarify the code.

Thanks.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] timer: Optimize fastpath_timer_check()
  2015-08-26  3:17 ` [PATCH 1/3] timer: Optimize fastpath_timer_check() Jason Low
  2015-08-26 21:57   ` Frederic Weisbecker
@ 2015-08-31 15:15   ` Davidlohr Bueso
  2015-08-31 19:40     ` Jason Low
  1 sibling, 1 reply; 34+ messages in thread
From: Davidlohr Bueso @ 2015-08-31 15:15 UTC (permalink / raw)
  To: Jason Low
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Frederic Weisbecker,
	Linus Torvalds, Steven Rostedt, Andrew Morton, Terry Rudd,
	Rik van Riel, Scott J Norton

On Tue, 2015-08-25 at 20:17 -0700, Jason Low wrote:
> In fastpath_timer_check(), the task_cputime() function is always
> called to compute the utime and stime values. However, this is not
> necessary if there are no per-thread timers to check for. This patch
> modifies the code such that we compute the task_cputime values only
> when there are per-thread timers set.
> 
> Signed-off-by: Jason Low <jason.low2@hp.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/3] timer: Optimize fastpath_timer_check()
  2015-08-31 15:15   ` Davidlohr Bueso
@ 2015-08-31 19:40     ` Jason Low
  0 siblings, 0 replies; 34+ messages in thread
From: Jason Low @ 2015-08-31 19:40 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Oleg Nesterov,
	Paul E. McKenney, linux-kernel, Frederic Weisbecker,
	Steven Rostedt, Andrew Morton, Terry Rudd, Rik van Riel,
	Scott J Norton, jason.low2

On Mon, 2015-08-31 at 08:15 -0700, Davidlohr Bueso wrote:
> On Tue, 2015-08-25 at 20:17 -0700, Jason Low wrote:
> > In fastpath_timer_check(), the task_cputime() function is always
> > called to compute the utime and stime values. However, this is not
> > necessary if there are no per-thread timers to check for. This patch
> > modifies the code such that we compute the task_cputime values only
> > when there are per-thread timers set.
> > 
> > Signed-off-by: Jason Low <jason.low2@hp.com>
> 
> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

Thanks David and Frederic.

I want to mention that this patch has been slightly changed. As
suggested by George, those extra utime, stime local variables are not
needed anymore, and we should be able to directly call:

task_cputime(tsk, &task_sample.utime, &task_sample.stime);

---
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 892e3da..3a8c5b4 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1117,16 +1117,13 @@ static inline int task_cputime_expired(const struct task_cputime *sample,
 static inline int fastpath_timer_check(struct task_struct *tsk)
 {
        struct signal_struct *sig;
-       cputime_t utime, stime;
-
-       task_cputime(tsk, &utime, &stime);

        if (!task_cputime_zero(&tsk->cputime_expires)) {
-               struct task_cputime task_sample = {
-                       .utime = utime,
-                       .stime = stime,
-                       .sum_exec_runtime = tsk->se.sum_exec_runtime
-               };
+               struct task_cputime task_sample;
+
+               task_cputime(tsk, &task_sample.utime, &task_sample.stime);
+
+               task_sample.sum_exec_runtime = tsk->se.sum_exec_runtime;

                if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
                        return 1;


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-27 22:43       ` George Spelvin
@ 2015-08-28  4:32         ` Jason Low
  0 siblings, 0 replies; 34+ messages in thread
From: Jason Low @ 2015-08-28  4:32 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-kernel, jason.low2

On Thu, 2015-08-27 at 18:43 -0400, George Spelvin wrote:
> Jason Low wrote:
> > Frederic suggested that we just use a single "status" variable and
> > access the bits for the running and checking field. I am leaning towards
> > that method, so I might not include the rest of the boolean changes in
> > this patchset.
> 
> Don't worry, I'm not offended.  I just started editing and figured
> I might as well share it.
> 
> Whichever solution is easier.  My only complaint about bitmapped variables
> is that so many are "unsigned long" because the Linux atomic access
> primitives, originally designed for file system bitmaps, use that type.
> 
> But using that type for a non-array wastes 4 bytes on 64-bit platforms,
> bytes which can't be used if the code is to work on 32-bit ones.
> 
> >> E.g. suppose a process fails to notice that it blew past a CPU time
> >> timeout before blocking.  Does anything guarantee that it will get
> >> the timeout signal in finite real time?
> >
> > Yep, the check_process_timers will get called again during the next
> > scheduler interrupt (approximately after 1 jiffy) and send the signal if
> > it finds that the timer expired then.
> 
> Will it?  I thought it got called on the running process only.
> Which is not the (blocked by assumption) process of interest.
> 
> I don't suspect that this would be a problem in practice, as CPU-time
> timers are used on activities which use a *lot* of it.  But it
> seemed like a flaw worth either acknowledging or disproving.

You're right, this is only called on running processes. If the process
is blocked, sending the signal can get delayed. However, in practice,
this is not really a problem as you mentioned. Also, this "issue" is
not really avoidable even without this patch. For instance, a timer
may expire and the process can get blocked before the next scheduler
interrupt.

Thanks,
Jason


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-27 21:55     ` Jason Low
@ 2015-08-27 22:43       ` George Spelvin
  2015-08-28  4:32         ` Jason Low
  0 siblings, 1 reply; 34+ messages in thread
From: George Spelvin @ 2015-08-27 22:43 UTC (permalink / raw)
  To: jason.low2, linux; +Cc: linux-kernel

Jason Low wrote:
> Frederic suggested that we just use a single "status" variable and
> access the bits for the running and checking field. I am leaning towards
> that method, so I might not include the rest of the boolean changes in
> this patchset.

Don't worry, I'm not offended.  I just started editing and figured
I might as well share it.

Whichever solution is easier.  My only complaint about bitmapped variables
is that so many are "unsigned long" because the Linux atomic access
primitives, originally designed for file system bitmaps, use that type.

But using that type for a non-array wastes 4 bytes on 64-bit platforms,
bytes which can't be used if the code is to work on 32-bit ones.

>> E.g. suppose a process fails to notice that it blew past a CPU time
>> timeout before blocking.  Does anything guarantee that it will get
>> the timeout signal in finite real time?
>
> Yep, the check_process_timers will get called again during the next
> scheduler interrupt (approximately after 1 jiffy) and send the signal if
> it finds that the timer expired then.

Will it?  I thought it got called on the running process only.
Which is not the (blocked by assumption) process of interest.

I don't suspect that this would be a problem in practice, as CPU-time
timers are used on activities which use a *lot* of it.  But it
seemed like a flaw worth either acknowledging or disproving.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-27  1:28   ` George Spelvin
@ 2015-08-27 21:55     ` Jason Low
  2015-08-27 22:43       ` George Spelvin
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Low @ 2015-08-27 21:55 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-kernel, jason.low2

On Wed, 2015-08-26 at 21:28 -0400, George Spelvin wrote:
> > I can include your patch in the series and then use boolean for the new
> > checking_timer field. However, it looks like this applies on an old
> > kernel. For example, the spin_lock field has already been removed from
> > the structure.
> 
> Apologies; that was 4.1.6.  A 4.2-rc8 patch is appended (it's a pretty
> trivial merge once you look at it).

Frederic suggested that we just use a single "status" variable and
access the bits for the running and checking field. I am leaning towards
that method, so I might not include the rest of the boolean changes in
this patchset.

> > The spinlock call has already been removed from a previous patch. The
> > issue now is with contention with the sighand lock.
> 
> I'll look some more and try to wrap my head around it.
> 
> >> Or is it basically okay if this is massively racey, since process-wide
> >> CPU timers are inherently sloppy.  A race will just cause an expiration
> >> check to be missed, but it will be retried soon anyway.
> 
> > Yes, the worst case scenario is that we wait until the next thread to
> > come along and handle the next expired timer. However, this "delay"
> > already occurs now (for example: a timer can expire right after a thread
> > exits check_process_timers()).
> 
> And is this polled, or is there some non-polled mechanism that will
> trigger another call to check_process_timers()?
> 
> E.g. suppose a process fails to notice that it blew past a CPU time
> timeout before blocking.  Does anything guarantee that it will get
> the timeout signal in finite real time?

Yep, the check_process_timers will get called again during the next
scheduler interrupt (approximately after 1 jiffy) and send the signal if
it finds that the timer expired then.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 23:44 ` Jason Low
@ 2015-08-27  1:28   ` George Spelvin
  2015-08-27 21:55     ` Jason Low
  0 siblings, 1 reply; 34+ messages in thread
From: George Spelvin @ 2015-08-27  1:28 UTC (permalink / raw)
  To: jason.low2; +Cc: linux-kernel, linux

> I can include your patch in the series and then use boolean for the new
> checking_timer field. However, it looks like this applies on an old
> kernel. For example, the spin_lock field has already been removed from
> the structure.

Apologies; that was 4.1.6.  A 4.2-rc8 patch is appended (it's a pretty
trivial merge once you look at it).

> The spinlock call has already been removed from a previous patch. The
> issue now is with contention with the sighand lock.

I'll look some more and try to wrap my head around it.

>> Or is it basically okay if this is massively racey, since process-wide
>> CPU timers are inherently sloppy.  A race will just cause an expiration
>> check to be missed, but it will be retried soon anyway.

> Yes, the worst case scenario is that we wait until the next thread to
> come along and handle the next expired timer. However, this "delay"
> already occurs now (for example: a timer can expire right after a thread
> exits check_process_timers()).

And is this polled, or is there some non-polled mechanism that will
trigger another call to check_process_timers()?

E.g. suppose a process fails to notice that it blew past a CPU time
timeout before blocking.  Does anything guarantee that it will get
the timeout signal in finite real time?



>From 95349f9b16c30aea518ce79d72e8e0f0c5d12069 Mon Sep 17 00:00:00 2001
From: George Spelvin <linux@horizon.com>
Date: Wed, 26 Aug 2015 19:15:54 +0000
Subject: [PATCH] timer: Use bool more in kernel/time/posix-cpu-timers.c

This is mostly function return values, for documentation.

One structure field is changed (from int), but alignment
padding precludes any actual space saving.

Signed-off-by: George Spelvin <linux@horizon.com>

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04b5ada4..0cd80f76 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -607,7 +607,7 @@ struct task_cputime_atomic {
 /**
  * struct thread_group_cputimer - thread group interval timer counts
  * @cputime_atomic:	atomic thread group interval timers.
- * @running:		non-zero when there are timers running and
+ * @running:		%true when there are timers running and
  * 			@cputime receives updates.
  *
  * This structure contains the version of task_cputime, above, that is
@@ -615,7 +615,7 @@ struct task_cputime_atomic {
  */
 struct thread_group_cputimer {
 	struct task_cputime_atomic cputime_atomic;
-	int running;
+	bool running;
 };
 
 #include <linux/rwsem.h>
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 892e3dae..106368c5 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -113,14 +113,12 @@ static void bump_cpu_timer(struct k_itimer *timer,
  *
  * @cputime:	The struct to compare.
  *
- * Checks @cputime to see if all fields are zero.  Returns true if all fields
- * are zero, false if any field is nonzero.
+ * Checks @cputime to see if all fields are zero.  Returns %true if all fields
+ * are zero, %false if any field is nonzero.
  */
-static inline int task_cputime_zero(const struct task_cputime *cputime)
+static inline bool task_cputime_zero(const struct task_cputime *cputime)
 {
-	if (!cputime->utime && !cputime->stime && !cputime->sum_exec_runtime)
-		return 1;
-	return 0;
+	return !cputime->utime && !cputime->stime && !cputime->sum_exec_runtime;
 }
 
 static inline unsigned long long prof_ticks(struct task_struct *p)
@@ -249,7 +247,7 @@ void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times)
 		 * but barriers are not required because update_gt_cputime()
 		 * can handle concurrent updates.
 		 */
-		WRITE_ONCE(cputimer->running, 1);
+		WRITE_ONCE(cputimer->running, true);
 	}
 	sample_cputime_atomic(times, &cputimer->cputime_atomic);
 }
@@ -458,8 +456,9 @@ void posix_cpu_timers_exit_group(struct task_struct *tsk)
 	cleanup_timers(tsk->signal->cpu_timers);
 }
 
-static inline int expires_gt(cputime_t expires, cputime_t new_exp)
+static inline bool expires_gt(cputime_t expires, cputime_t new_exp)
 {
+	/* Could also be written "expires - 1 >= new_exp" */
 	return expires == 0 || expires > new_exp;
 }
 
@@ -911,7 +910,7 @@ static inline void stop_process_timers(struct signal_struct *sig)
 	struct thread_group_cputimer *cputimer = &sig->cputimer;
 
 	/* Turn off cputimer->running. This is done without locking. */
-	WRITE_ONCE(cputimer->running, 0);
+	WRITE_ONCE(cputimer->running, false);
 }
 
 static u32 onecputick;
@@ -1088,20 +1087,20 @@ out:
  * @expires:	Expiration times, against which @sample will be checked.
  *
  * Checks @sample against @expires to see if any field of @sample has expired.
- * Returns true if any field of the former is greater than the corresponding
- * field of the latter if the latter field is set.  Otherwise returns false.
+ * Returns %true if any field of the former is greater than the corresponding
+ * field of the latter if the latter field is set.  Otherwise returns %false.
  */
-static inline int task_cputime_expired(const struct task_cputime *sample,
+static inline bool task_cputime_expired(const struct task_cputime *sample,
 					const struct task_cputime *expires)
 {
 	if (expires->utime && sample->utime >= expires->utime)
-		return 1;
+		return true;
 	if (expires->stime && sample->utime + sample->stime >= expires->stime)
-		return 1;
+		return true;
 	if (expires->sum_exec_runtime != 0 &&
 	    sample->sum_exec_runtime >= expires->sum_exec_runtime)
-		return 1;
-	return 0;
+		return true;
+	return false;
 }
 
 /**
@@ -1110,11 +1109,11 @@ static inline int task_cputime_expired(const struct task_cputime *sample,
  * @tsk:	The task (thread) being checked.
  *
  * Check the task and thread group timers.  If both are zero (there are no
- * timers set) return false.  Otherwise snapshot the task and thread group
+ * timers set) return %false.  Otherwise snapshot the task and thread group
  * timers and compare them with the corresponding expiration times.  Return
- * true if a timer has expired, else return false.
+ * %true if a timer has expired, else return %false.
  */
-static inline int fastpath_timer_check(struct task_struct *tsk)
+static inline bool fastpath_timer_check(struct task_struct *tsk)
 {
 	struct signal_struct *sig;
 	cputime_t utime, stime;
@@ -1129,7 +1128,7 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 		};
 
 		if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
-			return 1;
+			return true;
 	}
 
 	sig = tsk->signal;
@@ -1140,10 +1139,10 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 		sample_cputime_atomic(&group_sample, &sig->cputimer.cputime_atomic);
 
 		if (task_cputime_expired(&group_sample, &sig->cputime_expires))
-			return 1;
+			return true;
 	}
 
-	return 0;
+	return false;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
  2015-08-26 19:33 [PATCH 3/3] timer: Reduce unnecessary sighand lock contention George Spelvin
@ 2015-08-26 23:44 ` Jason Low
  2015-08-27  1:28   ` George Spelvin
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Low @ 2015-08-26 23:44 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-kernel, jason.low2

On Wed, 2015-08-26 at 15:33 -0400, George Spelvin wrote:
> And some more comments on the series...
> 
> > @@ -626,6 +628,7 @@ struct task_cputime_atomic {
> >  struct thread_group_cputimer {
> > 	struct task_cputime_atomic cputime_atomic;
> > 	int running;
> >+	int checking_timer;
> > };
> 
> Why not make both running and checking_timer actual booleans, so the
> pair is only 16 bits rather than 64?
> 
> I suppose since sum_exec_runtime in struct task_cputime_atomic is 64 bits,
> alignment means there's no actual improvement.  It's just my personal
> preference (Linus disagrees with me!) for the bool type for documentation.
> 
> (Compile-tested patch appended.)

I can include your patch in the series and then use boolean for the new
checking_timer field. However, it looks like this applies on an old
kernel. For example, the spin_lock field has already been removed from
the structure.

> But when it comes to really understanding this code, I'm struggling.
> It seems that this changes the return value of fastpath_timer_check().
> I'm trying to wrap my head around exactly why that is safe.
> 
> Any time I see things like READ_ONCE and WRITE_ONCE, I wonder if there
> need to be memory barriers as well.  (Because x86 has strong default
> memory ordering, testing there doesn't prove the code right on other
> platforms.)
> 
> For the writes, the surrounding spinlock does the job.
> 
> But on the read side (fastpath_timer_check), I'm not sure
> what the rules should be.
> 
> Or is it basically okay if this is massively racey, since process-wide
> CPU timers are inherently sloppy.  A race will just cause an expiration
> check to be missed, but it will be retried soon anyway.

Yes, the worst case scenario is that we wait until the next thread to
come along and handle the next expired timer. However, this "delay"
already occurs now (for example: a timer can expire right after a thread
exits check_process_timers()).

> Other half-baked thoughts that went through my head:
> 
> If the problem is that you have contention for read access to
> sig->cputimer.cputime, can that be changed to a seqlock?
> 
> Another possibility is to use a trylock before testing
> checking_timer:
> 
> 	if (sig->cputimer.running) {
> 		struct task_cputime group_sample;
> 
> -		raw_spin_lock(&sig->cputimer.lock);
> +		if (!raw_spin_trylock(&sig->cputimer.lock)) {
> +			if (READ_ONCE(sig->cputimer.checking_timer))
> +				return false;	/* Lock holder's doing it for us */
> +			raw_spin_lock(&sig->cputimer.lock);
> +		}

The spinlock call has already been removed from a previous patch. The
issue now is with contention with the sighand lock.

Thanks,
Jason


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
@ 2015-08-26 21:05 George Spelvin
  0 siblings, 0 replies; 34+ messages in thread
From: George Spelvin @ 2015-08-26 21:05 UTC (permalink / raw)
  To: jason.low2; +Cc: akpm, linux, linux-kernel



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 3/3] timer: Reduce unnecessary sighand lock contention
@ 2015-08-26 19:33 George Spelvin
  2015-08-26 23:44 ` Jason Low
  0 siblings, 1 reply; 34+ messages in thread
From: George Spelvin @ 2015-08-26 19:33 UTC (permalink / raw)
  To: jason.low2; +Cc: akpm, linux-kernel, linux

And some more comments on the series...

> @@ -626,6 +628,7 @@ struct task_cputime_atomic {
>  struct thread_group_cputimer {
> 	struct task_cputime_atomic cputime_atomic;
> 	int running;
>+	int checking_timer;
> };

Why not make both running and checking_timer actual booleans, so the
pair is only 16 bits rather than 64?

I suppose since sum_exec_runtime in struct task_cputime_atomic is 64 bits,
alignment means there's no actual improvement.  It's just my personal
preference (Linus disagrees with me!) for the bool type for documentation.

(Compile-tested patch appended.)


But when it comes to really understanding this code, I'm struggling.
It seems that this changes the return value of fastpath_timer_check().
I'm trying to wrap my head around exactly why that is safe.

Any time I see things like READ_ONCE and WRITE_ONCE, I wonder if there
need to be memory barriers as well.  (Because x86 has strong default
memory ordering, testing there doesn't prove the code right on other
platforms.)

For the writes, the surrounding spinlock does the job.

But on the read side (fastpath_timer_check), I'm not sure
what the rules should be.

Or is it basically okay if this is massively racey, since process-wide
CPU timers are inherently sloppy.  A race will just cause an expiration
check to be missed, but it will be retried soon anyway.


Other half-baked thoughts that went through my head:

If the problem is that you have contention for read access to
sig->cputimer.cputime, can that be changed to a seqlock?

Another possibility is to use a trylock before testing
checking_timer:

	if (sig->cputimer.running) {
		struct task_cputime group_sample;

-		raw_spin_lock(&sig->cputimer.lock);
+		if (!raw_spin_trylock(&sig->cputimer.lock)) {
+			if (READ_ONCE(sig->cputimer.checking_timer))
+				return false;	/* Lock holder's doing it for us */
+			raw_spin_lock(&sig->cputimer.lock);
+		}
		group_sample = sig->cputimer.cputime;
		raw_spin_unlock(&sig->cputimer.lock);

		if (task_cputime_expired(&group_sample, &sig->cputime_expires))
			return true;
	}
	return false;
}




----8<---- This is the patch mentioned above ----8<----

>From 0058bf5ed6a9e483ae33746685332e3ef9562ed5 Mon Sep 17 00:00:00 2001
From: George Spelvin <linux@horizon.com>
Date: Wed, 26 Aug 2015 19:15:54 +0000
Subject: [PATCH] timer: Use bool more in kernel/time/posix-cpu-timers.c

This is mostly function return values, for documentation.

One structure field is changed (from int), but alignment
padding precludes any actual space saving.

Signed-off-by: George Spelvin <linux@horizon.com>

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 26a2e612..fac3a7e2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -585,7 +585,7 @@ struct task_cputime {
 /**
  * struct thread_group_cputimer - thread group interval timer counts
  * @cputime:		thread group interval timers.
- * @running:		non-zero when there are timers running and
+ * @running:		%true when there are timers running and
  * 			@cputime receives updates.
  * @lock:		lock for fields in this struct.
  *
@@ -594,7 +594,7 @@ struct task_cputime {
  */
 struct thread_group_cputimer {
 	struct task_cputime cputime;
-	int running;
+	bool running;
 	raw_spinlock_t lock;
 };
 
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 0075da74..d8483ec5 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -113,14 +113,12 @@ static void bump_cpu_timer(struct k_itimer *timer,
  *
  * @cputime:	The struct to compare.
  *
- * Checks @cputime to see if all fields are zero.  Returns true if all fields
- * are zero, false if any field is nonzero.
+ * Checks @cputime to see if all fields are zero.  Returns %true if all fields
+ * are zero, %false if any field is nonzero.
  */
-static inline int task_cputime_zero(const struct task_cputime *cputime)
+static inline bool task_cputime_zero(const struct task_cputime *cputime)
 {
-	if (!cputime->utime && !cputime->stime && !cputime->sum_exec_runtime)
-		return 1;
-	return 0;
+	return !cputime->utime && !cputime->stime && !cputime->sum_exec_runtime;
 }
 
 static inline unsigned long long prof_ticks(struct task_struct *p)
@@ -223,7 +221,7 @@ void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times)
 		 */
 		thread_group_cputime(tsk, &sum);
 		raw_spin_lock_irqsave(&cputimer->lock, flags);
-		cputimer->running = 1;
+		cputimer->running = true;
 		update_gt_cputime(&cputimer->cputime, &sum);
 	} else
 		raw_spin_lock_irqsave(&cputimer->lock, flags);
@@ -435,8 +433,9 @@ void posix_cpu_timers_exit_group(struct task_struct *tsk)
 	cleanup_timers(tsk->signal->cpu_timers);
 }
 
-static inline int expires_gt(cputime_t expires, cputime_t new_exp)
+static inline bool expires_gt(cputime_t expires, cputime_t new_exp)
 {
+	/* Could also be written "expires - 1 >= new_exp" */
 	return expires == 0 || expires > new_exp;
 }
 
@@ -888,7 +887,7 @@ static void stop_process_timers(struct signal_struct *sig)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&cputimer->lock, flags);
-	cputimer->running = 0;
+	cputimer->running = false;
 	raw_spin_unlock_irqrestore(&cputimer->lock, flags);
 }
 
@@ -1066,20 +1065,20 @@ out:
  * @expires:	Expiration times, against which @sample will be checked.
  *
  * Checks @sample against @expires to see if any field of @sample has expired.
- * Returns true if any field of the former is greater than the corresponding
- * field of the latter if the latter field is set.  Otherwise returns false.
+ * Returns %true if any field of the former is greater than the corresponding
+ * field of the latter if the latter field is set.  Otherwise returns %false.
  */
-static inline int task_cputime_expired(const struct task_cputime *sample,
+static inline bool task_cputime_expired(const struct task_cputime *sample,
 					const struct task_cputime *expires)
 {
 	if (expires->utime && sample->utime >= expires->utime)
-		return 1;
+		return true;
 	if (expires->stime && sample->utime + sample->stime >= expires->stime)
-		return 1;
+		return true;
 	if (expires->sum_exec_runtime != 0 &&
 	    sample->sum_exec_runtime >= expires->sum_exec_runtime)
-		return 1;
-	return 0;
+		return true;
+	return false;
 }
 
 /**
@@ -1088,11 +1087,11 @@ static inline int task_cputime_expired(const struct task_cputime *sample,
  * @tsk:	The task (thread) being checked.
  *
  * Check the task and thread group timers.  If both are zero (there are no
- * timers set) return false.  Otherwise snapshot the task and thread group
+ * timers set) return %false.  Otherwise snapshot the task and thread group
  * timers and compare them with the corresponding expiration times.  Return
- * true if a timer has expired, else return false.
+ * %true if a timer has expired, else return %false.
  */
-static inline int fastpath_timer_check(struct task_struct *tsk)
+static inline bool fastpath_timer_check(struct task_struct *tsk)
 {
 	struct signal_struct *sig;
 	cputime_t utime, stime;
@@ -1107,7 +1106,7 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 		};
 
 		if (task_cputime_expired(&task_sample, &tsk->cputime_expires))
-			return 1;
+			return true;
 	}
 
 	sig = tsk->signal;
@@ -1119,10 +1118,10 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 		raw_spin_unlock(&sig->cputimer.lock);
 
 		if (task_cputime_expired(&group_sample, &sig->cputime_expires))
-			return 1;
+			return true;
 	}
 
-	return 0;
+	return false;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-08-31 19:44 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-26  3:17 [PATCH 0/3] timer: Improve itimers scalability Jason Low
2015-08-26  3:17 ` [PATCH 1/3] timer: Optimize fastpath_timer_check() Jason Low
2015-08-26 21:57   ` Frederic Weisbecker
2015-08-31 15:15   ` Davidlohr Bueso
2015-08-31 19:40     ` Jason Low
2015-08-26  3:17 ` [PATCH 2/3] timer: Check thread timers only when there are active thread timers Jason Low
2015-08-26  3:17 ` [PATCH 3/3] timer: Reduce unnecessary sighand lock contention Jason Low
2015-08-26 17:53   ` Linus Torvalds
2015-08-26 22:31     ` Frederic Weisbecker
2015-08-26 22:57       ` Jason Low
2015-08-26 22:56   ` Frederic Weisbecker
2015-08-26 23:32     ` Jason Low
2015-08-27  4:52       ` Jason Low
2015-08-27 12:53       ` Frederic Weisbecker
2015-08-27 20:29         ` Jason Low
2015-08-27 21:12           ` Frederic Weisbecker
2015-08-26  3:27 ` [PATCH 0/3] timer: Improve itimers scalability Andrew Morton
2015-08-26 16:33   ` Jason Low
2015-08-26 17:08     ` Oleg Nesterov
2015-08-26 22:07       ` Jason Low
2015-08-26 22:53         ` Hideaki Kimura
2015-08-26 23:13           ` Frederic Weisbecker
2015-08-26 23:45             ` Hideaki Kimura
2015-08-27 13:18               ` Frederic Weisbecker
2015-08-27 14:47                 ` Steven Rostedt
2015-08-27 15:09                   ` Thomas Gleixner
2015-08-27 15:17                     ` Frederic Weisbecker
2015-08-26 19:33 [PATCH 3/3] timer: Reduce unnecessary sighand lock contention George Spelvin
2015-08-26 23:44 ` Jason Low
2015-08-27  1:28   ` George Spelvin
2015-08-27 21:55     ` Jason Low
2015-08-27 22:43       ` George Spelvin
2015-08-28  4:32         ` Jason Low
2015-08-26 21:05 George Spelvin
