* [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations
@ 2016-03-10 1:55 Davidlohr Bueso
2016-03-10 1:55 ` [PATCH 1/2] kernel/smp: Explicitly inline csd_lock helpers Davidlohr Bueso
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Davidlohr Bueso @ 2016-03-10 1:55 UTC (permalink / raw)
To: mingo; +Cc: peterz, dave, linux-kernel
From: Davidlohr Bueso <dave@stgolabs.net>
Hi,
Justifications are in each patch. There is a slight impact (patch 2)
on some TLB-flush-intensive benchmarks (albeit these use IPI batching
nowadays). Specifically, for the pft benchmark on a 12-core box:
pft faults
                                       4.4 vanilla              4.4 smp
Hmean faults/cpu-1 801432.1608 ( 0.00%) 795719.8859 ( -0.71%)
Hmean faults/cpu-3 702578.6659 ( 0.00%) 752796.6960 ( 7.15%)
Hmean faults/cpu-5 606080.3473 ( 0.00%) 595890.0451 ( -1.68%)
Hmean faults/cpu-7 460369.0724 ( 0.00%) 485283.6343 ( 5.41%)
Hmean faults/cpu-12 294445.4701 ( 0.00%) 298300.6011 ( 1.31%)
Hmean faults/cpu-18 213156.0860 ( 0.00%) 213584.2741 ( 0.20%)
Hmean faults/cpu-24 153104.2995 ( 0.00%) 153198.8473 ( 0.06%)
Hmean faults/sec-1 796329.3184 ( 0.00%) 614222.4594 (-22.87%)
Hmean faults/sec-3 1947806.7372 ( 0.00%) 2169267.1582 ( 11.37%)
Hmean faults/sec-5 2611152.0422 ( 0.00%) 2544652.6871 ( -2.55%)
Hmean faults/sec-7 2493705.4668 ( 0.00%) 2674847.5270 ( 7.26%)
Hmean faults/sec-12 2583139.7724 ( 0.00%) 2614404.6002 ( 1.21%)
Hmean faults/sec-18 2661410.8170 ( 0.00%) 2683427.0703 ( 0.83%)
Hmean faults/sec-24 2670463.4814 ( 0.00%) 2666221.6332 ( -0.16%)
Stddev faults/cpu-1 27537.6676 ( 0.00%) 25753.4945 ( 6.48%)
Stddev faults/cpu-3 62616.8041 ( 0.00%) 44728.0990 ( 28.57%)
Stddev faults/cpu-5 70976.9184 ( 0.00%) 74720.5716 ( -5.27%)
Stddev faults/cpu-7 47426.5952 ( 0.00%) 32758.2705 ( 30.93%)
Stddev faults/cpu-12 6951.8792 ( 0.00%) 9097.0782 (-30.86%)
Stddev faults/cpu-18 4293.1696 ( 0.00%) 5826.9446 (-35.73%)
Stddev faults/cpu-24 3195.0939 ( 0.00%) 3373.7230 ( -5.59%)
Stddev faults/sec-1 27315.3093 ( 0.00%) 148601.7795 (-444.02%)
Stddev faults/sec-3 271560.5941 ( 0.00%) 193681.0177 ( 28.68%)
Stddev faults/sec-5 429633.7378 ( 0.00%) 458426.3306 ( -6.70%)
Stddev faults/sec-7 338229.0746 ( 0.00%) 226146.3450 ( 33.14%)
Stddev faults/sec-12 57766.4604 ( 0.00%) 82734.3638 (-43.22%)
Stddev faults/sec-18 118572.1909 ( 0.00%) 134966.7210 (-13.83%)
Stddev faults/sec-24 57452.7350 ( 0.00%) 57542.7755 ( -0.16%)
              4.4 vanilla    4.4 smp
User                11.91      11.85
System             197.11     194.69
Elapsed             44.24      40.26
While the single-thread case is an abnormality, overall we don't seem
to do any harm (within the noise range). The numbers could go either way,
but overall the patches at least make some sense afaict.
Thanks!
Davidlohr Bueso (2):
kernel/smp: Explicitly inline csd_lock helpers
kernel/smp: Make csd_lock_wait be smp_cond_acquire
kernel/smp.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
--
2.1.4
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] kernel/smp: Explicitly inline csd_lock helpers
2016-03-10 1:55 [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations Davidlohr Bueso
@ 2016-03-10 1:55 ` Davidlohr Bueso
2016-03-10 11:05 ` [tip:locking/core] locking/csd_lock: Explicitly inline csd_lock*() helpers tip-bot for Davidlohr Bueso
2016-03-10 1:55 ` [PATCH 2/2] kernel/smp: Make csd_lock_wait be smp_cond_acquire Davidlohr Bueso
2016-03-10 9:17 ` [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations Peter Zijlstra
2 siblings, 1 reply; 6+ messages in thread
From: Davidlohr Bueso @ 2016-03-10 1:55 UTC (permalink / raw)
To: mingo; +Cc: peterz, dave, linux-kernel, Davidlohr Bueso
From: Davidlohr Bueso <dave@stgolabs.net>
While the compiler already tends to do it for us (except for
csd_unlock), make it explicit. These helpers mainly deal with
->flags, are short-lived, and can be called, for example,
from smp_call_function_many().
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
kernel/smp.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index 822ffb1ada3f..c91e00178f8f 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -105,13 +105,13 @@ void __init call_function_init(void)
* previous function call. For multi-cpu calls its even more interesting
* as we'll have to ensure no other cpu is observing our csd.
*/
-static void csd_lock_wait(struct call_single_data *csd)
+static __always_inline void csd_lock_wait(struct call_single_data *csd)
{
while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
cpu_relax();
}
-static void csd_lock(struct call_single_data *csd)
+static __always_inline void csd_lock(struct call_single_data *csd)
{
csd_lock_wait(csd);
csd->flags |= CSD_FLAG_LOCK;
@@ -124,7 +124,7 @@ static void csd_lock(struct call_single_data *csd)
smp_wmb();
}
-static void csd_unlock(struct call_single_data *csd)
+static __always_inline void csd_unlock(struct call_single_data *csd)
{
WARN_ON(!(csd->flags & CSD_FLAG_LOCK));
--
2.1.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] kernel/smp: Make csd_lock_wait be smp_cond_acquire
2016-03-10 1:55 [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations Davidlohr Bueso
2016-03-10 1:55 ` [PATCH 1/2] kernel/smp: Explicitly inline csd_lock helpers Davidlohr Bueso
@ 2016-03-10 1:55 ` Davidlohr Bueso
2016-03-10 11:06 ` [tip:locking/core] locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait() tip-bot for Davidlohr Bueso
2016-03-10 9:17 ` [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations Peter Zijlstra
2 siblings, 1 reply; 6+ messages in thread
From: Davidlohr Bueso @ 2016-03-10 1:55 UTC (permalink / raw)
To: mingo; +Cc: peterz, dave, linux-kernel, Davidlohr Bueso
From: Davidlohr Bueso <dave@stgolabs.net>
We can micro-optimize this call and mildly relax the
barrier requirements by relying on ctrl + rmb, keeping
the acquire semantics. In addition, this is pretty much
now the standard for busy-waiting under such constraints.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
kernel/smp.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index c91e00178f8f..74165443c240 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -107,8 +107,7 @@ void __init call_function_init(void)
*/
static __always_inline void csd_lock_wait(struct call_single_data *csd)
{
- while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
- cpu_relax();
+ smp_cond_acquire(!(csd->flags & CSD_FLAG_LOCK));
}
static __always_inline void csd_lock(struct call_single_data *csd)
--
2.1.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations
2016-03-10 1:55 [PATCH -tip 0/2] kernel/smp: Small csd_lock optimizations Davidlohr Bueso
2016-03-10 1:55 ` [PATCH 1/2] kernel/smp: Explicitly inline csd_lock helpers Davidlohr Bueso
2016-03-10 1:55 ` [PATCH 2/2] kernel/smp: Make csd_lock_wait be smp_cond_acquire Davidlohr Bueso
@ 2016-03-10 9:17 ` Peter Zijlstra
2 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2016-03-10 9:17 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: mingo, dave, linux-kernel
On Wed, Mar 09, 2016 at 05:55:34PM -0800, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
>
> Hi,
>
> Justifications are in each patch. There is a slight impact (patch 2)
> on some TLB-flush-intensive benchmarks (albeit these use IPI batching
> nowadays). Specifically, for the pft benchmark on a 12-core box:
>               4.4 vanilla    4.4 smp
> User                11.91      11.85
> System             197.11     194.69
> Elapsed             44.24      40.26
>
> While the single-thread case is an abnormality, overall we don't seem
> to do any harm (within the noise range). The numbers could go either way,
> but overall the patches at least make some sense afaict.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
^ permalink raw reply [flat|nested] 6+ messages in thread
* [tip:locking/core] locking/csd_lock: Explicitly inline csd_lock*() helpers
2016-03-10 1:55 ` [PATCH 1/2] kernel/smp: Explicitly inline csd_lock helpers Davidlohr Bueso
@ 2016-03-10 11:05 ` tip-bot for Davidlohr Bueso
0 siblings, 0 replies; 6+ messages in thread
From: tip-bot for Davidlohr Bueso @ 2016-03-10 11:05 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, tglx, peterz, linux-kernel, dbueso, hpa, torvalds, dave
Commit-ID: 90d1098478fb08a1ef166fe91622d8046869e17b
Gitweb: http://git.kernel.org/tip/90d1098478fb08a1ef166fe91622d8046869e17b
Author: Davidlohr Bueso <dave@stgolabs.net>
AuthorDate: Wed, 9 Mar 2016 17:55:35 -0800
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 10 Mar 2016 10:28:35 +0100
locking/csd_lock: Explicitly inline csd_lock*() helpers
While the compiler already tends to do it for us (except for
csd_unlock), make it explicit. These helpers mainly deal with
->flags, are short-lived, and can be called, for example,
from smp_call_function_many().
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1457574936-19065-2-git-send-email-dbueso@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/smp.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index d903c02..5099db1 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -105,13 +105,13 @@ void __init call_function_init(void)
* previous function call. For multi-cpu calls its even more interesting
* as we'll have to ensure no other cpu is observing our csd.
*/
-static void csd_lock_wait(struct call_single_data *csd)
+static __always_inline void csd_lock_wait(struct call_single_data *csd)
{
while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
cpu_relax();
}
-static void csd_lock(struct call_single_data *csd)
+static __always_inline void csd_lock(struct call_single_data *csd)
{
csd_lock_wait(csd);
csd->flags |= CSD_FLAG_LOCK;
@@ -124,7 +124,7 @@ static void csd_lock(struct call_single_data *csd)
smp_wmb();
}
-static void csd_unlock(struct call_single_data *csd)
+static __always_inline void csd_unlock(struct call_single_data *csd)
{
WARN_ON(!(csd->flags & CSD_FLAG_LOCK));
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [tip:locking/core] locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
2016-03-10 1:55 ` [PATCH 2/2] kernel/smp: Make csd_lock_wait be smp_cond_acquire Davidlohr Bueso
@ 2016-03-10 11:06 ` tip-bot for Davidlohr Bueso
0 siblings, 0 replies; 6+ messages in thread
From: tip-bot for Davidlohr Bueso @ 2016-03-10 11:06 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, tglx, dbueso, hpa, torvalds, mingo, peterz
Commit-ID: 38460a2178d225b39ade5ac66586c3733391cf86
Gitweb: http://git.kernel.org/tip/38460a2178d225b39ade5ac66586c3733391cf86
Author: Davidlohr Bueso <dave@stgolabs.net>
AuthorDate: Wed, 9 Mar 2016 17:55:36 -0800
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 10 Mar 2016 10:28:35 +0100
locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
We can micro-optimize this call and mildly relax the
barrier requirements by relying on ctrl + rmb, keeping
the acquire semantics. In addition, this is pretty much
now the standard for busy-waiting under such constraints.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1457574936-19065-3-git-send-email-dbueso@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/smp.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index 5099db1..300d293 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -107,8 +107,7 @@ void __init call_function_init(void)
*/
static __always_inline void csd_lock_wait(struct call_single_data *csd)
{
- while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
- cpu_relax();
+ smp_cond_acquire(!(csd->flags & CSD_FLAG_LOCK));
}
static __always_inline void csd_lock(struct call_single_data *csd)
^ permalink raw reply related [flat|nested] 6+ messages in thread