* [PATCH 0/2] ipc/sem.c: sem_lock fixes
@ 2016-07-13 5:06 Manfred Spraul
From: Manfred Spraul @ 2016-07-13 5:06 UTC
To: H. Peter Anvin, Peter Zijlstra, Andrew Morton, Davidlohr Bueso
Cc: LKML, Thomas Gleixner, Ingo Molnar, 1vier1, felixh, Manfred Spraul
Hi Andrew, Hi Peter,
next version of the sem_lock() fixes:
The patches are again vs. tip.
Patch 1 is ready for merging, Patch 2 is for review.
- Patch 1 is the patch as in -next since January
It fixes the race that was found by Felix.
- Patch 2 removes the memory barriers that are now duplicated by
the qspinlock code.
- (The hysteresis patch would be patch 3. The risk of regressions
can't be ruled out, thus it must wait for benchmarks from real
workload tests)
Patches 1 and 2 were originally one patch; I've split them so that
review and backporting are simpler.
--
Manfred
* [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race
@ 2016-07-13 5:06 Manfred Spraul
From: Manfred Spraul @ 2016-07-13 5:06 UTC
To: H. Peter Anvin, Peter Zijlstra, Andrew Morton, Davidlohr Bueso
Cc: LKML, Thomas Gleixner, Ingo Molnar, 1vier1, felixh,
Manfred Spraul, stable
Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a
race:
sem_lock has a fast path that allows parallel simple operations.
There are two reasons why a simple operation cannot run in parallel:
- a non-simple operation is ongoing (sma->sem_perm.lock held)
- a complex operation is sleeping (sma->complex_count != 0)
As both facts are stored independently, a thread can bypass the current
checks by sleeping in the right positions. See below for more details
(or kernel bugzilla 105651).
The patch fixes that by creating one variable (complex_mode)
that tracks both reasons why parallel operations are not possible.
The patch also updates stale documentation regarding the locking.
With regards to stable kernels:
The patch is required for all kernels that include the
commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") (3.10?)
The alternative is to revert the patch that introduced the race.
The patch is safe for backporting, i.e. it makes no assumptions
about memory barriers in spin_unlock_wait() or that the acquire
within spin_lock() is after setting the lock variable.
Background:
Here is the race in the current implementation:
Thread A: (simple op)
- does the first "sma->complex_count == 0" test
Thread B: (complex op)
- does sem_lock(): This includes an array scan. But the scan can't
find Thread A, because Thread A does not own sem->lock yet.
- the thread does the operation, increases complex_count,
drops sem_lock, sleeps
Thread A:
- spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
- sleeps before the complex_count test
Thread C: (complex op)
- does sem_lock (no array scan, complex_count==1)
- wakes up Thread B.
- decrements complex_count
Thread A:
- does the complex_count test
Bug:
Now both thread A and thread C operate on the same array, without
any synchronization.
Full memory barriers are required to synchronize changes of
complex_mode and the lock operations.
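To illustrate the pairing, here is a minimal userspace analogue of the
two paths, using C11 atomics in place of the kernel primitives (a
sketch only: seq_cst fences stand in for smp_mb(), and plain atomic
flags stand in for the spinlocks):

	#include <stdatomic.h>
	#include <stdbool.h>

	atomic_bool complex_mode;	/* sma->complex_mode */
	atomic_bool sem_lock_taken;	/* stands in for sem->lock */

	/* complex op side, as in complexmode_enter() */
	void complex_enter(void)
	{
		atomic_store_explicit(&complex_mode, true,
				      memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst); /* smp_mb() */
		/* spin_unlock_wait(): wait for in-flight simple ops */
		while (atomic_load_explicit(&sem_lock_taken,
					    memory_order_acquire))
			;
	}

	/* simple op side, as in the sem_lock() fast path */
	bool simple_fastpath(void)
	{
		atomic_store_explicit(&sem_lock_taken, true,
				      memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst); /* smp_mb() */
		if (!atomic_load_explicit(&complex_mode,
					  memory_order_acquire))
			return true;	/* fast path permitted */
		atomic_store_explicit(&sem_lock_taken, false,
				      memory_order_release);
		return false;		/* fall back to the full lock */
	}

This is the classic store-buffering pattern: with both full fences in
place, either complex_enter() observes sem_lock_taken and waits, or
simple_fastpath() observes complex_mode and falls back. Without them,
both sides could miss each other's store, which is the failure mode
the barriers prevent.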
Fixes: 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()")
Reported-by: felixh@informatik.uni-bremen.de
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
---
include/linux/sem.h | 1 +
ipc/sem.c | 130 +++++++++++++++++++++++++++++++---------------------
2 files changed, 79 insertions(+), 52 deletions(-)
diff --git a/include/linux/sem.h b/include/linux/sem.h
index 976ce3a..d0efd6e 100644
--- a/include/linux/sem.h
+++ b/include/linux/sem.h
@@ -21,6 +21,7 @@ struct sem_array {
struct list_head list_id; /* undo requests on this array */
int sem_nsems; /* no. of semaphores in array */
int complex_count; /* pending complex operations */
+ bool complex_mode; /* no parallel simple ops */
};
#ifdef CONFIG_SYSVIPC
diff --git a/ipc/sem.c b/ipc/sem.c
index ae72b3c..0da63c8 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -162,14 +162,21 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);
/*
* Locking:
+ * a) global sem_lock() for read/write
* sem_undo.id_next,
* sem_array.complex_count,
- * sem_array.pending{_alter,_cont},
- * sem_array.sem_undo: global sem_lock() for read/write
- * sem_undo.proc_next: only "current" is allowed to read/write that field.
+ * sem_array.complex_mode
+ * sem_array.pending{_alter,_const},
+ * sem_array.sem_undo
*
+ * b) global or semaphore sem_lock() for read/write:
* sem_array.sem_base[i].pending_{const,alter}:
- * global or semaphore sem_lock() for read/write
+ * sem_array.complex_mode (for read)
+ *
+ * c) special:
+ * sem_undo_list.list_proc:
+ * * undo_list->lock for write
+ * * rcu for read
*/
#define sc_semmsl sem_ctls[0]
@@ -260,28 +267,59 @@ static void sem_rcu_free(struct rcu_head *head)
}
/*
- * Wait until all currently ongoing simple ops have completed.
+ * Enter the mode suitable for non-simple operations:
* Caller must own sem_perm.lock.
- * New simple ops cannot start, because simple ops first check
- * that sem_perm.lock is free.
- * that a) sem_perm.lock is free and b) complex_count is 0.
*/
-static void sem_wait_array(struct sem_array *sma)
+static void complexmode_enter(struct sem_array *sma)
{
int i;
struct sem *sem;
- if (sma->complex_count) {
- /* The thread that increased sma->complex_count waited on
- * all sem->lock locks. Thus we don't need to wait again.
- */
+ if (sma->complex_mode) {
+ /* We are already in complex_mode. Nothing to do */
return;
}
+ WRITE_ONCE(sma->complex_mode, true);
+
+ /* We need a full barrier:
+ * The write to complex_mode must be visible
+ * before we read the first sem->lock spinlock state.
+ */
+ smp_mb();
for (i = 0; i < sma->sem_nsems; i++) {
sem = sma->sem_base + i;
spin_unlock_wait(&sem->lock);
}
+ /*
+ * spin_unlock_wait() is not a memory barrier, it is only a
+ * control barrier. The code must pair with spin_unlock(&sem->lock),
+ * thus just the control barrier is insufficient.
+ *
+ * smp_rmb() is sufficient, as writes cannot pass the control barrier.
+ */
+ smp_rmb();
+}
+
+/*
+ * Try to leave the mode that disallows simple operations:
+ * Caller must own sem_perm.lock.
+ */
+static void complexmode_tryleave(struct sem_array *sma)
+{
+ if (sma->complex_count) {
+ /* Complex ops are sleeping.
+ * We must stay in complex mode
+ */
+ return;
+ }
+ /*
+ * Immediately after setting complex_mode to false,
+ * a simple op can start. Thus: all memory writes
+ * performed by the current operation must be visible
+ * before we set complex_mode to false.
+ */
+ smp_store_release(&sma->complex_mode, false);
}
/*
@@ -300,56 +338,40 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
/* Complex operation - acquire a full lock */
ipc_lock_object(&sma->sem_perm);
- /* And wait until all simple ops that are processed
- * right now have dropped their locks.
- */
- sem_wait_array(sma);
+ /* Prevent parallel simple ops */
+ complexmode_enter(sma);
return -1;
}
/*
* Only one semaphore affected - try to optimize locking.
- * The rules are:
- * - optimized locking is possible if no complex operation
- * is either enqueued or processed right now.
- * - The test for enqueued complex ops is simple:
- * sma->complex_count != 0
- * - Testing for complex ops that are processed right now is
- * a bit more difficult. Complex ops acquire the full lock
- * and first wait that the running simple ops have completed.
- * (see above)
- * Thus: If we own a simple lock and the global lock is free
- * and complex_count is now 0, then it will stay 0 and
- * thus just locking sem->lock is sufficient.
+ * Optimized locking is possible if no complex operation
+ * is either enqueued or processed right now.
+ *
+ * Both facts are tracked by complex_mode.
*/
sem = sma->sem_base + sops->sem_num;
- if (sma->complex_count == 0) {
+ /*
+ * Initial check for complex_mode. Just an optimization,
+ * no locking, no memory barrier.
+ */
+ if (!READ_ONCE(sma->complex_mode)) {
/*
* It appears that no complex operation is around.
* Acquire the per-semaphore lock.
*/
spin_lock(&sem->lock);
- /* Then check that the global lock is free */
- if (!spin_is_locked(&sma->sem_perm.lock)) {
- /*
- * We need a memory barrier with acquire semantics,
- * otherwise we can race with another thread that does:
- * complex_count++;
- * spin_unlock(sem_perm.lock);
- */
- smp_acquire__after_ctrl_dep();
+ /*
+ * A full barrier is required: the write of sem->lock
+ * must be visible before the read is executed
+ */
+ smp_mb();
- /*
- * Now repeat the test of complex_count:
- * It can't change anymore until we drop sem->lock.
- * Thus: if is now 0, then it will stay 0.
- */
- if (sma->complex_count == 0) {
- /* fast path successful! */
- return sops->sem_num;
- }
+ if (!smp_load_acquire(&sma->complex_mode)) {
+ /* fast path successful! */
+ return sops->sem_num;
}
spin_unlock(&sem->lock);
}
@@ -369,7 +391,7 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
/* Not a false alarm, thus complete the sequence for a
* full lock.
*/
- sem_wait_array(sma);
+ complexmode_enter(sma);
return -1;
}
}
@@ -378,6 +400,7 @@ static inline void sem_unlock(struct sem_array *sma, int locknum)
{
if (locknum == -1) {
unmerge_queues(sma);
+ complexmode_tryleave(sma);
ipc_unlock_object(&sma->sem_perm);
} else {
struct sem *sem = sma->sem_base + locknum;
@@ -529,6 +552,7 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params)
}
sma->complex_count = 0;
+ sma->complex_mode = true; /* dropped by sem_unlock below */
INIT_LIST_HEAD(&sma->pending_alter);
INIT_LIST_HEAD(&sma->pending_const);
INIT_LIST_HEAD(&sma->list_id);
@@ -2184,10 +2208,10 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)
/*
* The proc interface isn't aware of sem_lock(), it calls
* ipc_lock_object() directly (in sysvipc_find_ipc).
- * In order to stay compatible with sem_lock(), we must wait until
- * all simple semop() calls have left their critical regions.
+ * In order to stay compatible with sem_lock(), we must
+ * enter / leave complex_mode.
*/
- sem_wait_array(sma);
+ complexmode_enter(sma);
sem_otime = get_semotime(sma);
@@ -2204,6 +2228,8 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it)
sem_otime,
sma->sem_ctime);
+ complexmode_tryleave(sma);
+
return 0;
}
#endif
--
2.5.5
* [PATCH 2/2] ipc/sem.c: Remove duplicated memory barriers.
@ 2016-07-13 5:06 Manfred Spraul
From: Manfred Spraul @ 2016-07-13 5:06 UTC
To: H. Peter Anvin, Peter Zijlstra, Andrew Morton, Davidlohr Bueso
Cc: LKML, Thomas Gleixner, Ingo Molnar, 1vier1, felixh, Manfred Spraul
With 2c610022711 (locking/qspinlock: Fix spin_unlock_wait() some more),
memory barriers were added into spin_unlock_wait().
Thus another barrier is not required.
And as explained in 055ce0fd1b8 (locking/qspinlock: Add comments),
spin_lock() provides a barrier so that reads within the critical
section cannot happen before the write for the lock is visible.
I.e. spin_lock() provides an acquire barrier after the write of the lock
variable; this barrier pairs with the smp_mb() in complexmode_enter().
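In other words, the ordering relied upon can be summarized as a comment
sketch (assuming the qspinlock guarantees referenced above):

	/*
	 * sem_lock() fast path, after this patch:
	 *
	 *	spin_lock(&sem->lock);			(A) store sem->lock
	 *	smp_load_acquire(&sma->complex_mode);	(B) read
	 *
	 * spin_lock() guarantees that (A) is visible before (B) is
	 * executed. This pairs with complexmode_enter():
	 *
	 *	WRITE_ONCE(sma->complex_mode, true);
	 *	smp_mb();
	 *	spin_unlock_wait(&sem->lock);	contains barriers
	 *					since 2c610022711
	 */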
Please review!
For x86, the patch is safe. But I don't know enough about all archs
that support SMP.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
ipc/sem.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/ipc/sem.c b/ipc/sem.c
index 0da63c8..d7b4212 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -291,14 +291,6 @@ static void complexmode_enter(struct sem_array *sma)
sem = sma->sem_base + i;
spin_unlock_wait(&sem->lock);
}
- /*
- * spin_unlock_wait() is not a memory barrier, it is only a
- * control barrier. The code must pair with spin_unlock(&sem->lock),
- * thus just the control barrier is insufficient.
- *
- * smp_rmb() is sufficient, as writes cannot pass the control barrier.
- */
- smp_rmb();
}
/*
@@ -363,12 +355,6 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
*/
spin_lock(&sem->lock);
- /*
- * A full barrier is required: the write of sem->lock
- * must be visible before the read is executed
- */
- smp_mb();
-
if (!smp_load_acquire(&sma->complex_mode)) {
/* fast path successful! */
return sops->sem_num;
--
2.5.5
* Re: [PATCH 2/2] ipc/sem.c: Remove duplicated memory barriers.
@ 2016-07-13 16:16 Davidlohr Bueso
From: Davidlohr Bueso @ 2016-07-13 16:16 UTC
To: Manfred Spraul
Cc: H. Peter Anvin, Peter Zijlstra, Andrew Morton, LKML,
Thomas Gleixner, Ingo Molnar, 1vier1, felixh
Manfred, shouldn't this patch be part of patch 1 (as you add the unnecessary barriers there)?
Iow, can we have a single patch for all this?
Thanks,
Davidlohr
* Re: [PATCH 2/2] ipc/sem.c: Remove duplicated memory barriers.
@ 2016-07-13 18:37 Manfred Spraul
From: Manfred Spraul @ 2016-07-13 18:37 UTC
To: Davidlohr Bueso, Michael Ellerman
Cc: H. Peter Anvin, Peter Zijlstra, Andrew Morton, LKML,
Thomas Gleixner, Ingo Molnar, 1vier1, felixh
Hi Davidlohr,
On 07/13/2016 06:16 PM, Davidlohr Bueso wrote:
> Manfred, shouldn't this patch be part of patch 1 (as you add the
> unnecessary barriers there)? Iow, can we have a single patch for all this?
>
Two reasons:
- patch 1 is safe for backporting, patch 2 is not.
- patch 1 is safe on all architectures; for patch 2 I would like to get
some review feedback.
e.g. I just found/read 51d7d5205d33 ("powerpc: Add smp_mb() to
arch_spin_is_locked()"):
For powerpc, a smp_mb() was added into spin_is_locked(), more or less
for ipc/sem.c.
Patch 1 replaces the spin_is_locked() with smp_load_acquire().
Isn't that the proof that smp_mb() is required?
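For reference, after that commit the powerpc code reads roughly as
follows (cited from memory; see 51d7d5205d33 for the exact change):

	static inline int arch_spin_is_locked(arch_spinlock_t *lock)
	{
		smp_mb();
		return !arch_spin_value_unlocked(*lock);
	}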
--
Manfred
* Re: [PATCH 0/2] ipc/sem.c: sem_lock fixes
@ 2016-07-13 22:05 Andrew Morton
From: Andrew Morton @ 2016-07-13 22:05 UTC
To: Manfred Spraul
Cc: H. Peter Anvin, Peter Zijlstra, Davidlohr Bueso, LKML,
Thomas Gleixner, Ingo Molnar, 1vier1, felixh
On Wed, 13 Jul 2016 07:06:50 +0200 Manfred Spraul <manfred@colorfullife.com> wrote:
> Hi Andrew, Hi Peter,
>
> next version of the sem_lock() fixes:
> The patches are again vs. tip.
>
> Patch 1 is ready for merging, Patch 2 is for review.
>
> - Patch 1 is the patch as in -next since January
> It fixes the race that was found by Felix.
> - Patch 2 removes the memory barriers that are now duplicated by
> the qspinlock code.
> - (The hysteresis patch would be patch 3. The risk of regressions
> can't be ruled out, thus it must wait for benchmarks from real
> workload tests)
I think you're saying that if these two patches cause performance
regressions, we will need ipc-sem-sem_lock-with-hysteresis.patch?
Is that even necessary? If your testing shows that
ipc-sem-sem_lock-with-hysteresis.patch makes things faster then in it
goes, surely?
* Re: [PATCH 0/2] ipc/sem.c: sem_lock fixes
@ 2016-07-14 16:40 Manfred Spraul
From: Manfred Spraul @ 2016-07-14 16:40 UTC
To: Andrew Morton
Cc: H. Peter Anvin, Peter Zijlstra, Davidlohr Bueso, LKML,
Thomas Gleixner, Ingo Molnar, 1vier1, felixh
Hi Andrew,
On 07/14/2016 12:05 AM, Andrew Morton wrote:
> On Wed, 13 Jul 2016 07:06:50 +0200 Manfred Spraul <manfred@colorfullife.com> wrote:
>
>> Hi Andrew, Hi Peter,
>>
>> next version of the sem_lock() fixes:
>> The patches are again vs. tip.
>>
>> Patch 1 is ready for merging, Patch 2 is for review.
>>
>> - Patch 1 is the patch as in -next since January
>> It fixes the race that was found by Felix.
>> - Patch 2 removes the memory barriers that are now duplicated by
>> the qspinlock code.
>> - (The hysteresis patch would be patch 3. The risk of regressions
>> can't be ruled out, thus it must wait for benchmarks from real
>> workload tests)
> I think you're saying that if these two patches cause performance
> regressions, we will need ipc-sem-sem_lock-with-hysteresis.patch?
No, the two patches will not cause any regressions.
Commit 6062a8dc051 ("ipc,sem: fine grained locking for semtimedop") was
a huge scalability improvement for 99% of the users, but introduced a
regression for one workload.
No one complained about it, so the workload must be rare.
Patch 3 now fixes the regression.
But we have lived with the introduced regression for 3 years, so give me
(and Davidlohr, and anyone else who wants to help) some more time.
Also on my TODO list: The queue merge/unmerge logic, i.e. commit
f269f40ad5ae ("ipc/sem.c: always use only one queue for alter
operations") might also win from some hysteresis.
My proposal:
- patch 1: merge towards main tree.
- patch 2: needs update. The barrier in sem_lock() is required. I'm not
yet sure about complexmode_enter(), perhaps with #ifndef
CONFIG_QUEUED_SPINLOCKS
- patch 3 should wait.
--
Manfred
* Re: [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race
@ 2016-07-16 1:27 Davidlohr Bueso
From: Davidlohr Bueso @ 2016-07-16 1:27 UTC
To: Manfred Spraul
Cc: H. Peter Anvin, Peter Zijlstra, Andrew Morton, LKML,
Thomas Gleixner, Ingo Molnar, 1vier1, felixh, stable
On Wed, 13 Jul 2016, Manfred Spraul wrote:
>-static void sem_wait_array(struct sem_array *sma)
>+static void complexmode_enter(struct sem_array *sma)
> {
> int i;
> struct sem *sem;
>
>- if (sma->complex_count) {
>- /* The thread that increased sma->complex_count waited on
>- * all sem->lock locks. Thus we don't need to wait again.
>- */
>+ if (sma->complex_mode) {
>+ /* We are already in complex_mode. Nothing to do */
> return;
> }
>+ WRITE_ONCE(sma->complex_mode, true);
So we can actually save those READ/WRITE_ONCE calls for complex_mode as it's
a bool and therefore tearing is not an issue.
>+
>+ /* We need a full barrier:
>+ * The write to complex_mode must be visible
>+ * before we read the first sem->lock spinlock state.
>+ */
>+ smp_mb();
smp_store_mb()?
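(For reference: smp_store_mb(var, value) is specified as a WRITE_ONCE()
of the value followed by a full memory barrier, so the two statements
above would collapse into one:

	smp_store_mb(sma->complex_mode, true);	/* write + smp_mb() */

which is both shorter and harder to get wrong.)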
> /*
>@@ -300,56 +338,40 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
> /* Complex operation - acquire a full lock */
> ipc_lock_object(&sma->sem_perm);
>
>- /* And wait until all simple ops that are processed
>- * right now have dropped their locks.
>- */
>- sem_wait_array(sma);
>+ /* Prevent parallel simple ops */
>+ complexmode_enter(sma);
> return -1;
nit and unrelated: we should probably use some better label here than a raw
-1 (although I don't see it changing, just for nicer reading), ie: SEM_OBJECT_LOCKED
Thanks,
Davidlohr
* [PATCH 0/2] ipc/sem.c: sem_lock fixes
@ 2016-10-01 18:40 Manfred Spraul
From: Manfred Spraul @ 2016-10-01 18:40 UTC
To: Peter Zijlstra, Andrew Morton, Davidlohr Bueso
Cc: LKML, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, 1vier1,
felixh, Manfred Spraul
Hi Andrew, Hi Peter, Hi Davidlohr,
New idea for ipc/sem:
The ACQUIRE from spin_lock() will continue to apply only for the load,
not for the store.
Thus: If we don't want to add arch dependencies into ipc/sem, the only
safe option is to use spin_lock()/spin_unlock() instead of spin_unlock_wait().
Or we must stay with the current code, which is a ~9% regression.
Thus:
- Patch 1 replaces spin_unlock_wait() with spin_lock()/spin_unlock() and
removes all memory barriers that are then unnecessary (see the sketch
below).
- Patch 2 adds the hysteresis code.
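To make that concrete, complexmode_enter() would then look roughly like
this (a sketch only; the actual patch may differ in details):

	static void complexmode_enter(struct sem_array *sma)
	{
		int i;
		struct sem *sem;

		if (sma->complex_mode)
			return;
		sma->complex_mode = true;

		for (i = 0; i < sma->sem_nsems; i++) {
			sem = sma->sem_base + i;
			/* Taking and dropping each per-semaphore lock
			 * waits for every in-flight simple op, and the
			 * ACQUIRE/RELEASE ordering of the lock makes
			 * complex_mode visible to every later simple op.
			 * No spin_unlock_wait(), no explicit barriers.
			 */
			spin_lock(&sem->lock);
			spin_unlock(&sem->lock);
		}
	}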
What do you think?
The patches passed stress-testing.
Andrew: Could you add it into mmots? Perhaps aiming for 4.10.
--
Manfred
* [PATCH 0/2] ipc/sem.c: sem_lock fixes
@ 2016-06-25 17:37 Manfred Spraul
From: Manfred Spraul @ 2016-06-25 17:37 UTC
To: H. Peter Anvin, Peter Zijlstra, Andrew Morton, Davidlohr Bueso
Cc: LKML, Thomas Gleixner, Ingo Molnar, 1vier1, felixh, Manfred Spraul
Hi Andrew, Hi Peter,
next version of the sem_lock() fixes / improvement:
The patches are now vs. tip.
Patch 1 is ready for merging, patch 2 is new and for discussion.
Patch 1 fixes the race that was found by Felix.
It also adds smp_mb() to fully synchronize

    WRITE_ONCE(status, 1);
<<< smp_mb();
    spin_unlock_wait();

vs.

    spin_lock();
<<< smp_mb();
    READ_ONCE(status);
Patch 2 tries to close a performance regression:
sem_wait_array() must perform a spin_unlock_wait() for every
semaphore in the array. If the array is large, then this is
slow.
Two open points:
- Do we need it? A factor 20 improvement is nice, but is the
test realistic?
- COMPLEX_MODE_ENTER=10 is a magic number, without any
rationale. (2+sem_nsems/128) would be another option,
but also arbitrary. (A sketch of the idea follows below.)
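To make the discussion concrete, here is a rough, hypothetical sketch of
the hysteresis idea with complex_mode turned into a countdown (all names
and details are illustrative, not the actual patch):

	#define COMPLEX_MODE_ENTER 10	/* the magic number in question */

	static void complexmode_enter(struct sem_array *sma)
	{
		int i;
		struct sem *sem;

		if (sma->complex_mode > 0) {
			/* Simple ops are already locked out:
			 * just re-arm the countdown, no array scan.
			 */
			sma->complex_mode = COMPLEX_MODE_ENTER;
			return;
		}
		/* write the countdown, then a full barrier, as in patch 1 */
		smp_store_mb(sma->complex_mode, COMPLEX_MODE_ENTER);

		for (i = 0; i < sma->sem_nsems; i++) {
			sem = sma->sem_base + i;
			spin_unlock_wait(&sem->lock);
		}
		smp_rmb();
	}

	static void complexmode_tryleave(struct sem_array *sma)
	{
		if (sma->complex_count)
			return;	/* complex ops are sleeping: stay */
		if (sma->complex_mode == 1) {
			/* Dropping to 0 re-enables simple ops: order
			 * our writes first, as in patch 1.
			 */
			smp_store_release(&sma->complex_mode, 0);
		} else {
			sma->complex_mode--;
		}
	}

With this, a burst of complex operations pays for the array scan only
once, while the countdown keeps simple ops on the slow path.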
Test:
https://github.com/manfred-colorfu/ipcscale
# ./sem-scalebench -o 4 -f -m 0 -h 4 -i 1 -p 2 -c 8 -t 5 -d 500
Before: 214732
After: 4008635
--
Manfred