From: Manfred Spraul <manfred@colorfullife.com>
To: Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Davidlohr Bueso <dave@stgolabs.net>
Cc: LKML <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
"H. Peter Anvin" <hpa@zytor.com>,
1vier1@web.de, felixh@informatik.uni-bremen.de,
Manfred Spraul <manfred@colorfullife.com>
Subject: [PATCH 2/2] ipc/sem: Add hysteresis.
Date: Sat, 1 Oct 2016 20:40:17 +0200 [thread overview]
Message-ID: <1475347217-2143-3-git-send-email-manfred@colorfullife.com> (raw)
In-Reply-To: <1475347217-2143-1-git-send-email-manfred@colorfullife.com>
sysv sem has two lock modes: One with per-semaphore locks, one lock mode
with a single global lock for the whole array.
When switching from the per-semaphore locks to the global lock, all
per-semaphore locks must be scanned for ongoing operations.
The patch adds a hysteresis for switching from the global lock to the
per semaphore locks. This reduces how often the per-semaphore locks
must be scanned.
Compared to the initial patch, this is a simplified solution:
Setting USE_GLOBAL_LOCK_HYSTERESIS to 1 restores the current behavior.
In theory, a workload with exactly 10 simple sops and then
one complex op now scales a bit worse, but this is pure theory:
If there is concurrency, the it won't be exactly 10:1:10:1:10:1:...
If there is no concurrency, then there is no need for scalability.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
include/linux/sem.h | 2 +-
ipc/sem.c | 87 ++++++++++++++++++++++++++++++++++++++---------------
2 files changed, 63 insertions(+), 26 deletions(-)
diff --git a/include/linux/sem.h b/include/linux/sem.h
index d0efd6e..4fc222f 100644
--- a/include/linux/sem.h
+++ b/include/linux/sem.h
@@ -21,7 +21,7 @@ struct sem_array {
struct list_head list_id; /* undo requests on this array */
int sem_nsems; /* no. of semaphores in array */
int complex_count; /* pending complex operations */
- bool complex_mode; /* no parallel simple ops */
+ unsigned int use_global_lock;/* >0: global lock required */
};
#ifdef CONFIG_SYSVIPC
diff --git a/ipc/sem.c b/ipc/sem.c
index d5f2710..0982c4d 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -161,22 +161,43 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);
#define SEMOPM_FAST 64 /* ~ 372 bytes on stack */
/*
+ * Switching from the mode suitable for simple ops
+ * to the mode for complex ops is costly. Therefore:
+ * use some hysteresis
+ */
+#define USE_GLOBAL_LOCK_HYSTERESIS 10
+
+/*
* Locking:
* a) global sem_lock() for read/write
* sem_undo.id_next,
* sem_array.complex_count,
- * sem_array.complex_mode
* sem_array.pending{_alter,_const},
* sem_array.sem_undo
*
* b) global or semaphore sem_lock() for read/write:
* sem_array.sem_base[i].pending_{const,alter}:
- * sem_array.complex_mode (for read)
+ * sem_array.use_global_lock(for read)
*
* c) special:
* sem_undo_list.list_proc:
* * undo_list->lock for write
* * rcu for read
+ * use_global_lock:
+ * * global sem_lock() for write
+ * * either local or global sem_lock() for read.
+ *
+ * Memory ordering:
+ * Most ordering is enforced by using spin_lock() and spin_unlock().
+ * The special case is use_global_lock:
+ * Setting it from non-zero to 0 is a RELEASE, this is ensured by
+ * using smp_store_release().
+ * Testing if it is non-zero is an ACQUIRE, this is ensured by using
+ * smp_load_acquire().
+ * Setting it from 0 to non-zero must be ordered with regards to
+ * this smp_load_acquire(), this is guaranteed because the smp_load_acquire()
+ * is inside a spin_lock() and after a write from 0 to non-zero a
+ * spin_lock()+spin_unlock() is done.
*/
#define sc_semmsl sem_ctls[0]
@@ -275,12 +296,16 @@ static void complexmode_enter(struct sem_array *sma)
int i;
struct sem *sem;
- if (sma->complex_mode) {
- /* We are already in complex_mode. Nothing to do */
+ if (sma->use_global_lock > 0) {
+ /*
+ * We are already in global lock mode.
+ * Nothing to do, just reset the
+ * counter until we return to simple mode.
+ */
+ sma->use_global_lock = USE_GLOBAL_LOCK_HYSTERESIS;
return;
}
-
- sma->complex_mode = true;
+ sma->use_global_lock = USE_GLOBAL_LOCK_HYSTERESIS;
for (i = 0; i < sma->sem_nsems; i++) {
sem = sma->sem_base + i;
@@ -301,13 +326,17 @@ static void complexmode_tryleave(struct sem_array *sma)
*/
return;
}
- /*
- * Immediately after setting complex_mode to false,
- * a simple op can start. Thus: all memory writes
- * performed by the current operation must be visible
- * before we set complex_mode to false.
- */
- smp_store_release(&sma->complex_mode, false);
+ if (sma->use_global_lock == 1) {
+ /*
+ * Immediately after setting use_global_lock to 0,
+ * a simple op can start. Thus: all memory writes
+ * performed by the current operation must be visible
+ * before we set use_global_lock to 0.
+ */
+ smp_store_release(&sma->use_global_lock, 0);
+ } else {
+ sma->use_global_lock--;
+ }
}
#define SEM_GLOBAL_LOCK (-1)
@@ -337,22 +366,23 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
* Optimized locking is possible if no complex operation
* is either enqueued or processed right now.
*
- * Both facts are tracked by complex_mode.
+ * Both facts are tracked by use_global_mode.
*/
sem = sma->sem_base + sops->sem_num;
/*
- * Initial check for complex_mode. Just an optimization,
+ * Initial check for use_global_lock. Just an optimization,
* no locking, no memory barrier.
*/
- if (!sma->complex_mode) {
+ if (!sma->use_global_lock) {
/*
* It appears that no complex operation is around.
* Acquire the per-semaphore lock.
*/
spin_lock(&sem->lock);
- if (!smp_load_acquire(&sma->complex_mode)) {
+ /* pairs with smp_store_release() */
+ if (!smp_load_acquire(&sma->use_global_lock)) {
/* fast path successful! */
return sops->sem_num;
}
@@ -362,19 +392,26 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
/* slow path: acquire the full lock */
ipc_lock_object(&sma->sem_perm);
- if (sma->complex_count == 0) {
- /* False alarm:
- * There is no complex operation, thus we can switch
- * back to the fast path.
+ if (sma->use_global_lock == 0) {
+ /*
+ * The use_global_lock mode ended while we waited for
+ * sma->sem_perm.lock. Thus we must switch to locking
+ * with sem->lock.
+ * Unlike in the fast path, there is no need to recheck
+ * sma->use_global_lock after we have acquired sem->lock:
+ * We own sma->sem_perm.lock, thus use_global_lock cannot
+ * change.
*/
spin_lock(&sem->lock);
+
ipc_unlock_object(&sma->sem_perm);
return sops->sem_num;
} else {
- /* Not a false alarm, thus complete the sequence for a
- * full lock.
+ /*
+ * Not a false alarm, thus continue to use the global lock
+ * mode. No need for complexmode_enter(), this was done by
+ * the caller that has set use_global_mode to non-zero.
*/
- complexmode_enter(sma);
return SEM_GLOBAL_LOCK;
}
}
@@ -535,7 +572,7 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params)
}
sma->complex_count = 0;
- sma->complex_mode = true; /* dropped by sem_unlock below */
+ WRITE_ONCE(sma->use_global_lock, USE_GLOBAL_LOCK_HYSTERESIS);
INIT_LIST_HEAD(&sma->pending_alter);
INIT_LIST_HEAD(&sma->pending_const);
INIT_LIST_HEAD(&sma->list_id);
--
2.7.4
next prev parent reply other threads:[~2016-10-01 18:58 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-01 18:40 [PATCH 0/2] ipc/sem.c: sem_lock fixes Manfred Spraul
2016-10-01 18:40 ` [PATCH 1/2] ipc/sem.c: Avoid using spin_unlock_wait() Manfred Spraul
2016-10-01 18:40 ` Manfred Spraul [this message]
2016-10-17 2:25 [lkp] [ipc/sem.c] 5864a2fd30: aim9.shared_memory.ops_per_sec -13.0% regression kernel test robot
2016-10-19 4:38 ` [lkp] [ipc/sem.c] 5864a2fd30: aim9.shared_memory.ops_per_sec -13.0% Manfred Spraul
2016-10-19 4:38 ` [PATCH 2/2] ipc/sem: Add hysteresis Manfred Spraul
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1475347217-2143-3-git-send-email-manfred@colorfullife.com \
--to=manfred@colorfullife.com \
--cc=1vier1@web.de \
--cc=akpm@linux-foundation.org \
--cc=dave@stgolabs.net \
--cc=felixh@informatik.uni-bremen.de \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).