From: Alex Kogan <alex.kogan@oracle.com>
To: linux@armlinux.org.uk, peterz@infradead.org, mingo@redhat.com,
	will.deacon@arm.com, arnd@arndb.de, longman@redhat.com,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org
Cc: steven.sistare@oracle.com, daniel.m.jordan@oracle.com,
	alex.kogan@oracle.com, dave.dice@oracle.com,
	rahul.x.yadav@oracle.com
Subject: [PATCH v2 4/5] locking/qspinlock: Introduce starvation avoidance into CNA
Date: Fri, 29 Mar 2019 11:20:05 -0400
Message-ID: <20190329152006.110370-5-alex.kogan@oracle.com>
In-Reply-To: <20190329152006.110370-1-alex.kogan@oracle.com>

With high probability, rather than always, choose the next lock holder
from among the spinning threads running on the same node. With small
probability, hand the lock to the first thread in the secondary queue
or, if that queue is empty, to the immediate successor of the current
lock holder in the main queue. Thus, assuming no thread fails while
holding the lock, every waiting thread acquires the lock within a
bounded number of lock hand-offs, with high probability.

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
---
 kernel/locking/qspinlock_cna.h | 55 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 53 insertions(+), 2 deletions(-)
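
Note (not part of the commit message): a rough worked estimate of the
fairness bound described above. It assumes each call to probably() is an
independent, uniform draw, which the xorshift generator only
approximates. The symbols p, K and k are introduced here for
illustration: p is the per-hand-off chance of skipping the intra-node
search, and K is the number of hand-offs up to and including the first
one that skips it.

```latex
p = \frac{1}{\texttt{INTRA\_NODE\_HANDOFF\_PROB\_ARG}} = \frac{1}{\texttt{0x10000}} = 2^{-16},
\qquad
\mathbb{E}[K] = \frac{1}{p} = 65536,
\qquad
\Pr[K > k] = (1 - p)^{k} \approx e^{-kp}
```

Under these assumptions, for k = 4 * 65536 = 262144 hand-offs the chance
that the secondary queue has still not been served is about e^{-4}, or
roughly 1.8%.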

diff --git a/kernel/locking/qspinlock_cna.h b/kernel/locking/qspinlock_cna.h
index 5bc5fd9586ea..5addf6439326 100644
--- a/kernel/locking/qspinlock_cna.h
+++ b/kernel/locking/qspinlock_cna.h
@@ -3,6 +3,8 @@
 #error "do not include this file"
 #endif
 
+#include <linux/random.h>
+
 /*
  * Implement a NUMA-aware version of MCS (aka CNA, or compact NUMA-aware lock).
  *
@@ -15,7 +17,9 @@
  * secondary queue, and the lock is passed to T. If such T is not found, the
  * lock is passed to the first node in the secondary queue. Finally, if the
  * secondary queue is empty, the lock is passed to the next thread in the
- * main queue.
+ * main queue. To avoid starvation of threads in the secondary queue,
+ * those threads are moved back to the head of the main queue
+ * after a certain expected number of intra-node lock hand-offs.
  *
  * For details, see https://arxiv.org/abs/1810.05600.
  *
@@ -25,6 +29,18 @@
 
 #define MCS_NODE(ptr) ((struct mcs_spinlock *)(ptr))
 
+/* Per-CPU pseudo-random number seed */
+static DEFINE_PER_CPU(u32, seed);
+
+/*
+ * Controls the probability for intra-node lock hand-off. It can be
+ * tuned and may depend, e.g., on the number of CPUs per node. For now,
+ * choose a value that provides reasonable long-term fairness without
+ * sacrificing performance compared to a version that does not have any
+ * fairness guarantees.
+ */
+#define INTRA_NODE_HANDOFF_PROB_ARG 0x10000
+
 static inline __pure int decode_numa_node(u32 node_and_count)
 {
 	int node = (node_and_count >> _Q_NODE_OFFSET) - 1;
@@ -102,6 +118,35 @@ static struct mcs_spinlock *find_successor(struct mcs_spinlock *me)
 	return NULL;
 }
 
+/*
+ * xorshift function for generating pseudo-random numbers:
+ * https://en.wikipedia.org/wiki/Xorshift
+ */
+static inline u32 xor_random(void)
+{
+	u32 v;
+
+	v = this_cpu_read(seed);
+	if (v == 0)
+		get_random_bytes(&v, sizeof(u32));
+
+	v ^= v << 6;
+	v ^= v >> 21;
+	v ^= v << 7;
+	this_cpu_write(seed, v);
+
+	return v;
+}
+
+/*
+ * Return false with probability 1 / @range.
+ * @range must be a power of 2.
+ */
+static bool probably(unsigned int range)
+{
+	return xor_random() & (range - 1);
+}
+
 static __always_inline int get_node_index(struct mcs_spinlock *node)
 {
 	return decode_count(node->node_and_count++);
@@ -151,7 +196,13 @@ static inline void pass_mcs_lock(struct mcs_spinlock *node,
 {
 	struct mcs_spinlock *succ = NULL;
 
-	succ = find_successor(node);
+	/*
+	 * Try to pass the lock to a thread running on the same node.
+	 * For long-term fairness, search for such a thread with high
+	 * probability rather than always.
+	 */
+	if (probably(INTRA_NODE_HANDOFF_PROB_ARG))
+		succ = find_successor(node);
 
 	if (succ) {
 		arch_mcs_spin_unlock_contended(&succ->locked, node->locked);
-- 
2.11.0 (Apple Git-81)
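
For readers who want to experiment with the probabilistic hand-off
outside the kernel, here is a minimal userspace sketch of the
xor_random()/probably() pair from the patch. It is an illustration under
stated assumptions, not the patch's code: a single global replaces the
per-CPU seed, a fixed nonzero constant replaces get_random_bytes(), and
every name ending in _sketch (including the counting helper) is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a single global stands in for the patch's per-CPU seed. */
static uint32_t seed_sketch = 0x12345678u;

/* Same xorshift steps (6, 21, 7) as xor_random() in the patch. */
static uint32_t xor_random_sketch(void)
{
	uint32_t v = seed_sketch;

	/* Assumption: fixed nonzero constant instead of get_random_bytes(). */
	if (v == 0)
		v = 0x9e3779b9u;

	v ^= v << 6;
	v ^= v >> 21;
	v ^= v << 7;
	seed_sketch = v;

	return v;
}

/* Returns false when the low log2(range) bits of the draw are all zero. */
static bool probably_sketch(uint32_t range)
{
	/* range must be a nonzero power of 2 for the mask to be valid */
	assert(range != 0 && (range & (range - 1)) == 0);

	return xor_random_sketch() & (range - 1);
}

/* Hypothetical helper: count false results over a number of draws. */
static unsigned long count_false_sketch(uint32_t range, unsigned long draws)
{
	unsigned long i, falses = 0;

	for (i = 0; i < draws; i++)
		if (!probably_sketch(range))
			falses++;

	return falses;
}
```

With a range of 1 the mask is zero, so every draw returns false. With
the patch's value of 0x10000, the count over n draws should come out
near n / 65536, though the exact figure depends on the seed and on how
well the generator approximates a uniform source.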

