From: Waiman Long <Waiman.Long@hp.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xenproject.org, kvm@vger.kernel.org,
	Paolo Bonzini <paolo.bonzini@gmail.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	David Vrabel <david.vrabel@citrix.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Daniel J Blueman <daniel@numascale.com>,
	Scott J Norton <scott.norton@hp.com>,
	Douglas Hatch <doug.hatch@hp.com>,
	Waiman Long <Waiman.Long@hp.com>
Subject: [PATCH v15 14/15] pvqspinlock: Improve slowpath performance by avoiding cmpxchg
Date: Mon,  6 Apr 2015 22:55:49 -0400
Message-ID: <1428375350-9213-15-git-send-email-Waiman.Long@hp.com>
In-Reply-To: <1428375350-9213-1-git-send-email-Waiman.Long@hp.com>

In pv_scan_next(), the slow cmpxchg atomic operation is performed even
if the other CPU is nowhere near being halted. This extra cmpxchg can
hurt slowpath performance.

This patch introduces a new mayhalt flag that indicates whether the
other spinning CPU is close to being halted. The flag is set once the
spinner has MAYHALT_THRESHOLD (SPIN_THRESHOLD >> 4) iterations left,
which is currently 2k cpu_relax() calls on x86. If the flag is not set,
the other spinning CPU will make at least 2k more cpu_relax() calls
before it can enter the halt state. This should give enough time for
the setting of the locked flag in struct mcs_spinlock to propagate to
that CPU without using an atomic operation.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/qspinlock_paravirt.h |   28 +++++++++++++++++++++++++---
 1 files changed, 25 insertions(+), 3 deletions(-)
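
To make the threshold arithmetic above concrete, here is a small
user-space sketch that counts how many cpu_relax() calls a spinner
makes before and after the point where mayhalt would be set. It is an
illustration only, not part of the patch: SPIN_THRESHOLD is assumed to
be (1 << 15), the value implied by the 2k figure, and the names
spin_budget and count_spin_budget() are hypothetical.

```c
#include <assert.h>

/* Assumption: x86 value of SPIN_THRESHOLD, implied by the 2k figure. */
#define SPIN_THRESHOLD		(1 << 15)
/* Same definition as in the patch. */
#define MAYHALT_THRESHOLD	(SPIN_THRESHOLD >> 4)

/* Hypothetical helper type: spin iterations on each side of the flag. */
struct spin_budget {
	unsigned int relax_before_flag;	/* iterations with mayhalt clear */
	unsigned int relax_after_flag;	/* iterations with mayhalt set */
};

/*
 * Walk the same countdown as the spin loop in pv_wait_node() and count
 * the iterations before and after loop == mayhalt_threshold.
 */
static struct spin_budget count_spin_budget(unsigned int spin_threshold,
					    unsigned int mayhalt_threshold)
{
	struct spin_budget b = { 0, 0 };
	unsigned int loop;
	int flagged = 0;

	for (loop = spin_threshold; loop; loop--) {
		if (loop == mayhalt_threshold)
			flagged = 1;	/* the patch does xchg(&pn->mayhalt, true) here */
		if (flagged)
			b.relax_after_flag++;
		else
			b.relax_before_flag++;
	}

	/* Every iteration falls on exactly one side of the flag. */
	assert(b.relax_before_flag + b.relax_after_flag == spin_threshold);
	return b;
}
```

Under these assumptions, count_spin_budget(SPIN_THRESHOLD,
MAYHALT_THRESHOLD) reports 30720 iterations with the flag clear and
2048 with it set, so a lock holder that observes mayhalt clear knows
the spinner still has at least 2048 cpu_relax() calls to go.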

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index a210061..a9fe10d 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -16,7 +16,8 @@
  * native_queue_spin_unlock().
  */
 
-#define _Q_SLOW_VAL	(3U << _Q_LOCKED_OFFSET)
+#define _Q_SLOW_VAL		(3U << _Q_LOCKED_OFFSET)
+#define MAYHALT_THRESHOLD	(SPIN_THRESHOLD >> 4)
 
 /*
  * The vcpu_hashed is a special state that is set by the new lock holder on
@@ -36,6 +37,7 @@ struct pv_node {
 
 	int			cpu;
 	u8			state;
+	u8			mayhalt;
 };
 
 /*
@@ -187,6 +189,7 @@ static void pv_init_node(struct mcs_spinlock *node)
 
 	pn->cpu = smp_processor_id();
 	pn->state = vcpu_running;
+	pn->mayhalt = false;
 }
 
 /*
@@ -203,17 +206,27 @@ static void pv_wait_node(struct mcs_spinlock *node)
 		for (loop = SPIN_THRESHOLD; loop; loop--) {
 			if (READ_ONCE(node->locked))
 				return;
+			if (loop == MAYHALT_THRESHOLD)
+				xchg(&pn->mayhalt, true);
 			cpu_relax();
 		}
 
 		/*
-		 * Order pn->state vs pn->locked thusly:
+		 * Order pn->state/pn->mayhalt vs pn->locked thusly:
 		 *
-		 * [S] pn->state = vcpu_halted	  [S] next->locked = 1
+		 * [S] pn->mayhalt = 1		  [S] next->locked = 1
+		 *     MB, delay		      barrier()
+		 * [S] pn->state = vcpu_halted	  [L] pn->mayhalt
 		 *     MB			      MB
 		 * [L] pn->locked		[RmW] pn->state = vcpu_hashed
 		 *
 		 * Matches the cmpxchg() from pv_scan_next().
+		 *
+		 * As the new lock holder may quit (when pn->mayhalt is not
+		 * set) without memory barrier, a sufficiently long delay is
+		 * inserted between the setting of pn->mayhalt and pn->state
+		 * to ensure that there is enough time for the new pn->locked
+		 * value to be propagated here to be checked below.
 		 */
 		(void)xchg(&pn->state, vcpu_halted);
 
@@ -226,6 +239,7 @@ static void pv_wait_node(struct mcs_spinlock *node)
 		 * needs to move on to pv_wait_head().
 		 */
 		(void)cmpxchg(&pn->state, vcpu_halted, vcpu_running);
+		pn->mayhalt = false;
 	}
 
 	/*
@@ -246,6 +260,14 @@ static void pv_scan_next(struct qspinlock *lock, struct mcs_spinlock *node)
 	struct __qspinlock *l = (void *)lock;
 
 	/*
+	 * If mayhalt is not set, there is enough time for the just set value
+	 * in pn->locked to be propagated to the other CPU before it is time
+	 * to halt.
+	 */
+	if (!READ_ONCE(pn->mayhalt))
+		return;
+
+	/*
 	 * Transition CPU state: halted => hashed
 	 * Quit if the transition failed.
 	 */
-- 
1.7.1
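
For readers who want to see the decision made in pv_scan_next() in
isolation, the following is a single-threaded user-space model written
with C11 atomics. It is a sketch under stated assumptions, not the
kernel code: all model_* and MODEL_* names are hypothetical, the lock
hashing and _Q_SLOW_VAL handling that follow a successful transition
are left out, and the timing argument the patch relies on cannot be
captured by a model like this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vcpu_* states used by the patch. */
enum model_vcpu_state { MODEL_RUNNING, MODEL_HALTED, MODEL_HASHED };

/* What the modelled lock holder ended up doing. */
enum model_scan_result { SCAN_SKIPPED, SCAN_NOT_HALTED, SCAN_HASHED };

/* Reduced model of struct pv_node: only the two fields of interest. */
struct model_pv_node {
	atomic_int	state;
	atomic_bool	mayhalt;
};

/* Model of the check the patch adds at the top of pv_scan_next(). */
static enum model_scan_result model_scan_next(struct model_pv_node *pn)
{
	int expected = MODEL_HALTED;

	/* Plain load, standing in for READ_ONCE(pn->mayhalt). */
	if (!atomic_load_explicit(&pn->mayhalt, memory_order_relaxed))
		return SCAN_SKIPPED;		/* no cmpxchg issued */

	/* Transition halted => hashed; quit if the transition failed. */
	if (!atomic_compare_exchange_strong(&pn->state, &expected,
					    MODEL_HASHED))
		return SCAN_NOT_HALTED;

	return SCAN_HASHED;
}

/* Build a node in the given condition and run one scan over it. */
static enum model_scan_result model_scan_with(bool mayhalt, int state)
{
	struct model_pv_node pn;
	enum model_scan_result r;

	atomic_init(&pn.state, state);
	atomic_init(&pn.mayhalt, mayhalt);
	r = model_scan_next(&pn);

	/* The state changes only when the cmpxchg succeeded. */
	assert(atomic_load(&pn.state) ==
	       (r == SCAN_HASHED ? MODEL_HASHED : state));
	return r;
}
```

In this model, model_scan_with(false, MODEL_RUNNING) returns
SCAN_SKIPPED without touching pn.state, which is the case the patch
optimizes. model_scan_with(true, MODEL_RUNNING) still pays for a failed
cmpxchg, and model_scan_with(true, MODEL_HALTED) performs the
halted => hashed transition.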

