From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751883AbcGRUjl (ORCPT );
	Mon, 18 Jul 2016 16:39:41 -0400
Received: from g4t3425.houston.hpe.com ([15.241.140.78]:54566 "EHLO
	g4t3425.houston.hpe.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751801AbcGRUjh (ORCPT );
	Mon, 18 Jul 2016 16:39:37 -0400
From: Waiman Long
To: Ingo Molnar
Cc: Peter Zijlstra, linux-kernel@vger.kernel.org, Linus Torvalds,
	Ding Tianhong, Jason Low, Davidlohr Bueso, "Paul E. McKenney",
	Thomas Gleixner, Will Deacon, Tim Chen, Imre Deak, Waiman Long
Subject: [PATCH v4 3/3] locking/mutex: Ensure forward progress of waiter-spinner
Date: Mon, 18 Jul 2016 16:39:26 -0400
Message-Id: <1468874366-56955-4-git-send-email-Waiman.Long@hpe.com>
X-Mailer: git-send-email 1.7.1
In-Reply-To: <1468874366-56955-1-git-send-email-Waiman.Long@hpe.com>
References: <1468874366-56955-1-git-send-email-Waiman.Long@hpe.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Since both an optimistic spinner and a waiter-spinner (a task woken
from the wait queue that is spinning on the lock) can be spinning at
the same time, forward progress of the waiter-spinner cannot be
guaranteed. It is therefore possible, though unlikely, for the
waiter-spinner to be starved of the lock.

This patch adds a flag to indicate that a waiter-spinner is spinning
and hence has priority in acquiring the lock. A waiter-spinner sets
this flag while spinning; an optimistic spinner checks the flag and
backs off if it is set. This essentially lets the waiter-spinner jump
ahead of the optimistic spinning queue to acquire the lock.

There is no increase in the size of the mutex structure on 64-bit
architectures. On 32-bit architectures, the size increases by 4 bytes.

Signed-off-by: Waiman Long
---
 include/linux/mutex.h  |    1 +
 kernel/locking/mutex.c |   36 +++++++++++++++++++++++++++---------
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2cb7531..f8e91ad 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -57,6 +57,7 @@ struct mutex {
 #endif
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
 	struct optimistic_spin_queue osq; /* Spinner MCS lock */
+	int waiter_spinning;
 #endif
 #ifdef CONFIG_DEBUG_MUTEXES
 	void		*magic;
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index adb21aa..52225bd 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -55,6 +55,7 @@ __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
 	mutex_clear_owner(lock);
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
 	osq_lock_init(&lock->osq);
+	lock->waiter_spinning = false;
 #endif
 
 	debug_mutex_init(lock, name, key);
@@ -341,9 +342,21 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 		 */
 		if (!osq_lock(&lock->osq))
 			goto done;
+	} else {
+		/*
+		 * Turn on the waiter spinning flag to discourage the spinner
+		 * from getting the lock.
+		 */
+		lock->waiter_spinning = true;
 	}
 
-	while (true) {
+	/*
+	 * The cpu_relax_lowlatency() call is a compiler barrier which forces
+	 * everything in this loop to be re-loaded. We don't need memory
+	 * barriers as we'll eventually observe the right values at the cost
+	 * of a few extra spins.
+	 */
+	for (;; cpu_relax_lowlatency()) {
 		struct task_struct *owner;
 
 		if (use_ww_ctx && ww_ctx->acquired > 0) {
@@ -363,6 +376,17 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 		}
 
 		/*
+		 * For regular opt-spinner, it waits until the waiter_spinning
+		 * flag isn't set. This will ensure forward progress for
+		 * the waiter spinner.
+		 */
+		if (!waiter && READ_ONCE(lock->waiter_spinning)) {
+			if (need_resched())
+				break;
+			continue;
+		}
+
+		/*
 		 * If there's an owner, wait for it to either
 		 * release the lock or go to sleep.
 		 */
@@ -394,18 +418,12 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 		 */
 		if (!owner && (need_resched() || rt_task(task)))
 			break;
-
-		/*
-		 * The cpu_relax() call is a compiler barrier which forces
-		 * everything in this loop to be re-loaded. We don't need
-		 * memory barriers as we'll eventually observe the right
-		 * values at the cost of a few extra spins.
-		 */
-		cpu_relax_lowlatency();
 	}
 
 	if (!waiter)
 		osq_unlock(&lock->osq);
+	else
+		lock->waiter_spinning = false;
 done:
 	/*
 	 * If we fell out of the spin path because of need_resched(),
-- 
1.7.1
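
Note (illustration only, not part of the patch): the hand-off idea in the
commit message can be sketched in plain user-space C11, outside the mutex
code. A "waiter" thread advertises that it is spinning through a flag, and
ordinary spinners back off while the flag is set. All identifiers below
(toy_lock, lock_as_waiter(), and so on) are invented for the example, and it
assumes a single waiter-spinner at a time, as in the mutex case.

/*
 * Toy user-space sketch of the waiter-priority flag -- NOT kernel code.
 * Build with: cc -std=c11 -pthread sketch.c
 */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_lock {
	atomic_int  owner;		/* 0 = unlocked, otherwise holder id */
	atomic_bool waiter_spinning;	/* a waiter is currently spinning */
};

static struct toy_lock lk = { .owner = 0, .waiter_spinning = false };

/* Optimistic spinner: backs off while a waiter has priority. */
static void lock_optimistic(int id)
{
	int expected;

	for (;;) {
		if (atomic_load(&lk.waiter_spinning)) {
			sched_yield();		/* let the waiter go first */
			continue;
		}
		expected = 0;
		if (atomic_compare_exchange_weak(&lk.owner, &expected, id))
			return;
	}
}

/* Waiter-spinner: advertises itself so optimistic spinners yield. */
static void lock_as_waiter(int id)
{
	int expected;

	atomic_store(&lk.waiter_spinning, true);
	for (;;) {
		expected = 0;
		if (atomic_compare_exchange_weak(&lk.owner, &expected, id))
			break;
	}
	atomic_store(&lk.waiter_spinning, false);
}

static void unlock(void)
{
	atomic_store(&lk.owner, 0);
}

static void *optimistic_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++) {
		lock_optimistic(1);
		unlock();
	}
	return NULL;
}

static void *waiter_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < 1000; i++) {
		lock_as_waiter(2);
		unlock();
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, optimistic_thread, NULL);
	pthread_create(&b, NULL, waiter_thread, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("waiter made forward progress\n");
	return 0;
}

Like the flag in the patch, this is advisory rather than a strict hand-off:
an optimistic spinner can still win an occasional race between its flag
check and its compare-and-swap, but the waiter-spinner is no longer likely
to spin indefinitely without acquiring the lock.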