linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Will Deacon <will.deacon@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	andrea.parri@amarulasolutions.com, longman@redhat.com,
	Ingo Molnar <mingo@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.14 29/72] locking/qspinlock, x86: Provide liveness guarantee
Date: Thu, 20 Dec 2018 10:18:28 +0100	[thread overview]
Message-ID: <20181220085923.488666294@linuxfoundation.org> (raw)
In-Reply-To: <20181220085922.332225035@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -> (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -> (0,0,0)

 4)					lock
					  trylock -> (0,0,1)

 5)			  tas-pending -> (0,1,1)
			  load-val <- (0,1,0) from 3

 6)			  clear-pending-set-locked -> (0,0,1)

			  FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
   example, such that we cannot observe other state prior to setting
   pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.

Suggested-by: Will Deacon <will.deacon@arm.com>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bigeasy: GEN_BINARY_RMWcc macro redo]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/include/asm/qspinlock.h | 21 +++++++++++++++++++++
 kernel/locking/qspinlock.c       | 17 ++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 2cb6624acaec..f784b95e44df 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -5,9 +5,30 @@
 #include <asm/cpufeature.h>
 #include <asm-generic/qspinlock_types.h>
 #include <asm/paravirt.h>
+#include <asm/rmwcc.h>
 
 #define _Q_PENDING_LOOPS	(1 << 9)
 
+#define queued_fetch_set_pending_acquire queued_fetch_set_pending_acquire
+
+static __always_inline bool __queued_RMW_btsl(struct qspinlock *lock)
+{
+	GEN_BINARY_RMWcc(LOCK_PREFIX "btsl", lock->val.counter,
+			 "I", _Q_PENDING_OFFSET, "%0", c);
+}
+
+static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
+{
+	u32 val = 0;
+
+	if (__queued_RMW_btsl(lock))
+		val |= _Q_PENDING_VAL;
+
+	val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;
+
+	return val;
+}
+
 #define	queued_spin_unlock queued_spin_unlock
 /**
  * queued_spin_unlock - release a queued spinlock
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 9ffc2f9af8b8..1011a1b292ac 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -225,6 +225,20 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 }
 #endif /* _Q_PENDING_BITS == 8 */
 
+/**
+ * queued_fetch_set_pending_acquire - fetch the whole lock value and set pending
+ * @lock : Pointer to queued spinlock structure
+ * Return: The previous lock value
+ *
+ * *,*,* -> *,1,*
+ */
+#ifndef queued_fetch_set_pending_acquire
+static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
+{
+	return atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
+}
+#endif
+
 /**
  * set_locked - Set the lock bit and own the lock
  * @lock: Pointer to queued spinlock structure
@@ -323,7 +337,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * 0,0,0 -> 0,0,1 ; trylock
 	 * 0,0,1 -> 0,1,1 ; pending
 	 */
-	val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
+	val = queued_fetch_set_pending_acquire(lock);
+
 	/*
 	 * If we observe any contention; undo and queue.
 	 */
-- 
2.19.1




  parent reply	other threads:[~2018-12-20  9:27 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-20  9:17 [PATCH 4.14 00/72] 4.14.90-stable review Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 01/72] timer/debug: Change /proc/timer_list from 0444 to 0400 Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 02/72] pinctrl: sunxi: a83t: Fix IRQ offset typo for PH11 Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 03/72] aio: fix spectre gadget in lookup_ioctx Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 04/72] userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 05/72] arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 06/72] MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310 Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 07/72] mmc: sdhci: fix the timeout check window for clock and reset Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 08/72] fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 09/72] ARM: mmp/mmp2: fix cpu_is_mmp2() on mmp2-dt Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 10/72] dm thin: send event about thin-pool state change _after_ making it Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 11/72] dm cache metadata: verify cache has blocks in blocks_are_clean_separate_dirty() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 12/72] tracing: Fix memory leak in set_trigger_filter() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 13/72] tracing: Fix memory leak of instance function hash filters Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 14/72] powerpc/msi: Fix NULL pointer access in teardown code Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 15/72] drm/nouveau/kms: Fix memory leak in nv50_mstm_del() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 16/72] Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec" Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 17/72] drm/i915/execlists: Apply a full mb before execution for Braswell Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 18/72] drm/amdgpu: update SMC firmware image for polaris10 variants Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 19/72] x86/build: Fix compiler support check for CONFIG_RETPOLINE Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 20/72] locking: Remove smp_read_barrier_depends() from queued_spin_lock_slowpath() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 21/72] locking/qspinlock: Ensure node is initialised before updating prev->next Greg Kroah-Hartman
2018-12-20 12:19   ` Sudip Mukherjee
2018-12-20 15:41     ` Greg Kroah-Hartman
2018-12-20 19:20       ` Sudip Mukherjee
2018-12-20  9:18 ` [PATCH 4.14 22/72] locking/qspinlock: Bound spinning on pending->locked transition in slowpath Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 23/72] locking/qspinlock: Merge struct __qspinlock into struct qspinlock Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 24/72] locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 25/72] locking/qspinlock: Remove duplicate clear_pending() function from PV code Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 26/72] locking/qspinlock: Kill cmpxchg() loop when claiming lock from head of queue Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 27/72] locking/qspinlock: Re-order code Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 28/72] locking/qspinlock/x86: Increase _Q_PENDING_LOOPS upper bound Greg Kroah-Hartman
2018-12-20  9:18 ` Greg Kroah-Hartman [this message]
2018-12-20 12:14   ` [PATCH 4.14 29/72] locking/qspinlock, x86: Provide liveness guarantee Sudip Mukherjee
2018-12-20 15:40     ` Greg Kroah-Hartman
2018-12-20 18:49       ` Sudip Mukherjee
2018-12-20 23:05         ` Sebastian Andrzej Siewior
2018-12-21 12:59           ` Sudip Mukherjee
2018-12-20  9:18 ` [PATCH 4.14 30/72] elevator: lookup mq vs non-mq elevators Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 31/72] netfilter: ipset: Fix wraparound in hash:*net* types Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 32/72] mac80211: dont WARN on bad WMM parameters from buggy APs Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 33/72] mac80211: Fix condition validating WMM IE Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 34/72] IB/hfi1: Remove race conditions in user_sdma send path Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 35/72] locking/qspinlock: Fix build for anonymous union in older GCC compilers Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 36/72] mac80211_hwsim: fix module init error paths for netlink Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 37/72] Input: hyper-v - fix wakeup from suspend-to-idle Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 38/72] scsi: libiscsi: Fix NULL pointer dereference in iscsi_eh_session_reset Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 39/72] scsi: vmw_pscsi: Rearrange code to avoid multiple calls to free_irq during unload Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 40/72] x86/earlyprintk/efi: Fix infinite loop on some screen widths Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 41/72] drm/msm: Grab a vblank reference when waiting for commit_done Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 42/72] ARC: io.h: Implement reads{x}()/writes{x}() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 43/72] bonding: fix 802.3ad state sent to partner when unbinding slave Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 44/72] bpf: Fix verifier log string check for bad alignment Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 45/72] nfs: dont dirty kernel pages read by direct-io Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 46/72] SUNRPC: Fix a potential race in xprt_connect() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 47/72] sbus: char: add of_node_put() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 48/72] drivers/sbus/char: " Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 49/72] drivers/tty: add missing of_node_put() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 50/72] ide: pmac: add of_node_put() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 51/72] drm/msm: Fix error return checking Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 52/72] clk: mvebu: Off by one bugs in cp110_of_clk_get() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 53/72] clk: mmp: Off by one in mmp_clk_add() Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 54/72] Input: synaptics - enable SMBus for HP 15-ay000 Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 55/72] Input: omap-keypad - fix keyboard debounce configuration Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 56/72] libata: whitelist all SAMSUNG MZ7KM* solid-state disks Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 57/72] mv88e6060: disable hardware level MAC learning Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 58/72] net/mlx4_en: Fix build break when CONFIG_INET is off Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 59/72] ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling Greg Kroah-Hartman
2018-12-20  9:18 ` [PATCH 4.14 60/72] ARM: 8815/1: V7M: align v7m_dma_inv_range() with v7 counterpart Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 61/72] ethernet: fman: fix wrong of_node_put() in probe function Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 62/72] drm/ast: Fix connector leak during driver unload Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 63/72] cifs: In Kconfig CONFIG_CIFS_POSIX needs depends on legacy (insecure cifs) Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 64/72] vhost/vsock: fix reset orphans race with close timeout Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 65/72] mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 66/72] i2c: axxia: properly handle master timeout Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 67/72] i2c: scmi: Fix probe error on devices with an empty SMB0001 ACPI device node Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 68/72] i2c: uniphier: fix violation of tLOW requirement for Fast-mode Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 69/72] i2c: uniphier-f: " Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 70/72] nvmet-rdma: fix response use after free Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 71/72] rtc: snvs: Add timeouts to avoid kernel lockups Greg Kroah-Hartman
2018-12-20  9:19 ` [PATCH 4.14 72/72] bpf, arm: fix emit_ldx_r and emit_mov_i using TMP_REG_1 Greg Kroah-Hartman
2018-12-20 18:29 ` [PATCH 4.14 00/72] 4.14.90-stable review Guenter Roeck
2018-12-20 22:50 ` shuah
2018-12-21  1:13 ` Naresh Kamboju
     [not found] ` <CA+res+QQt7GEL0td1FraSJNnR2yOcDgXGahW+YyiuMSeWTshfw@mail.gmail.com>
2018-12-21  7:54   ` Jinpu Wang
2018-12-21 10:00     ` Greg Kroah-Hartman
2018-12-21  9:27 ` Jon Hunter
2018-12-21 10:00   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181220085923.488666294@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=andrea.parri@amarulasolutions.com \
    --cc=bigeasy@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).