stable.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ali Saidi <alisaidi@amazon.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Steve Capper <steve.capper@arm.com>,
	Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 13/36] locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
Date: Mon, 26 Apr 2021 09:29:55 +0200
Message-ID: <20210426072819.244069774@linuxfoundation.org>
In-Reply-To: <20210426072818.777662399@linuxfoundation.org>

From: Ali Saidi <alisaidi@amazon.com>

[ Upstream commit 84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896 ]

While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock.  The writer side loops
checking the value with atomic_cond_read_acquire(), but only truly
acquires the lock when the compare-and-exchange completes
successfully, and that operation is not ordered. This exposes the
window between the acquire and the cmpxchg to an A-B-A problem, which
allows reads following the lock acquisition to observe values
speculatively before the write lock is truly acquired.

We've seen a problem in epoll where the reader does an xchg while
holding the read lock, but the writer can see a value change out from
under it.

  Writer                                | Reader
  --------------------------------------------------------------------------------
  ep_scan_ready_list()                  |
  |- write_lock_irq()                   |
      |- queued_write_lock_slowpath()   |
        |- atomic_cond_read_acquire()   |
                                        | read_lock_irqsave(&ep->lock, flags);
     --> (observes value before unlock) |  chain_epi_lockless()
     |                                  |    epi->next = xchg(&ep->ovflist, epi);
     |                                  | read_unlock_irqrestore(&ep->lock, flags);
     |                                  |
     |     atomic_cmpxchg_relaxed()     |
     |-- READ_ONCE(ep->ovflist);        |

A core can order the read of the ovflist ahead of the
atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire
semantics addresses this issue, at which point the
atomic_cond_read can be switched to use relaxed semantics.
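
As a rough user-space illustration of the resulting loop (not the
kernel code itself), the following C11 sketch uses <stdatomic.h> in
place of the kernel atomics. The function name
sketch_write_lock_tail() and the QW_* constants are assumptions made
for this example only.

```c
#include <stdatomic.h>

/* Illustrative stand-ins for the kernel's _QW_WAITING and _QW_LOCKED. */
#define QW_WAITING 0x100u
#define QW_LOCKED  0x0ffu

/*
 * Hypothetical user-space analogue of the tail of the write-lock
 * slowpath: spin with a relaxed load, then take the lock with a
 * compare-exchange that has acquire ordering on success.
 */
void sketch_write_lock_tail(atomic_uint *cnts)
{
	unsigned int val;

	do {
		/* Relaxed spin: no ordering is needed while merely waiting. */
		do {
			val = atomic_load_explicit(cnts, memory_order_relaxed);
		} while (val != QW_WAITING);

		/*
		 * The acquire sits on the operation that actually takes the
		 * lock, so later loads cannot be ordered ahead of it.
		 */
	} while (!atomic_compare_exchange_strong_explicit(cnts, &val, QW_LOCKED,
							  memory_order_acquire,
							  memory_order_relaxed));
}
```

If the value changes between the spin and the compare-exchange, the
compare-exchange fails and the outer loop goes back to spinning, which
mirrors the structure of the patched loop.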

Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock")
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
[peterz: use try_cmpxchg()]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/locking/qrwlock.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index fe9ca92faa2a..909b0bf22a1e 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -61,6 +61,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	int cnts;
+
 	/* Put the writer into the wait queue */
 	arch_spin_lock(&lock->wait_lock);
 
@@ -74,9 +76,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
-		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
-	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
-					_QW_LOCKED) != _QW_WAITING);
+		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
 }
-- 
2.30.2



