From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ali Saidi <alisaidi@amazon.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Steve Capper <steve.capper@arm.com>,
	Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 13/36] locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
Date: Mon, 26 Apr 2021 09:29:55 +0200
Message-ID: <20210426072819.244069774@linuxfoundation.org>
In-Reply-To: <20210426072818.777662399@linuxfoundation.org>

From: Ali Saidi <alisaidi@amazon.com>

[ Upstream commit 84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896 ]

While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock.  The writer side loops
checking the value with atomic_cond_read_acquire(), but only truly
acquires the lock when the compare-and-exchange completes
successfully, and that cmpxchg is relaxed, so it isn't ordered.  This
exposes the window between the acquire and the cmpxchg to an A-B-A
problem, which allows reads following the lock acquisition to observe
values speculatively before the write lock is truly acquired.
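
The pre-fix loop can be sketched in userspace C11 atomics as follows.
This is only an illustration under stated assumptions: struct
demo_qrwlock, the DEMO_QW_* constants and demo_write_lock_unordered()
are hypothetical names, and C11 memory orders are used as a rough
stand-in for the kernel's atomic_cond_read_acquire() and
atomic_cmpxchg_relaxed().

```c
#include <stdatomic.h>

/* Illustrative stand-ins for _QW_WAITING and _QW_LOCKED (values assumed). */
enum { DEMO_QW_WAITING = 0x100, DEMO_QW_LOCKED = 0x0ff };

/* Hypothetical userspace lock word; not the kernel's struct qrwlock. */
struct demo_qrwlock {
	atomic_int cnts;
};

/*
 * Pre-fix shape of the final loop: the polling load has acquire
 * semantics, but the cmpxchg that actually takes the lock is relaxed.
 */
static void demo_write_lock_unordered(struct demo_qrwlock *lock)
{
	int expected;

	do {
		/* Spin until only the "writer waiting" state remains. */
		while (atomic_load_explicit(&lock->cnts,
					    memory_order_acquire) != DEMO_QW_WAITING)
			;
		/*
		 * Window: a reader may take and drop the lock here, so
		 * cnts goes WAITING -> reader held -> WAITING (A-B-A).
		 */
		expected = DEMO_QW_WAITING;
	} while (!atomic_compare_exchange_strong_explicit(&lock->cnts,
				&expected, DEMO_QW_LOCKED,
				memory_order_relaxed, memory_order_relaxed));
	/*
	 * Later loads are ordered only after the acquire load above,
	 * not after the relaxed cmpxchg that took the lock.
	 */
}
```

The cmpxchg still succeeds after a reader has come and gone, because it
only compares the lock word, and nothing orders the writer's subsequent
loads after that successful cmpxchg.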

We've seen a problem in epoll where the reader does an xchg while
holding the read lock, but the writer can see a value change out from
under it.

  Writer                                | Reader
  --------------------------------------------------------------------------------
  ep_scan_ready_list()                  |
  |- write_lock_irq()                   |
      |- queued_write_lock_slowpath()   |
        |- atomic_cond_read_acquire()   |
                                        | read_lock_irqsave(&ep->lock, flags);
     --> (observes value before unlock) |  chain_epi_lockless()
     |                                  |    epi->next = xchg(&ep->ovflist, epi);
     |                                  | read_unlock_irqrestore(&ep->lock, flags);
     |                                  |
     |     atomic_cmpxchg_relaxed()     |
     |-- READ_ONCE(ep->ovflist);        |
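
The reader-side xchg in the diagram is a lockless list push. A minimal
userspace sketch of just that one step, assuming hypothetical types
struct demo_item and struct demo_list and a hypothetical helper
demo_chain_lockless() (the real chain_epi_lockless() does more than
this single line):

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node, standing in for struct epitem. */
struct demo_item {
	struct demo_item *next;
	int id;
};

/* Hypothetical list head, standing in for ep->ovflist. */
struct demo_list {
	_Atomic(struct demo_item *) head;
};

/*
 * Push item at the head without a lock: atomically swap the head
 * pointer for the new item and link the old head behind it.
 * atomic_exchange() is sequentially consistent by default.
 */
static void demo_chain_lockless(struct demo_list *list,
				struct demo_item *item)
{
	item->next = atomic_exchange(&list->head, item);
}
```

Several readers can run this concurrently under the read lock, which is
why the writer must not read the list head until it truly owns the
write lock.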

A core can order the read of the ovflist ahead of the
atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire
semantics addresses this issue, at which point the
atomic_cond_read_acquire() can be switched to use relaxed semantics.
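
The fixed ordering can be sketched in userspace C11 atomics as follows.
This is an illustration under stated assumptions, not the kernel code:
struct sketch_qrwlock, the SKETCH_QW_* constants and
sketch_write_lock_acquire() are hypothetical names, and
atomic_compare_exchange_strong_explicit() is used as a rough analogue
of atomic_try_cmpxchg_acquire(), since both write the observed value
back into the expected variable on failure.

```c
#include <stdatomic.h>

/* Illustrative stand-ins for _QW_WAITING and _QW_LOCKED (values assumed). */
enum { SKETCH_QW_WAITING = 0x100, SKETCH_QW_LOCKED = 0x0ff };

/* Hypothetical userspace lock word; not the kernel's struct qrwlock. */
struct sketch_qrwlock {
	atomic_int cnts;
};

/*
 * Post-fix shape of the final loop: poll with a relaxed load, then
 * take the lock with an acquire cmpxchg.
 */
static void sketch_write_lock_acquire(struct sketch_qrwlock *lock)
{
	int cnts;

	do {
		/* Relaxed spin until only the "writer waiting" state remains. */
		do {
			cnts = atomic_load_explicit(&lock->cnts,
						    memory_order_relaxed);
		} while (cnts != SKETCH_QW_WAITING);
		/*
		 * Acquire on success: loads after the loop cannot be
		 * satisfied before the cmpxchg that took the lock.
		 */
	} while (!atomic_compare_exchange_strong_explicit(&lock->cnts,
				&cnts, SKETCH_QW_LOCKED,
				memory_order_acquire, memory_order_relaxed));
}
```

Placing the acquire on the operation that actually takes the lock
closes the window between the poll and the cmpxchg, so the polling load
no longer needs any ordering of its own.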

Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock")
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
[peterz: use try_cmpxchg()]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/locking/qrwlock.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index fe9ca92faa2a..909b0bf22a1e 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -61,6 +61,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	int cnts;
+
 	/* Put the writer into the wait queue */
 	arch_spin_lock(&lock->wait_lock);
 
@@ -74,9 +76,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
-		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
-	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
-					_QW_LOCKED) != _QW_WAITING);
+		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
 }
-- 
2.30.2
