From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Mateusz Guzik <mjguzik@gmail.com>,
Tony Luck <tony.luck@intel.com>,
Nicholas Piggin <npiggin@gmail.com>,
Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Sasha Levin <sashal@kernel.org>,
ubizjak@gmail.com
Subject: [PATCH AUTOSEL 5.10 17/17] lockref: stop doing cpu_relax in the cmpxchg loop
Date: Mon, 16 Jan 2023 09:04:48 -0500 [thread overview]
Message-ID: <20230116140448.116034-17-sashal@kernel.org> (raw)
In-Reply-To: <20230116140448.116034-1-sashal@kernel.org>
From: Mateusz Guzik <mjguzik@gmail.com>
[ Upstream commit f5fe24ef17b5fbe6db49534163e77499fb10ae8c ]
On the x86-64 architecture even a failing cmpxchg grants exclusive
access to the cacheline, making it preferable to retry the failed op
immediately instead of stalling with the pause instruction.
To illustrate the impact, below are benchmark results obtained by
running various will-it-scale tests on top of the 6.2-rc3 kernel and
Cascade Lake (2 sockets * 24 cores * 2 threads) CPU.
All results in ops/s. Note there is some variance in re-runs, but the
code is consistently faster when contention is present.
open3 ("Same file open/close"):
proc stock no-pause
1 805603 814942 (+%1)
2 1054980 1054781 (-0%)
8 1544802 1822858 (+18%)
24 1191064 2199665 (+84%)
48 851582 1469860 (+72%)
96 609481 1427170 (+134%)
fstat2 ("Same file fstat"):
proc stock no-pause
1 3013872 3047636 (+1%)
2 4284687 4400421 (+2%)
8 3257721 5530156 (+69%)
24 2239819 5466127 (+144%)
48 1701072 5256609 (+209%)
96 1269157 6649326 (+423%)
Additionally, a kernel with a private patch to help access() scalability:
access2 ("Same file access"):
proc stock patched patched
+nopause
24 2378041 2005501 5370335 (-15% / +125%)
That is, fixing the problems in access itself *reduces* scalability
after the cacheline ping-pong only happens in lockref with the pause
instruction.
Note that fstat and access benchmarks are not currently integrated into
will-it-scale, but interested parties can find them in pull requests to
said project.
Code at hand has a rather tortured history. First modification showed
up in commit d472d9d98b46 ("lockref: Relax in cmpxchg loop"), written
with Itanium in mind. Later it got patched up to use an arch-dependent
macro to stop doing it on s390 where it caused a significant regression.
Said macro had undergone revisions and was ultimately eliminated later,
going back to cpu_relax.
While I intended to only remove cpu_relax for x86-64, I got the
following comment from Linus:
I would actually prefer just removing it entirely and see if
somebody else hollers. You have the numbers to prove it hurts on
real hardware, and I don't think we have any numbers to the
contrary.
So I think it's better to trust the numbers and remove it as a
failure, than say "let's just remove it on x86-64 and leave
everybody else with the potentially broken code"
Additionally, Will Deacon (maintainer of the arm64 port, one of the
architectures previously benchmarked):
So, from the arm64 side of the fence, I'm perfectly happy just
removing the cpu_relax() calls from lockref.
As such, come back full circle in history and whack it altogether.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=WdA@mail.gmail.com/
Acked-by: Tony Luck <tony.luck@intel.com> # ia64
Acked-by: Nicholas Piggin <npiggin@gmail.com> # powerpc
Acked-by: Will Deacon <will@kernel.org> # arm64
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
lib/lockref.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/lib/lockref.c b/lib/lockref.c
index 5b34bbd3eba8..81ac5f355242 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -24,7 +24,6 @@
} \
if (!--retry) \
break; \
- cpu_relax(); \
} \
} while (0)
--
2.35.1
prev parent reply other threads:[~2023-01-16 14:14 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-16 14:04 [PATCH AUTOSEL 5.10 01/17] scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 02/17] cpufreq: Add Tegra234 to cpufreq-dt-platdev blocklist Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 03/17] kcsan: test: don't put the expect array on the stack Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 04/17] ASoC: fsl_micfil: Correct the number of steps on SX controls Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 05/17] drm: Add orientation quirk for Lenovo ideapad D330-10IGL Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 06/17] s390/debug: add _ASM_S390_ prefix to header guard Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 07/17] arm64/mm: Define dummy pud_user_exec() when using 2-level page-table Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 08/17] cpufreq: armada-37xx: stop using 0 as NULL pointer Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 09/17] ASoC: fsl_ssi: Rename AC'97 streams to avoid collisions with AC'97 CODEC Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 10/17] ASoC: fsl-asoc-card: Fix naming of AC'97 CODEC widgets Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 11/17] spi: spidev: fix a race condition when accessing spidev->spi Sasha Levin
2023-01-16 14:40 ` Mark Brown
2023-01-23 0:59 ` Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 12/17] spi: spidev: remove debug messages that access spidev->spi without locking Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 13/17] KVM: s390: interrupt: use READ_ONCE() before cmpxchg() Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 14/17] scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 15/17] platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HD Sasha Levin
2023-01-16 14:04 ` [PATCH AUTOSEL 5.10 16/17] platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCK Sasha Levin
2023-01-16 14:04 ` Sasha Levin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230116140448.116034-17-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mjguzik@gmail.com \
--cc=npiggin@gmail.com \
--cc=peterz@infradead.org \
--cc=stable@vger.kernel.org \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
--cc=ubizjak@gmail.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).