All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Waiman Long <longman@redhat.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	John Donnelly <john.p.donnelly@oracle.com>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 5.15 68/69] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Date: Mon,  1 Aug 2022 13:47:32 +0200	[thread overview]
Message-ID: <20220801114137.213981995@linuxfoundation.org> (raw)
In-Reply-To: <20220801114134.468284027@linuxfoundation.org>

From: Waiman Long <longman@redhat.com>

commit 6eebd5fb20838f5971ba17df9f55cc4f84a31053 upstream.

With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
consistent"), the writer that sets the handoff bit can be interrupted
out without clearing the bit if the wait queue isn't empty. This disables
reader and writer optimistic lock spinning and stealing.

Now if a non-first writer in the queue is somehow woken up or a new
waiter enters the slowpath, it can't acquire the lock.  This is not the
case before commit d257cc8cb8d5 as the writer that set the handoff bit
will clear it when exiting out via the out_nolock path. This is less
efficient as the busy rwsem stays in an unlock state for a longer time.

In some cases, this new behavior may cause lockups as shown in [1] and
[2].

This patch allows a non-first writer to ignore the handoff bit if it
is not originally set or initiated by the first waiter. This patch is
shown to be effective in fixing the lockup problem reported in [1].

[1] https://lore.kernel.org/lkml/20220617134325.GC30825@techsingularity.net/
[2] https://lore.kernel.org/lkml/3f02975c-1a9d-be20-32cf-f1d8e3dfafcc@oracle.com/

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Link: https://lore.kernel.org/r/20220622200419.778799-1-longman@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/locking/rwsem.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -335,8 +335,6 @@ struct rwsem_waiter {
 	struct task_struct *task;
 	enum rwsem_waiter_type type;
 	unsigned long timeout;
-
-	/* Writer only, not initialized in reader */
 	bool handoff_set;
 };
 #define rwsem_first_waiter(sem) \
@@ -456,10 +454,12 @@ static void rwsem_mark_wake(struct rw_se
 			 * to give up the lock), request a HANDOFF to
 			 * force the issue.
 			 */
-			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
-			    time_after(jiffies, waiter->timeout)) {
-				adjustment -= RWSEM_FLAG_HANDOFF;
-				lockevent_inc(rwsem_rlock_handoff);
+			if (time_after(jiffies, waiter->timeout)) {
+				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
+					adjustment -= RWSEM_FLAG_HANDOFF;
+					lockevent_inc(rwsem_rlock_handoff);
+				}
+				waiter->handoff_set = true;
 			}
 
 			atomic_long_add(-adjustment, &sem->count);
@@ -569,7 +569,7 @@ static void rwsem_mark_wake(struct rw_se
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 					struct rwsem_waiter *waiter)
 {
-	bool first = rwsem_first_waiter(sem) == waiter;
+	struct rwsem_waiter *first = rwsem_first_waiter(sem);
 	long count, new;
 
 	lockdep_assert_held(&sem->wait_lock);
@@ -579,11 +579,20 @@ static inline bool rwsem_try_write_lock(
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
 		if (has_handoff) {
-			if (!first)
+			/*
+			 * Honor handoff bit and yield only when the first
+			 * waiter is the one that set it. Otherwisee, we
+			 * still try to acquire the rwsem.
+			 */
+			if (first->handoff_set && (waiter != first))
 				return false;
 
-			/* First waiter inherits a previously set handoff bit */
-			waiter->handoff_set = true;
+			/*
+			 * First waiter can inherit a previously set handoff
+			 * bit and spin on rwsem if lock acquisition fails.
+			 */
+			if (waiter == first)
+				waiter->handoff_set = true;
 		}
 
 		new = count;
@@ -978,6 +987,7 @@ queue:
 	waiter.task = current;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
+	waiter.handoff_set = false;
 
 	raw_spin_lock_irq(&sem->wait_lock);
 	if (list_empty(&sem->wait_list)) {



  parent reply	other threads:[~2022-08-01 12:07 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-01 11:46 [PATCH 5.15 00/69] 5.15.59-rc1 review Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 01/69] Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 02/69] Revert "ocfs2: mount shared volume without ha stack" Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 03/69] ntfs: fix use-after-free in ntfs_ucsncmp() Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 04/69] fs: sendfile handles O_NONBLOCK of out_fd Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 05/69] secretmem: fix unhandled fault in truncate Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 06/69] mm: fix page leak with multiple threads mapping the same page Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 07/69] hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 08/69] asm-generic: remove a broken and needless ifdef conditional Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 09/69] s390/archrandom: prevent CPACF trng invocations in interrupt context Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 10/69] nouveau/svm: Fix to migrate all requested pages Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 11/69] drm/simpledrm: Fix return type of simpledrm_simple_display_pipe_mode_valid() Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 12/69] watch_queue: Fix missing rcu annotation Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 13/69] watch_queue: Fix missing locking in add_watch_to_object() Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 14/69] tcp: Fix data-races around sysctl_tcp_dsack Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 15/69] tcp: Fix a data-race around sysctl_tcp_app_win Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 16/69] tcp: Fix a data-race around sysctl_tcp_adv_win_scale Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 17/69] tcp: Fix a data-race around sysctl_tcp_frto Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 18/69] tcp: Fix a data-race around sysctl_tcp_nometrics_save Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 19/69] tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 20/69] ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS) Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 21/69] ice: do not setup vlan for loopback VSI Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 22/69] scsi: ufs: host: Hold reference returned by of_parse_phandle() Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 23/69] Revert "tcp: change pingpong threshold to 3" Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 24/69] octeontx2-pf: Fix UDP/TCP src and dst port tc filters Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 25/69] tcp: Fix data-races around sysctl_tcp_moderate_rcvbuf Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 26/69] tcp: Fix a data-race around sysctl_tcp_limit_output_bytes Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 27/69] tcp: Fix a data-race around sysctl_tcp_challenge_ack_limit Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 28/69] scsi: core: Fix warning in scsi_alloc_sgtables() Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 29/69] scsi: mpt3sas: Stop fw fault watchdog work item during system shutdown Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 30/69] net: ping6: Fix memleak in ipv6_renew_options() Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 31/69] ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 32/69] net/tls: Remove the context from the list in tls_device_down Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 33/69] igmp: Fix data-races around sysctl_igmp_qrv Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 34/69] net: pcs: xpcs: propagate xpcs_read error to xpcs_get_state_c37_sgmii Greg Kroah-Hartman
2022-08-01 11:46 ` [PATCH 5.15 35/69] net: sungem_phy: Add of_node_put() for reference returned by of_get_parent() Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 36/69] tcp: Fix a data-race around sysctl_tcp_min_tso_segs Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 37/69] tcp: Fix a data-race around sysctl_tcp_min_rtt_wlen Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 38/69] tcp: Fix a data-race around sysctl_tcp_autocorking Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 39/69] tcp: Fix a data-race around sysctl_tcp_invalid_ratelimit Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 40/69] Documentation: fix sctp_wmem in ip-sysctl.rst Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 41/69] macsec: fix NULL deref in macsec_add_rxsa Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 42/69] macsec: fix error message in macsec_add_rxsa and _txsa Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 43/69] macsec: limit replay window size with XPN Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 44/69] macsec: always read MACSEC_SA_ATTR_PN as a u64 Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 45/69] net: macsec: fix potential resource leak in macsec_add_rxsa() and macsec_add_txsa() Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 46/69] net: mld: fix reference count leak in mld_{query | report}_work() Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 47/69] tcp: Fix data-races around sk_pacing_rate Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 48/69] net: Fix data-races around sysctl_[rw]mem(_offset)? Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 49/69] tcp: Fix a data-race around sysctl_tcp_comp_sack_delay_ns Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 50/69] tcp: Fix a data-race around sysctl_tcp_comp_sack_slack_ns Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 51/69] tcp: Fix a data-race around sysctl_tcp_comp_sack_nr Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 52/69] tcp: Fix data-races around sysctl_tcp_reflect_tos Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 53/69] ipv4: Fix data-races around sysctl_fib_notify_on_flag_change Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 54/69] i40e: Fix interface init with MSI interrupts (no MSI-X) Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 55/69] sctp: fix sleep in atomic context bug in timer handlers Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 56/69] octeontx2-pf: cn10k: Fix egress ratelimit configuration Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 57/69] netfilter: nf_queue: do not allow packet truncation below transport header offset Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 58/69] virtio-net: fix the race between refill work and close Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 59/69] perf symbol: Correct address for bss symbols Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 60/69] sfc: disable softirqs for ptp TX Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 61/69] sctp: leave the err path free in sctp_stream_init to sctp_stream_free Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 62/69] ARM: crypto: comment out gcc warning that breaks clang builds Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 63/69] mm/hmm: fault non-owner device private entries Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 64/69] page_alloc: fix invalid watermark check on a negative value Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 65/69] ARM: 9216/1: Fix MAX_DMA_ADDRESS overflow Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 66/69] EDAC/ghes: Set the DIMM label unconditionally Greg Kroah-Hartman
2022-08-01 11:47 ` [PATCH 5.15 67/69] docs/kernel-parameters: Update descriptions for "mitigations=" param with retbleed Greg Kroah-Hartman
2022-08-01 11:47 ` Greg Kroah-Hartman [this message]
2022-08-01 11:47 ` [PATCH 5.15 69/69] x86/bugs: Do not enable IBPB at firmware entry when IBPB is not available Greg Kroah-Hartman
2022-08-01 14:26 ` [PATCH 5.15 00/69] 5.15.59-rc1 review Jon Hunter
2022-08-01 21:19 ` Florian Fainelli
2022-08-01 21:53 ` Daniel Díaz
2022-08-01 22:18 ` Shuah Khan
2022-08-02  4:18 ` Bagas Sanjaya
2022-08-02  5:29 ` Guenter Roeck
2022-08-02 11:08 ` Ron Economos
2022-08-02 17:45 ` Sudip Mukherjee (Codethink)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220801114137.213981995@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=john.p.donnelly@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mgorman@techsingularity.net \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.