linux-kernel.vger.kernel.org archive mirror
* [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
       [not found] <CGME20230531003446epcas2p1fc55e0439a9c667685d495cd5f5b2e93@epcas2p1.samsung.com>
@ 2023-05-31  0:34 ` Bongkyu Kim
  2023-05-31 16:45   ` kernel test robot
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Bongkyu Kim @ 2023-05-31  0:34 UTC (permalink / raw)
  To: peterz, mingo, will, longman, boqun.feng
  Cc: linux-kernel, jwook1.kim, lakkyung.jung, Bongkyu Kim

Removing reader optimistic spinning has caused a great regression in
application startup performance on Android devices. In mobile environments,
reader optimistic spinning is still useful because there are not many
readers. So, re-enable reader optimistic spinning, but keep it disabled by
default; it can be turned on via the kernel command line.
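
For example (a usage sketch; the sysfs path is an assumption based on the
module_param_named() in this patch, which should expose the built-in
parameter under the "rwsem" namespace):

        # enable at boot, on the kernel command line
        rwsem.opt_rspin=1

        # or toggle at runtime (mode 0644 makes the parameter writable)
        echo 1 > /sys/module/rwsem/parameters/opt_rspin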

Test result:
This is the startup performance of 15 applications on our s5e8535 SoC
(lower is better).
- Cortex A78*2 + Cortex A55*6

Application             base  opt_rspin  Diff  Diff(%)
--------------------  ------  ---------  ----  -------
* Total(geomean)         343        330   -13    +3.8%
--------------------  ------  ---------  ----  -------
helloworld               110        108    -2    +1.8%
Amazon_Seller            397        388    -9    +2.3%
Whatsapp                 311        304    -7    +2.3%
Simple_PDF_Reader        500        463   -37    +7.4%
FaceApp                  330        317   -13    +3.9%
Timestamp_Camera_Free    451        443    -8    +1.8%
Kindle                   629        597   -32    +5.1%
Coinbase                 243        233   -10    +4.1%
Firefox                  425        399   -26    +6.1%
Candy_Crush_Soda         552        538   -14    +2.5%
Hill_Climb_Racing        245        230   -15    +6.1%
Call_Recorder            437        426   -11    +2.5%
Color_Fill_3D            190        180   -10    +5.3%
eToro                    512        505    -7    +1.4%
GroupMe                  281        266   -15    +5.3%
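
Diff(%) here is the relative improvement, (base - opt_rspin) / base * 100;
e.g. Kindle: (629 - 597) / 629 = 5.1%.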

Signed-off-by: Bongkyu Kim <bongkyu7.kim@samsung.com>
---
 .../admin-guide/kernel-parameters.txt         |   2 +
 kernel/locking/lock_events_list.h             |   5 +-
 kernel/locking/rwsem.c                        | 292 +++++++++++++++---
 3 files changed, 255 insertions(+), 44 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bb23a36a7ff7..b92a6b3f965f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5495,6 +5495,8 @@
 
 	rw		[KNL] Mount root device read-write on boot
 
+	rwsem.opt_rspin= [KNL] Use rwsem reader optimistic spinning
+
 	S		[KNL] Run init in single mode
 
 	s390_iommu=	[HW,S390]
diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 97fb6f3f840a..270a0d351932 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -56,9 +56,12 @@ LOCK_EVENT(rwsem_sleep_reader)	/* # of reader sleeps			*/
 LOCK_EVENT(rwsem_sleep_writer)	/* # of writer sleeps			*/
 LOCK_EVENT(rwsem_wake_reader)	/* # of reader wakeups			*/
 LOCK_EVENT(rwsem_wake_writer)	/* # of writer wakeups			*/
-LOCK_EVENT(rwsem_opt_lock)	/* # of opt-acquired write locks	*/
+LOCK_EVENT(rwsem_opt_rlock)	/* # of opt-acquired read locks		*/
+LOCK_EVENT(rwsem_opt_wlock)	/* # of opt-acquired write locks	*/
 LOCK_EVENT(rwsem_opt_fail)	/* # of failed optspins			*/
 LOCK_EVENT(rwsem_opt_nospin)	/* # of disabled optspins		*/
+LOCK_EVENT(rwsem_opt_norspin)	/* # of disabled reader-only optspins	*/
+LOCK_EVENT(rwsem_opt_rlock2)	/* # of opt-acquired 2ndary read locks	*/
 LOCK_EVENT(rwsem_rlock)		/* # of read locks acquired		*/
 LOCK_EVENT(rwsem_rlock_steal)	/* # of read locks by lock stealing	*/
 LOCK_EVENT(rwsem_rlock_fast)	/* # of fast read locks acquired	*/
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 9eabd585ce7a..016dbc4312e6 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -33,13 +33,19 @@
 #include "lock_events.h"
 
 /*
- * The least significant 2 bits of the owner value has the following
+ * The least significant 3 bits of the owner value have the following
  * meanings when set.
  *  - Bit 0: RWSEM_READER_OWNED - The rwsem is owned by readers
- *  - Bit 1: RWSEM_NONSPINNABLE - Cannot spin on a reader-owned lock
+ *  - Bit 1: RWSEM_RD_NONSPINNABLE - Readers cannot spin on this lock.
+ *  - Bit 2: RWSEM_WR_NONSPINNABLE - Writers cannot spin on this lock.
  *
- * When the rwsem is reader-owned and a spinning writer has timed out,
- * the nonspinnable bit will be set to disable optimistic spinning.
+ * When the rwsem is either owned by an anonymous writer, or it is
+ * reader-owned, but a spinning writer has timed out, both nonspinnable
+ * bits will be set to disable optimistic spinning by readers and writers.
+ * In the latter case, the last unlocking reader should then check the
+ * writer nonspinnable bit and clear it only to give writers preference
+ * to acquire the lock via optimistic spinning, but not readers. Similar
+ * action is also done in the reader slowpath.
 
  * When a writer acquires a rwsem, it puts its task_struct pointer
  * into the owner field. It is cleared after an unlock.
@@ -59,9 +65,47 @@
  * is previously owned by a writer and the following conditions are met:
  *  - rwsem is not currently writer owned
  *  - the handoff isn't set.
+ *
+ * Reader optimistic spinning is helpful when the reader critical section
+ * is short and there aren't that many readers around. It makes readers
+ * relatively more preferred than writers. When a writer times out spinning
+ * on a reader-owned lock and sets the nonspinnable bits, there are two main
+ * reasons for that.
+ *
+ *  1) The reader critical section is long, perhaps the task sleeps after
+ *     acquiring the read lock.
+ *  2) There are just too many readers contending the lock causing it to
+ *     take a while to service all of them.
+ *
+ * In the former case, a long reader critical section will impede the
+ * progress of writers, which is usually more important for system
+ * performance. In the latter case, reader optimistic spinning tends to
+ * make the groups of readers that acquire the lock together smaller,
+ * leading to more of them. That may hurt performance in some cases. In
+ * other words, the setting of nonspinnable bits indicates that reader
+ * optimistic spinning may not be helpful for those workloads that cause
+ * it.
+ *
+ * Therefore, any writers that have observed the setting of the writer
+ * nonspinnable bit for a given rwsem after failing to acquire the lock
+ * via optimistic spinning will set the reader nonspinnable bit once they
+ * acquire the write lock. Similarly, readers that observe the setting
+ * of the reader nonspinnable bit at slowpath entry will set the reader
+ * nonspinnable bit when they acquire the read lock via the wakeup path.
+ *
+ * Once the reader nonspinnable bit is on, it will only be reset when
+ * a writer is able to acquire the rwsem in the fast path or somehow a
+ * reader or writer in the slowpath doesn't observe the nonspinnable bit.
+ *
+ * This is to discourage reader optimistic spinning on that particular
+ * rwsem and make writers more preferred. This adaptive disabling of reader
+ * optimistic spinning will alleviate the negative side effect of this
+ * feature.
  */
 #define RWSEM_READER_OWNED	(1UL << 0)
-#define RWSEM_NONSPINNABLE	(1UL << 1)
+#define RWSEM_RD_NONSPINNABLE	(1UL << 1)
+#define RWSEM_WR_NONSPINNABLE	(1UL << 2)
+#define RWSEM_NONSPINNABLE	(RWSEM_RD_NONSPINNABLE | RWSEM_WR_NONSPINNABLE)
 #define RWSEM_OWNER_FLAGS_MASK	(RWSEM_READER_OWNED | RWSEM_NONSPINNABLE)
 
 #ifdef CONFIG_DEBUG_RWSEMS
@@ -127,6 +171,12 @@
 #define RWSEM_READ_FAILED_MASK	(RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
 				 RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
 
+#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+/* Reader optimistic spinning, default disabled */
+static bool rwsem_opt_rspin;
+module_param_named(opt_rspin, rwsem_opt_rspin, bool, 0644);
+#endif
+
 /*
  * All writes to owner are protected by WRITE_ONCE() to make sure that
  * store tearing can't happen as optimistic spinners may read and use
@@ -171,7 +221,7 @@ static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
 					    struct task_struct *owner)
 {
 	unsigned long val = (unsigned long)owner | RWSEM_READER_OWNED |
-		(atomic_long_read(&sem->owner) & RWSEM_NONSPINNABLE);
+		(atomic_long_read(&sem->owner) & RWSEM_RD_NONSPINNABLE);
 
 	atomic_long_set(&sem->owner, val);
 }
@@ -341,6 +391,7 @@ struct rwsem_waiter {
 	enum rwsem_waiter_type type;
 	unsigned long timeout;
 	bool handoff_set;
+	unsigned long last_rowner;
 };
 #define rwsem_first_waiter(sem) \
 	list_first_entry(&sem->wait_list, struct rwsem_waiter, list)
@@ -480,6 +531,10 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 		 * the reader is copied over.
 		 */
 		owner = waiter->task;
+		if (waiter->last_rowner & RWSEM_RD_NONSPINNABLE) {
+			owner = (void *)((unsigned long)owner | RWSEM_RD_NONSPINNABLE);
+			lockevent_inc(rwsem_opt_norspin);
+		}
 		__rwsem_set_reader_owned(sem, owner);
 	}
 
@@ -684,6 +739,30 @@ enum owner_state {
 };
 
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+/*
+ * Try to acquire read lock before the reader is put on wait queue.
+ * Lock acquisition isn't allowed if the rwsem is locked or a writer handoff
+ * is ongoing.
+ */
+static inline bool rwsem_try_read_lock_unqueued(struct rw_semaphore *sem)
+{
+	long count = atomic_long_read(&sem->count);
+
+	if (count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))
+		return false;
+
+	count = atomic_long_fetch_add_acquire(RWSEM_READER_BIAS, &sem->count);
+	if (!(count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))) {
+		rwsem_set_reader_owned(sem);
+		lockevent_inc(rwsem_opt_rlock);
+		return true;
+	}
+
+	/* Back out the change */
+	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
+	return false;
+}
+
 /*
  * Try to acquire write lock before the writer has been put on wait queue.
  */
@@ -695,14 +774,15 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 		if (atomic_long_try_cmpxchg_acquire(&sem->count, &count,
 					count | RWSEM_WRITER_LOCKED)) {
 			rwsem_set_owner(sem);
-			lockevent_inc(rwsem_opt_lock);
+			lockevent_inc(rwsem_opt_wlock);
 			return true;
 		}
 	}
 	return false;
 }
 
-static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
+static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
+					   unsigned long nonspinnable)
 {
 	struct task_struct *owner;
 	unsigned long flags;
@@ -721,7 +801,7 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 	/*
 	 * Don't check the read-owner as the entry may be stale.
 	 */
-	if ((flags & RWSEM_NONSPINNABLE) ||
+	if ((flags & nonspinnable) ||
 	    (owner && !(flags & RWSEM_READER_OWNED) && !owner_on_cpu(owner)))
 		ret = false;
 
@@ -732,9 +812,9 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 #define OWNER_SPINNABLE		(OWNER_NULL | OWNER_WRITER | OWNER_READER)
 
 static inline enum owner_state
-rwsem_owner_state(struct task_struct *owner, unsigned long flags)
+rwsem_owner_state(struct task_struct *owner, unsigned long flags, unsigned long nonspinnable)
 {
-	if (flags & RWSEM_NONSPINNABLE)
+	if (flags & nonspinnable)
 		return OWNER_NONSPINNABLE;
 
 	if (flags & RWSEM_READER_OWNED)
@@ -744,7 +824,7 @@ rwsem_owner_state(struct task_struct *owner, unsigned long flags)
 }
 
 static noinline enum owner_state
-rwsem_spin_on_owner(struct rw_semaphore *sem)
+rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
 {
 	struct task_struct *new, *owner;
 	unsigned long flags, new_flags;
@@ -753,7 +833,7 @@ rwsem_spin_on_owner(struct rw_semaphore *sem)
 	lockdep_assert_preemption_disabled();
 
 	owner = rwsem_owner_flags(sem, &flags);
-	state = rwsem_owner_state(owner, flags);
+	state = rwsem_owner_state(owner, flags, nonspinnable);
 	if (state != OWNER_WRITER)
 		return state;
 
@@ -766,7 +846,7 @@ rwsem_spin_on_owner(struct rw_semaphore *sem)
 		 */
 		new = rwsem_owner_flags(sem, &new_flags);
 		if ((new != owner) || (new_flags != flags)) {
-			state = rwsem_owner_state(new, new_flags);
+			state = rwsem_owner_state(new, new_flags, nonspinnable);
 			break;
 		}
 
@@ -816,12 +896,14 @@ static inline u64 rwsem_rspin_threshold(struct rw_semaphore *sem)
 	return sched_clock() + delta;
 }
 
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 {
 	bool taken = false;
 	int prev_owner_state = OWNER_NULL;
 	int loop = 0;
 	u64 rspin_threshold = 0;
+	unsigned long nonspinnable = wlock ? RWSEM_WR_NONSPINNABLE
+					   : RWSEM_RD_NONSPINNABLE;
 
 	/* sem->wait_lock should not be held when doing optimistic spinning */
 	if (!osq_lock(&sem->osq))
@@ -836,14 +918,15 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	for (;;) {
 		enum owner_state owner_state;
 
-		owner_state = rwsem_spin_on_owner(sem);
+		owner_state = rwsem_spin_on_owner(sem, nonspinnable);
 		if (!(owner_state & OWNER_SPINNABLE))
 			break;
 
 		/*
 		 * Try to acquire the lock
 		 */
-		taken = rwsem_try_write_lock_unqueued(sem);
+		taken = wlock ? rwsem_try_write_lock_unqueued(sem)
+			      : rwsem_try_read_lock_unqueued(sem);
 
 		if (taken)
 			break;
@@ -851,7 +934,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		/*
 		 * Time-based reader-owned rwsem optimistic spinning
 		 */
-		if (owner_state == OWNER_READER) {
+		if (wlock && (owner_state == OWNER_READER)) {
 			/*
 			 * Re-initialize rspin_threshold every time when
 			 * the owner state changes from non-reader to reader.
@@ -860,7 +943,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 			 * the beginning of the 2nd reader phase.
 			 */
 			if (prev_owner_state != OWNER_READER) {
-				if (rwsem_test_oflags(sem, RWSEM_NONSPINNABLE))
+				if (rwsem_test_oflags(sem, nonspinnable))
 					break;
 				rspin_threshold = rwsem_rspin_threshold(sem);
 				loop = 0;
@@ -935,30 +1018,89 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 }
 
 /*
- * Clear the owner's RWSEM_NONSPINNABLE bit if it is set. This should
+ * Clear the owner's RWSEM_WR_NONSPINNABLE bit if it is set. This should
  * only be called when the reader count reaches 0.
+ *
+ * This gives writers a better chance to acquire the rwsem before readers
+ * when the rwsem has been held by readers for a relatively long period
+ * of time. A race can happen where an optimistic spinner may have just
+ * stolen the rwsem and set the owner, but clearing the
+ * RWSEM_WR_NONSPINNABLE bit will do no harm anyway.
  */
-static inline void clear_nonspinnable(struct rw_semaphore *sem)
+static inline void clear_wr_nonspinnable(struct rw_semaphore *sem)
 {
-	if (unlikely(rwsem_test_oflags(sem, RWSEM_NONSPINNABLE)))
-		atomic_long_andnot(RWSEM_NONSPINNABLE, &sem->owner);
+	if (unlikely(rwsem_test_oflags(sem, RWSEM_WR_NONSPINNABLE)))
+		atomic_long_andnot(RWSEM_WR_NONSPINNABLE, &sem->owner);
+}
+
+/*
+ * This function is called when the reader fails to acquire the lock via
+ * optimistic spinning. In this case we will still attempt a trylock if
+ * comparing the current rwsem state with the state at slowpath entry
+ * indicates that the reader is still in a valid reader phase. This
+ * happens when the following conditions are true:
+ *
+ * 1) The lock is currently reader-owned, and
+ * 2) The lock was previously not reader-owned or the last read owner changed.
+ *
+ * In the former case, we have transitioned from a writer phase to a
+ * reader-phase while spinning. In the latter case, it means the reader
+ * phase hasn't ended when we entered the optimistic spinning loop. In
+ * both cases, the reader is eligible to acquire the lock. This is the
+ * secondary path where a read lock is acquired optimistically.
+ *
+ * The reader nonspinnable bit wasn't set at the time of entry, or the
+ * reader would not be here at all.
+ */
+static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
+					      unsigned long last_rowner)
+{
+	unsigned long owner = atomic_long_read(&sem->owner);
+
+	if (!(owner & RWSEM_READER_OWNED))
+		return false;
+
+	if (((owner ^ last_rowner) & ~RWSEM_OWNER_FLAGS_MASK) &&
+	    rwsem_try_read_lock_unqueued(sem)) {
+		lockevent_inc(rwsem_opt_rlock2);
+		lockevent_add(rwsem_opt_fail, -1);
+		return true;
+	}
+	return false;
+}
+
+static inline bool rwsem_no_spinners(struct rw_semaphore *sem)
+{
+	return !osq_is_locked(&sem->osq);
 }
 
 #else
-static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
+static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
+					   unsigned long nonspinnable)
 {
 	return false;
 }
 
-static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 {
 	return false;
 }
 
-static inline void clear_nonspinnable(struct rw_semaphore *sem) { }
+static inline void clear_wr_nonspinnable(struct rw_semaphore *sem) { }
+
+static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
+					      unsigned long last_rowner)
+{
+	return false;
+}
+
+static inline bool rwsem_no_spinners(sem)
+{
+	return false;
+}
 
 static inline enum owner_state
-rwsem_spin_on_owner(struct rw_semaphore *sem)
+rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
 {
 	return OWNER_NONSPINNABLE;
 }
@@ -984,7 +1126,7 @@ static inline void rwsem_cond_wake_waiter(struct rw_semaphore *sem, long count,
 		wake_type = RWSEM_WAKE_READERS;
 	} else {
 		wake_type = RWSEM_WAKE_ANY;
-		clear_nonspinnable(sem);
+		clear_wr_nonspinnable(sem);
 	}
 	rwsem_mark_wake(sem, wake_type, wake_q);
 }
@@ -995,32 +1137,66 @@ static inline void rwsem_cond_wake_waiter(struct rw_semaphore *sem, long count,
 static struct rw_semaphore __sched *
 rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int state)
 {
-	long adjustment = -RWSEM_READER_BIAS;
+	long owner, adjustment = -RWSEM_READER_BIAS;
 	long rcnt = (count >> RWSEM_READER_SHIFT);
 	struct rwsem_waiter waiter;
 	DEFINE_WAKE_Q(wake_q);
 
 	/*
 	 * To prevent a constant stream of readers from starving a sleeping
-	 * waiter, don't attempt optimistic lock stealing if the lock is
-	 * currently owned by readers.
+	 * waiter, don't attempt optimistic spinning if the lock is currently
+	 * owned by readers.
 	 */
-	if ((atomic_long_read(&sem->owner) & RWSEM_READER_OWNED) &&
-	    (rcnt > 1) && !(count & RWSEM_WRITER_LOCKED))
+	owner = atomic_long_read(&sem->owner);
+	if ((owner & RWSEM_READER_OWNED) && (rcnt > 1) &&
+	   !(count & RWSEM_WRITER_LOCKED))
 		goto queue;
 
 	/*
-	 * Reader optimistic lock stealing.
+	 * Reader optimistic lock stealing
+	 *
+	 * We can take the read lock directly without doing
+	 * rwsem_optimistic_spin() if the conditions are right.
+	 * Also wake up other readers if it is the first reader.
 	 */
-	if (!(count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF))) {
+	if (!(count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF)) &&
+	    rwsem_no_spinners(sem)) {
 		rwsem_set_reader_owned(sem);
 		lockevent_inc(rwsem_rlock_steal);
+		if (rcnt == 1)
+			goto wake_readers;
+		return sem;
+	}
+
+#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+	if (!rwsem_opt_rspin)
+		goto queue;
+#endif
 
+	/*
+	 * Save the current read-owner of rwsem, if available, and the
+	 * reader nonspinnable bit.
+	 */
+	waiter.last_rowner = owner;
+	if (!(waiter.last_rowner & RWSEM_READER_OWNED))
+		waiter.last_rowner &= RWSEM_RD_NONSPINNABLE;
+
+	if (!rwsem_can_spin_on_owner(sem, RWSEM_RD_NONSPINNABLE))
+		goto queue;
+
+	/*
+	 * Undo read bias from down_read() and do optimistic spinning.
+	 */
+	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
+	adjustment = 0;
+	if (rwsem_optimistic_spin(sem, false)) {
+		/* rwsem_optimistic_spin() implies ACQUIRE on success */
 		/*
-		 * Wake up other readers in the wait queue if it is
-		 * the first reader.
+		 * Wake up other readers in the wait list if the front
+		 * waiter is a reader.
 		 */
-		if ((rcnt == 1) && (count & RWSEM_FLAG_WAITERS)) {
+wake_readers:
+		if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
 			raw_spin_lock_irq(&sem->wait_lock);
 			if (!list_empty(&sem->wait_list))
 				rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED,
@@ -1029,6 +1205,9 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 			wake_up_q(&wake_q);
 		}
 		return sem;
+	} else if (rwsem_reader_phase_trylock(sem, waiter.last_rowner)) {
+		/* rwsem_reader_phase_trylock() implies ACQUIRE on success */
+		return sem;
 	}
 
 queue:
@@ -1045,7 +1224,8 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 		 * immediately as its RWSEM_READER_BIAS has already been set
 		 * in the count.
 		 */
-		if (!(atomic_long_read(&sem->count) & RWSEM_WRITER_MASK)) {
+		if (adjustment && !(atomic_long_read(&sem->count) &
+		     RWSEM_WRITER_MASK)) {
 			/* Provide lock ACQUIRE */
 			smp_acquire__after_ctrl_dep();
 			raw_spin_unlock_irq(&sem->wait_lock);
@@ -1058,7 +1238,10 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 	rwsem_add_waiter(sem, &waiter);
 
 	/* we're now waiting on the lock, but no longer actively locking */
-	count = atomic_long_add_return(adjustment, &sem->count);
+	if (adjustment)
+		count = atomic_long_add_return(adjustment, &sem->count);
+	else
+		count = atomic_long_read(&sem->count);
 
 	rwsem_cond_wake_waiter(sem, count, &wake_q);
 	raw_spin_unlock_irq(&sem->wait_lock);
@@ -1100,21 +1283,43 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 	return ERR_PTR(-EINTR);
 }
 
+/*
+ * This function is called by a write lock owner. So the owner value
+ * won't get changed by others.
+ */
+static inline void rwsem_disable_reader_optspin(struct rw_semaphore *sem,
+						bool disable)
+{
+	if (unlikely(disable)) {
+		atomic_long_or(RWSEM_RD_NONSPINNABLE, &sem->owner);
+		lockevent_inc(rwsem_opt_norspin);
+	}
+}
+
 /*
  * Wait until we successfully acquire the write lock
  */
 static struct rw_semaphore __sched *
 rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 {
+	bool disable_rspin;
 	struct rwsem_waiter waiter;
 	DEFINE_WAKE_Q(wake_q);
 
 	/* do optimistic spinning and steal lock if possible */
-	if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
+	if (rwsem_can_spin_on_owner(sem, RWSEM_WR_NONSPINNABLE) &&
+	    rwsem_optimistic_spin(sem, true)) {
 		/* rwsem_optimistic_spin() implies ACQUIRE on success */
 		return sem;
 	}
 
+	/*
+	 * Disable reader optimistic spinning for this rwsem after
+	 * acquiring the write lock when the setting of the nonspinnable
+	 * bits are observed.
+	 */
+	disable_rspin = atomic_long_read(&sem->owner) & RWSEM_NONSPINNABLE;
+
 	/*
 	 * Optimistic spinning failed, proceed to the slowpath
 	 * and block until we can acquire the sem.
@@ -1170,7 +1375,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 		if (waiter.handoff_set) {
 			enum owner_state owner_state;
 
-			owner_state = rwsem_spin_on_owner(sem);
+			owner_state = rwsem_spin_on_owner(sem, RWSEM_NONSPINNABLE);
 			if (owner_state == OWNER_NULL)
 				goto trylock_again;
 		}
@@ -1182,6 +1387,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 		raw_spin_lock_irq(&sem->wait_lock);
 	}
 	__set_current_state(TASK_RUNNING);
+	rwsem_disable_reader_optspin(sem, disable_rspin);
 	raw_spin_unlock_irq(&sem->wait_lock);
 	lockevent_inc(rwsem_wlock);
 	trace_contention_end(sem, 0);
@@ -1348,7 +1554,7 @@ static inline void __up_read(struct rw_semaphore *sem)
 	DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);
 	if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
 		      RWSEM_FLAG_WAITERS)) {
-		clear_nonspinnable(sem);
+		clear_wr_nonspinnable(sem);
 		rwsem_wake(sem);
 	}
 	preempt_enable();
-- 
2.36.1


* Re: [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
  2023-05-31  0:34 ` [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning Bongkyu Kim
@ 2023-05-31 16:45   ` kernel test robot
  2023-06-01  7:10   ` kernel test robot
       [not found]   ` <d5b65c4b-7232-e5a7-27ae-b8efab037396@redhat.com>
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-05-31 16:45 UTC (permalink / raw)
  To: Bongkyu Kim, peterz, mingo, will, longman, boqun.feng
  Cc: oe-kbuild-all, linux-kernel, jwook1.kim, lakkyung.jung, Bongkyu Kim

Hi Bongkyu,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on linus/master v6.4-rc4 next-20230531]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bongkyu-Kim/locking-rwsem-Optionally-re-enable-reader-optimistic-spinning/20230531-083658
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20230531003436.7082-1-bongkyu7.kim%40samsung.com
patch subject: [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20230601/202306010043.VJHcuCnb-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build):
        mkdir -p ~/bin
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/8c4098eb89be5b82aded3d17b22f78013454d058
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Bongkyu-Kim/locking-rwsem-Optionally-re-enable-reader-optimistic-spinning/20230531-083658
        git checkout 8c4098eb89be5b82aded3d17b22f78013454d058
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=m68k olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash kernel/locking/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306010043.VJHcuCnb-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

>> kernel/locking/rwsem.c:1097:20: error: function declaration isn't a prototype [-Werror=strict-prototypes]
    1097 | static inline bool rwsem_no_spinners(sem)
         |                    ^~~~~~~~~~~~~~~~~
   kernel/locking/rwsem.c: In function 'rwsem_no_spinners':
>> kernel/locking/rwsem.c:1097:20: warning: old-style function definition [-Wold-style-definition]
>> kernel/locking/rwsem.c:1097:20: error: type of 'sem' defaults to 'int' [-Werror=implicit-int]
   cc1: some warnings being treated as errors


vim +1097 kernel/locking/rwsem.c

  1096	
> 1097	static inline bool rwsem_no_spinners(sem)
  1098	{
  1099		return false;
  1100	}
  1101	
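
For reference, the !CONFIG_RWSEM_SPIN_ON_OWNER stub presumably just needs
its parameter type spelled out, matching its CONFIG_RWSEM_SPIN_ON_OWNER
counterpart earlier in the patch (a sketch of the likely fix, not an
official one):

        static inline bool rwsem_no_spinners(struct rw_semaphore *sem)
        {
                return false;
        }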

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
  2023-05-31  0:34 ` [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning Bongkyu Kim
  2023-05-31 16:45   ` kernel test robot
@ 2023-06-01  7:10   ` kernel test robot
       [not found]   ` <d5b65c4b-7232-e5a7-27ae-b8efab037396@redhat.com>
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-06-01  7:10 UTC (permalink / raw)
  To: Bongkyu Kim, peterz, mingo, will, longman, boqun.feng
  Cc: llvm, oe-kbuild-all, linux-kernel, jwook1.kim, lakkyung.jung,
	Bongkyu Kim

Hi Bongkyu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/locking/core]
[also build test WARNING on linus/master v6.4-rc4 next-20230601]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bongkyu-Kim/locking-rwsem-Optionally-re-enable-reader-optimistic-spinning/20230531-083658
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20230531003436.7082-1-bongkyu7.kim%40samsung.com
patch subject: [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
config: x86_64-randconfig-x063-20230531 (https://download.01.org/0day-ci/archive/20230601/202306011420.8opqhG4p-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        mkdir -p ~/bin
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/8c4098eb89be5b82aded3d17b22f78013454d058
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Bongkyu-Kim/locking-rwsem-Optionally-re-enable-reader-optimistic-spinning/20230531-083658
        git checkout 8c4098eb89be5b82aded3d17b22f78013454d058
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang ~/bin/make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang ~/bin/make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/events/ arch/x86/kernel/fpu/ drivers/of/ drivers/usb/host/ kernel/locking/ kernel/sched/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306011420.8opqhG4p-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/locking/rwsem.c:1163:24: warning: incompatible pointer to integer conversion passing 'struct rw_semaphore *' to parameter of type 'int' [-Wint-conversion]
               rwsem_no_spinners(sem)) {
                                 ^~~
   kernel/locking/rwsem.c:332:1: warning: unused function 'rwsem_owner_flags' [-Wunused-function]
   rwsem_owner_flags(struct rw_semaphore *sem, unsigned long *pflags)
   ^
   2 warnings generated.


vim +1163 kernel/locking/rwsem.c

  1133	
  1134	/*
  1135	 * Wait for the read lock to be granted
  1136	 */
  1137	static struct rw_semaphore __sched *
  1138	rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int state)
  1139	{
  1140		long owner, adjustment = -RWSEM_READER_BIAS;
  1141		long rcnt = (count >> RWSEM_READER_SHIFT);
  1142		struct rwsem_waiter waiter;
  1143		DEFINE_WAKE_Q(wake_q);
  1144	
  1145		/*
  1146		 * To prevent a constant stream of readers from starving a sleeping
  1147		 * waiter, don't attempt optimistic spinning if the lock is currently
  1148		 * owned by readers.
  1149		 */
  1150		owner = atomic_long_read(&sem->owner);
  1151		if ((owner & RWSEM_READER_OWNED) && (rcnt > 1) &&
  1152		   !(count & RWSEM_WRITER_LOCKED))
  1153			goto queue;
  1154	
  1155		/*
  1156		 * Reader optimistic lock stealing
  1157		 *
  1158		 * We can take the read lock directly without doing
  1159		 * rwsem_optimistic_spin() if the conditions are right.
  1160		 * Also wake up other readers if it is the first reader.
  1161		 */
  1162		if (!(count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF)) &&
> 1163		    rwsem_no_spinners(sem)) {
  1164			rwsem_set_reader_owned(sem);
  1165			lockevent_inc(rwsem_rlock_steal);
  1166			if (rcnt == 1)
  1167				goto wake_readers;
  1168			return sem;
  1169		}
  1170	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
       [not found]       ` <8e11066a-017f-a190-39de-17f92128e42f@redhat.com>
@ 2023-06-02  0:46         ` Bongkyu Kim
  2023-06-13  4:36         ` Bongkyu Kim
  1 sibling, 0 replies; 5+ messages in thread
From: Bongkyu Kim @ 2023-06-02  0:46 UTC (permalink / raw)
  To: Waiman Long, peterz, mingo, will, boqun.feng
  Cc: linux-kernel, jwook1.kim, lakkyung.jung

On Thu, Jun 01, 2023 at 10:06:13AM -0400, Waiman Long wrote:
> On 5/31/23 21:22, Bongkyu Kim wrote:
> > On Wed, May 31, 2023 at 02:29:54PM -0400, Waiman Long wrote:
> > > On 5/30/23 20:34, Bongkyu Kim wrote:
> > > > Removing reader optimistic spinning has caused a great regression in
> > > I won't consider this a great regression if it is just a few percentage
> > > points of performance difference.
> > > 
> > > BTW, this patch is mostly a revert of commit 617f3ef95177 ("locking/rwsem:
> > > Remove reader optimistic spinning"). So it should be mentioned here.
> > > 
> > I will fix these and resend patch v2.
> > Thanks for your review.
> 
> There is one more point that I forgot to mention. Enabling this option can
> degrade or improve performance depending on the actual workload. A system
> administrator who is not familiar with the rwsem code will not know what to
> do with it. So I would suggest a bit more wording in the documentation about
> under what conditions users should try enabling it to see if it helps.
> 
> -Longman
> 

All right. I will add more wording to the documentation and resend
patch v3.
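
Something along these lines, perhaps (a rough sketch drawn from the comments
in the patch itself, not the final v3 wording):

        rwsem.opt_rspin= [KNL] Enable rwsem reader optimistic spinning
                        (default: disabled). May help workloads where
                        reader critical sections are short and concurrent
                        readers are few (e.g. application startup on
                        mobile devices); may hurt reader-heavy workloads.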

Thanks,
Bongkyu

> > > > [ full quoted patch snipped; see the original posting above ]
> > > > +#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
> > > > +/* Reader optimistic spinning, default disabled */
> > > > +static bool rwsem_opt_rspin;
> > > > +module_param_named(opt_rspin, rwsem_opt_rspin, bool, 0644);
> > > > +#endif
> > > The rwsem code isn't really a kernel module. It is a bit odd to use module
> > > parameter here.
> > > 
> > > 
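
A conventional alternative for built-in code would be a boot-only __setup()
hook (a hypothetical sketch, not what this patch does; it would lose the
runtime sysfs toggle that module_param_named() provides):

        /* hypothetical: treat "rwsem_opt_rspin" on the command line as a flag */
        static int __init opt_rspin_setup(char *str)
        {
                rwsem_opt_rspin = true;
                return 1;       /* option handled */
        }
        __setup("rwsem_opt_rspin", opt_rspin_setup);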
> > > > [ quoted patch snipped ]
> > > > +#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
> > > > +	if (!rwsem_opt_rspin)
> > > > +		goto queue;
> > > > +#endif
> > > I would suggest changing that to
> > > 
> > > if (!IS_ENABLED(CONFIG_RWSEM_SPIN_ON_OWNER) || !rwsem_opt_rspin)
> > >          goto queue;
> > > 
> > > > +	/*
> > > > +	 * Save the current read-owner of rwsem, if available, and the
> > > > +	 * reader nonspinnable bit.
> > > > +	 */
> > > > +	waiter.last_rowner = owner;
> > > > +	if (!(waiter.last_rowner & RWSEM_READER_OWNED))
> > > > +		waiter.last_rowner &= RWSEM_RD_NONSPINNABLE;
> > > > +
> > > > +	if (!rwsem_can_spin_on_owner(sem, RWSEM_RD_NONSPINNABLE))
> > > > +		goto queue;
> > > > +
> > > > +	/*
> > > > +	 * Undo read bias from down_read() and do optimistic spinning.
> > > > +	 */
> > > > +	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
> > > > +	adjustment = 0;
> > > > +	if (rwsem_optimistic_spin(sem, false)) {
> > > > +		/* rwsem_optimistic_spin() implies ACQUIRE on success */
> > > >    		/*
> > > > -		 * Wake up other readers in the wait queue if it is
> > > > -		 * the first reader.
> > > > +		 * Wake up other readers in the wait list if the front
> > > > +		 * waiter is a reader.
> > > >    		 */
> > > > -		if ((rcnt == 1) && (count & RWSEM_FLAG_WAITERS)) {
> > > > +wake_readers:
> > > > +		if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
> > > >    			raw_spin_lock_irq(&sem->wait_lock);
> > > >    			if (!list_empty(&sem->wait_list))
> > > >    				rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED,
> > > > @@ -1029,6 +1205,9 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    			wake_up_q(&wake_q);
> > > >    		}
> > > >    		return sem;
> > > > +	} else if (rwsem_reader_phase_trylock(sem, waiter.last_rowner)) {
> > > > +		/* rwsem_reader_phase_trylock() implies ACQUIRE on success */
> > > > +		return sem;
> > > >    	}
> > > >    queue:
> > > > @@ -1045,7 +1224,8 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    		 * immediately as its RWSEM_READER_BIAS has already been set
> > > >    		 * in the count.
> > > >    		 */
> > > > -		if (!(atomic_long_read(&sem->count) & RWSEM_WRITER_MASK)) {
> > > > +		if (adjustment && !(atomic_long_read(&sem->count) &
> > > > +		     RWSEM_WRITER_MASK)) {
> > > >    			/* Provide lock ACQUIRE */
> > > >    			smp_acquire__after_ctrl_dep();
> > > >    			raw_spin_unlock_irq(&sem->wait_lock);
> > > > @@ -1058,7 +1238,10 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    	rwsem_add_waiter(sem, &waiter);
> > > >    	/* we're now waiting on the lock, but no longer actively locking */
> > > > -	count = atomic_long_add_return(adjustment, &sem->count);
> > > > +	if (adjustment)
> > > > +		count = atomic_long_add_return(adjustment, &sem->count);
> > > > +	else
> > > > +		count = atomic_long_read(&sem->count);
> > > >    	rwsem_cond_wake_waiter(sem, count, &wake_q);
> > > >    	raw_spin_unlock_irq(&sem->wait_lock);
> > > > @@ -1100,21 +1283,43 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    	return ERR_PTR(-EINTR);
> > > >    }
> > > > +/*
> > > > + * This function is called by the write lock owner. So the owner value
> > > > + * won't get changed by others.
> > > > + */
> > > > +static inline void rwsem_disable_reader_optspin(struct rw_semaphore *sem,
> > > > +						bool disable)
> > > > +{
> > > > +	if (unlikely(disable)) {
> > > > +		atomic_long_or(RWSEM_RD_NONSPINNABLE, &sem->owner);
> > > > +		lockevent_inc(rwsem_opt_norspin);
> > > > +	}
> > > > +}
> > > > +
> > > >    /*
> > > >     * Wait until we successfully acquire the write lock
> > > >     */
> > > >    static struct rw_semaphore __sched *
> > > >    rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> > > >    {
> > > > +	bool disable_rspin;
> > > >    	struct rwsem_waiter waiter;
> > > >    	DEFINE_WAKE_Q(wake_q);
> > > >    	/* do optimistic spinning and steal lock if possible */
> > > > -	if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
> > > > +	if (rwsem_can_spin_on_owner(sem, RWSEM_WR_NONSPINNABLE) &&
> > > > +	    rwsem_optimistic_spin(sem, true)) {
> > > >    		/* rwsem_optimistic_spin() implies ACQUIRE on success */
> > > >    		return sem;
> > > >    	}
> > > > +	/*
> > > > +	 * Disable reader optimistic spinning for this rwsem after
> > > > +	 * acquiring the write lock when the nonspinnable bits are
> > > > +	 * observed to be set.
> > > > +	 */
> > > > +	disable_rspin = atomic_long_read(&sem->owner) & RWSEM_NONSPINNABLE;
> > > > +
> > > >    	/*
> > > >    	 * Optimistic spinning failed, proceed to the slowpath
> > > >    	 * and block until we can acquire the sem.
> > > > @@ -1170,7 +1375,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> > > >    		if (waiter.handoff_set) {
> > > >    			enum owner_state owner_state;
> > > > -			owner_state = rwsem_spin_on_owner(sem);
> > > > +			owner_state = rwsem_spin_on_owner(sem, RWSEM_NONSPINNABLE);
> > > >    			if (owner_state == OWNER_NULL)
> > > >    				goto trylock_again;
> > > >    		}
> > > > @@ -1182,6 +1387,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> > > >    		raw_spin_lock_irq(&sem->wait_lock);
> > > >    	}
> > > >    	__set_current_state(TASK_RUNNING);
> > > > +	rwsem_disable_reader_optspin(sem, disable_rspin);
> > > >    	raw_spin_unlock_irq(&sem->wait_lock);
> > > >    	lockevent_inc(rwsem_wlock);
> > > >    	trace_contention_end(sem, 0);
> > > > @@ -1348,7 +1554,7 @@ static inline void __up_read(struct rw_semaphore *sem)
> > > >    	DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);
> > > >    	if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
> > > >    		      RWSEM_FLAG_WAITERS)) {
> > > > -		clear_nonspinnable(sem);
> > > > +		clear_wr_nonspinnable(sem);
> > > >    		rwsem_wake(sem);
> > > >    	}
> > > >    	preempt_enable();
> > > I don't have a strong feeling for or against it. It does provide a modest
> > > improvement in some use cases, but it does make the code a bit more complex
> > > and harder to understand.
> > > 
> > > Cheers,
> > > Longman
> > > 
> > > 
> > > 
> 
> 


* Re: [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning
       [not found]       ` <8e11066a-017f-a190-39de-17f92128e42f@redhat.com>
  2023-06-02  0:46         ` Bongkyu Kim
@ 2023-06-13  4:36         ` Bongkyu Kim
  1 sibling, 0 replies; 5+ messages in thread
From: Bongkyu Kim @ 2023-06-13  4:36 UTC (permalink / raw)
  To: Waiman Long, peterz, mingo, will, boqun.feng
  Cc: linux-kernel, jwook1.kim, lakkyung.jung, bongkyu7.kim

[-- Attachment #1: Type: text/plain, Size: 30537 bytes --]

On Thu, Jun 01, 2023 at 10:06:13AM -0400, Waiman Long wrote:
> On 5/31/23 21:22, Bongkyu Kim wrote:
> > On Wed, May 31, 2023 at 02:29:54PM -0400, Waiman Long wrote:
> > > On 5/30/23 20:34, Bongkyu Kim wrote:
> > > > Remove reader optimistic spinning has a great regression on application
> > > I won't consider this a great regression if it is just a few percentage
> > > points of performance difference.
> > > 
> > > BTW, this patch is mostly a revert of commit 617f3ef95177 ("locking/rwsem:
> > > Remove reader optimistic spinning"). So it should be mentioned here.
> > > 
> > I will fix these and send a v2 of the patch.
> > Thanks for your review.
> 

I missed some of your comments in my last reply, so I am replying again.
Sorry for the confusion; please see my answers below.

> There is one more point that I forgot to mention. Enabling this option can
> degrade or improve performance depending on the actual workload. A system
> administrator who is not familiar with the rwsem code will not know what to
> do with it. So I would suggest a bit more wording in the documentation about
> under what conditions users should try enabling it to see if it helps.
> 
> -Longman
> 

I will add more wording to the documentation.
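
Roughly along these lines, as a rough draft on my side (the exact wording is
mine and may change in v2):

	rwsem.opt_rspin= [KNL] Enable rwsem reader optimistic spinning
			(default: disabled). May help workloads with short
			reader critical sections and few concurrent readers,
			e.g. application startup on mobile devices. It can
			hurt reader-heavy workloads, so compare performance
			with the option on and off before enabling it
			permanently.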

> > > > startup performance in android device. In mobile environment, reader
> > > > optimistic spinning is still useful because there're not many readers.
> > > > So re-enable reader optimistic spinning and disabled by default. And,
> > > > can turn on by cmdline.
> > > > 
> > > > Test result:
> > > > This is 15 application startup performance in our s5e8535 soc.
> > > > - Cortex A78*2 + Cortex A55*6
> > > > 
> > > > Application             base  opt_rspin  Diff  Diff(%)
> > > > --------------------  ------  ---------  ----  -------
> > > > * Total(geomean)         343        330   -13    +3.8%
> > > > --------------------  ------  ---------  ----  -------
> > > > helloworld               110        108    -2    +1.8%
> > > > Amazon_Seller            397        388    -9    +2.3%
> > > > Whatsapp                 311        304    -7    +2.3%
> > > > Simple_PDF_Reader        500        463   -37    +7.4%
> > > > FaceApp                  330        317   -13    +3.9%
> > > > Timestamp_Camera_Free    451        443    -8    +1.8%
> > > > Kindle                   629        597   -32    +5.1%
> > > > Coinbase                 243        233   -10    +4.1%
> > > > Firefox                  425        399   -26    +6.1%
> > > > Candy_Crush_Soda         552        538   -14    +2.5%
> > > > Hill_Climb_Racing        245        230   -15    +6.1%
> > > > Call_Recorder            437        426   -11    +2.5%
> > > > Color_Fill_3D            190        180   -10    +5.3%
> > > > eToro                    512        505    -7    +1.4%
> > > > GroupMe                  281        266   -15    +5.3%
> > > > 
> > > > Signed-off-by: Bongkyu Kim <bongkyu7.kim@samsung.com>
> > > > ---
> > > >    .../admin-guide/kernel-parameters.txt         |   2 +
> > > >    kernel/locking/lock_events_list.h             |   5 +-
> > > >    kernel/locking/rwsem.c                        | 292 +++++++++++++++---
> > > >    3 files changed, 255 insertions(+), 44 deletions(-)
> > > > 
> > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > index bb23a36a7ff7..b92a6b3f965f 100644
> > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > @@ -5495,6 +5495,8 @@
> > > >    	rw		[KNL] Mount root device read-write on boot
> > > > +	rwsem.opt_rspin= [KNL] Use rwsem reader optimistic spinning
> > > > +
> > > >    	S		[KNL] Run init in single mode
> > > >    	s390_iommu=	[HW,S390]
> > > > diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
> > > > index 97fb6f3f840a..270a0d351932 100644
> > > > --- a/kernel/locking/lock_events_list.h
> > > > +++ b/kernel/locking/lock_events_list.h
> > > > @@ -56,9 +56,12 @@ LOCK_EVENT(rwsem_sleep_reader)	/* # of reader sleeps			*/
> > > >    LOCK_EVENT(rwsem_sleep_writer)	/* # of writer sleeps			*/
> > > >    LOCK_EVENT(rwsem_wake_reader)	/* # of reader wakeups			*/
> > > >    LOCK_EVENT(rwsem_wake_writer)	/* # of writer wakeups			*/
> > > > -LOCK_EVENT(rwsem_opt_lock)	/* # of opt-acquired write locks	*/
> > > > +LOCK_EVENT(rwsem_opt_rlock)	/* # of opt-acquired read locks		*/
> > > > +LOCK_EVENT(rwsem_opt_wlock)	/* # of opt-acquired write locks	*/
> > > >    LOCK_EVENT(rwsem_opt_fail)	/* # of failed optspins			*/
> > > >    LOCK_EVENT(rwsem_opt_nospin)	/* # of disabled optspins		*/
> > > > +LOCK_EVENT(rwsem_opt_norspin)	/* # of disabled reader-only optspins	*/
> > > > +LOCK_EVENT(rwsem_opt_rlock2)	/* # of opt-acquired 2ndary read locks	*/
> > > >    LOCK_EVENT(rwsem_rlock)		/* # of read locks acquired		*/
> > > >    LOCK_EVENT(rwsem_rlock_steal)	/* # of read locks by lock stealing	*/
> > > >    LOCK_EVENT(rwsem_rlock_fast)	/* # of fast read locks acquired	*/
> > > > diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> > > > index 9eabd585ce7a..016dbc4312e6 100644
> > > > --- a/kernel/locking/rwsem.c
> > > > +++ b/kernel/locking/rwsem.c
> > > > @@ -33,13 +33,19 @@
> > > >    #include "lock_events.h"
> > > >    /*
> > > > - * The least significant 2 bits of the owner value has the following
> > > > + * The least significant 3 bits of the owner value have the following
> > > >     * meanings when set.
> > > >     *  - Bit 0: RWSEM_READER_OWNED - The rwsem is owned by readers
> > > > - *  - Bit 1: RWSEM_NONSPINNABLE - Cannot spin on a reader-owned lock
> > > > + *  - Bit 1: RWSEM_RD_NONSPINNABLE - Readers cannot spin on this lock.
> > > > + *  - Bit 2: RWSEM_WR_NONSPINNABLE - Writers cannot spin on this lock.
> > > >     *
> > > > - * When the rwsem is reader-owned and a spinning writer has timed out,
> > > > - * the nonspinnable bit will be set to disable optimistic spinning.
> > > > + * When the rwsem is either owned by an anonymous writer, or it is
> > > > + * reader-owned, but a spinning writer has timed out, both nonspinnable
> > > > + * bits will be set to disable optimistic spinning by readers and writers.
> > > > + * In the latter case, the last unlocking reader should then check the
> > > > + * writer nonspinnable bit and clear only that bit, giving writers, but
> > > > + * not readers, preference to acquire the lock via optimistic spinning.
> > > > + * A similar action is also taken in the reader slowpath.
> > > >     * When a writer acquires a rwsem, it puts its task_struct pointer
> > > >     * into the owner field. It is cleared after an unlock.
> > > > @@ -59,9 +65,47 @@
> > > >     * is previously owned by a writer and the following conditions are met:
> > > >     *  - rwsem is not currently writer owned
> > > >     *  - the handoff isn't set.
> > > > + *
> > > > + * Reader optimistic spinning is helpful when the reader critical section
> > > > + * is short and there aren't that many readers around. It gives readers
> > > > + * relative preference over writers. When a writer times out spinning
> > > > + * on a reader-owned lock and sets the nonspinnable bits, there are two
> > > > + * main reasons for that.
> > > > + *
> > > > + *  1) The reader critical section is long, perhaps the task sleeps after
> > > > + *     acquiring the read lock.
> > > > + *  2) There are just too many readers contending the lock causing it to
> > > > + *     take a while to service all of them.
> > > > + *
> > > > + * In the former case, a long reader critical section will impede the
> > > > + * progress of writers, which is usually more important for system
> > > > + * performance. In the latter case, reader optimistic spinning tends to
> > > > + * shrink the groups of readers that acquire the lock together, leading
> > > > + * to more such groups. That may hurt performance in some cases. In
> > > > + * other words, the setting of the nonspinnable bits indicates that
> > > > + * reader optimistic spinning may not be helpful for the workloads that
> > > > + * cause it.
> > > > + *
> > > > + * Therefore, any writers that had observed the setting of the writer
> > > > + * nonspinnable bit for a given rwsem after they fail to acquire the lock
> > > > + * via optimistic spinning will set the reader nonspinnable bit once they
> > > > + * acquire the write lock. Similarly, readers that observe the setting
> > > > + * of the reader nonspinnable bit at slowpath entry will set the reader
> > > > + * nonspinnable bits when they acquire the read lock via the wakeup path.
> > > > + *
> > > > + * Once the reader nonspinnable bit is on, it will only be reset when
> > > > + * a writer is able to acquire the rwsem in the fast path or somehow a
> > > > + * reader or writer in the slowpath doesn't observe the nonspinnable bit.
> > > > + *
> > > > + * This is to discourage reader optimistic spinning on that particular
> > > > + * rwsem and make writers more preferred. This adaptive disabling of reader
> > > > + * optimistic spinning will alleviate the negative side effect of this
> > > > + * feature.
> > > >     */
> > > >    #define RWSEM_READER_OWNED	(1UL << 0)
> > > > -#define RWSEM_NONSPINNABLE	(1UL << 1)
> > > > +#define RWSEM_RD_NONSPINNABLE	(1UL << 1)
> > > > +#define RWSEM_WR_NONSPINNABLE	(1UL << 2)
> > > > +#define RWSEM_NONSPINNABLE	(RWSEM_RD_NONSPINNABLE | RWSEM_WR_NONSPINNABLE)
> > > >    #define RWSEM_OWNER_FLAGS_MASK	(RWSEM_READER_OWNED | RWSEM_NONSPINNABLE)
> > > >    #ifdef CONFIG_DEBUG_RWSEMS
> > > > @@ -127,6 +171,12 @@
> > > >    #define RWSEM_READ_FAILED_MASK	(RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
> > > >    				 RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
> > > > +#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
> > > > +/* Reader optimistic spinning, default disabled */
> > > > +static bool rwsem_opt_rspin;
> > > > +module_param_named(opt_rspin, rwsem_opt_rspin, bool, 0644);
> > > > +#endif
> > > The rwsem code isn't really a kernel module. It is a bit odd to use module
> > > parameter here.
> > > 

I will change it to use __setup or early_param.
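
For example, an early_param() version might look roughly like this (just a
sketch of the direction, not the final code; also note the command-line token
would then be whatever string is registered here, e.g. "rwsem_opt_rspin="
rather than the module_param-style "rwsem.opt_rspin="):

	#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
	/* Reader optimistic spinning, default disabled */
	static bool rwsem_opt_rspin;

	/* parse "rwsem_opt_rspin=[0|1|on|off]" from the kernel command line */
	static int __init rwsem_opt_rspin_setup(char *str)
	{
		return kstrtobool(str, &rwsem_opt_rspin);
	}
	early_param("rwsem_opt_rspin", rwsem_opt_rspin_setup);
	#endif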

> > > 
> > > > +
> > > >    /*
> > > >     * All writes to owner are protected by WRITE_ONCE() to make sure that
> > > >     * store tearing can't happen as optimistic spinners may read and use
> > > > @@ -171,7 +221,7 @@ static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
> > > >    					    struct task_struct *owner)
> > > >    {
> > > >    	unsigned long val = (unsigned long)owner | RWSEM_READER_OWNED |
> > > > -		(atomic_long_read(&sem->owner) & RWSEM_NONSPINNABLE);
> > > > +		(atomic_long_read(&sem->owner) & RWSEM_RD_NONSPINNABLE);
> > > >    	atomic_long_set(&sem->owner, val);
> > > >    }
> > > > @@ -341,6 +391,7 @@ struct rwsem_waiter {
> > > >    	enum rwsem_waiter_type type;
> > > >    	unsigned long timeout;
> > > >    	bool handoff_set;
> > > > +	unsigned long last_rowner;
> > > >    };
> > > >    #define rwsem_first_waiter(sem) \
> > > >    	list_first_entry(&sem->wait_list, struct rwsem_waiter, list)
> > > > @@ -480,6 +531,10 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
> > > >    		 * the reader is copied over.
> > > >    		 */
> > > >    		owner = waiter->task;
> > > > +		if (waiter->last_rowner & RWSEM_RD_NONSPINNABLE) {
> > > > +			owner = (void *)((unsigned long)owner | RWSEM_RD_NONSPINNABLE);
> > > > +			lockevent_inc(rwsem_opt_norspin);
> > > > +		}
> > > >    		__rwsem_set_reader_owned(sem, owner);
> > > >    	}
> > > > @@ -684,6 +739,30 @@ enum owner_state {
> > > >    };
> > > >    #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
> > > > +/*
> > > > + * Try to acquire read lock before the reader is put on wait queue.
> > > > + * Lock acquisition isn't allowed if the rwsem is locked or a writer handoff
> > > > + * is ongoing.
> > > > + */
> > > > +static inline bool rwsem_try_read_lock_unqueued(struct rw_semaphore *sem)
> > > > +{
> > > > +	long count = atomic_long_read(&sem->count);
> > > > +
> > > > +	if (count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))
> > > > +		return false;
> > > > +
> > > > +	count = atomic_long_fetch_add_acquire(RWSEM_READER_BIAS, &sem->count);
> > > > +	if (!(count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))) {
> > > > +		rwsem_set_reader_owned(sem);
> > > > +		lockevent_inc(rwsem_opt_rlock);
> > > > +		return true;
> > > > +	}
> > > > +
> > > > +	/* Back out the change */
> > > > +	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
> > > > +	return false;
> > > > +}
> > > > +
> > > >    /*
> > > >     * Try to acquire write lock before the writer has been put on wait queue.
> > > >     */
> > > > @@ -695,14 +774,15 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
> > > >    		if (atomic_long_try_cmpxchg_acquire(&sem->count, &count,
> > > >    					count | RWSEM_WRITER_LOCKED)) {
> > > >    			rwsem_set_owner(sem);
> > > > -			lockevent_inc(rwsem_opt_lock);
> > > > +			lockevent_inc(rwsem_opt_wlock);
> > > >    			return true;
> > > >    		}
> > > >    	}
> > > >    	return false;
> > > >    }
> > > > -static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > > +static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
> > > > +					   unsigned long nonspinnable)
> > > >    {
> > > >    	struct task_struct *owner;
> > > >    	unsigned long flags;
> > > > @@ -721,7 +801,7 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > >    	/*
> > > >    	 * Don't check the read-owner as the entry may be stale.
> > > >    	 */
> > > > -	if ((flags & RWSEM_NONSPINNABLE) ||
> > > > +	if ((flags & nonspinnable) ||
> > > >    	    (owner && !(flags & RWSEM_READER_OWNED) && !owner_on_cpu(owner)))
> > > >    		ret = false;
> > > > @@ -732,9 +812,9 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > >    #define OWNER_SPINNABLE		(OWNER_NULL | OWNER_WRITER | OWNER_READER)
> > > >    static inline enum owner_state
> > > > -rwsem_owner_state(struct task_struct *owner, unsigned long flags)
> > > > +rwsem_owner_state(struct task_struct *owner, unsigned long flags, unsigned long nonspinnable)
> > > >    {
> > > > -	if (flags & RWSEM_NONSPINNABLE)
> > > > +	if (flags & nonspinnable)
> > > >    		return OWNER_NONSPINNABLE;
> > > >    	if (flags & RWSEM_READER_OWNED)
> > > > @@ -744,7 +824,7 @@ rwsem_owner_state(struct task_struct *owner, unsigned long flags)
> > > >    }
> > > >    static noinline enum owner_state
> > > > -rwsem_spin_on_owner(struct rw_semaphore *sem)
> > > > +rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
> > > >    {
> > > >    	struct task_struct *new, *owner;
> > > >    	unsigned long flags, new_flags;
> > > > @@ -753,7 +833,7 @@ rwsem_spin_on_owner(struct rw_semaphore *sem)
> > > >    	lockdep_assert_preemption_disabled();
> > > >    	owner = rwsem_owner_flags(sem, &flags);
> > > > -	state = rwsem_owner_state(owner, flags);
> > > > +	state = rwsem_owner_state(owner, flags, nonspinnable);
> > > >    	if (state != OWNER_WRITER)
> > > >    		return state;
> > > > @@ -766,7 +846,7 @@ rwsem_spin_on_owner(struct rw_semaphore *sem)
> > > >    		 */
> > > >    		new = rwsem_owner_flags(sem, &new_flags);
> > > >    		if ((new != owner) || (new_flags != flags)) {
> > > > -			state = rwsem_owner_state(new, new_flags);
> > > > +			state = rwsem_owner_state(new, new_flags, nonspinnable);
> > > >    			break;
> > > >    		}
> > > > @@ -816,12 +896,14 @@ static inline u64 rwsem_rspin_threshold(struct rw_semaphore *sem)
> > > >    	return sched_clock() + delta;
> > > >    }
> > > > -static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> > > > +static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
> > > >    {
> > > >    	bool taken = false;
> > > >    	int prev_owner_state = OWNER_NULL;
> > > >    	int loop = 0;
> > > >    	u64 rspin_threshold = 0;
> > > > +	unsigned long nonspinnable = wlock ? RWSEM_WR_NONSPINNABLE
> > > > +					   : RWSEM_RD_NONSPINNABLE;
> > > >    	/* sem->wait_lock should not be held when doing optimistic spinning */
> > > >    	if (!osq_lock(&sem->osq))
> > > > @@ -836,14 +918,15 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> > > >    	for (;;) {
> > > >    		enum owner_state owner_state;
> > > > -		owner_state = rwsem_spin_on_owner(sem);
> > > > +		owner_state = rwsem_spin_on_owner(sem, nonspinnable);
> > > >    		if (!(owner_state & OWNER_SPINNABLE))
> > > >    			break;
> > > >    		/*
> > > >    		 * Try to acquire the lock
> > > >    		 */
> > > > -		taken = rwsem_try_write_lock_unqueued(sem);
> > > > +		taken = wlock ? rwsem_try_write_lock_unqueued(sem)
> > > > +			      : rwsem_try_read_lock_unqueued(sem);
> > > >    		if (taken)
> > > >    			break;
> > > > @@ -851,7 +934,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> > > >    		/*
> > > >    		 * Time-based reader-owned rwsem optimistic spinning
> > > >    		 */
> > > > -		if (owner_state == OWNER_READER) {
> > > > +		if (wlock && (owner_state == OWNER_READER)) {
> > > >    			/*
> > > >    			 * Re-initialize rspin_threshold every time when
> > > >    			 * the owner state changes from non-reader to reader.
> > > > @@ -860,7 +943,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> > > >    			 * the beginning of the 2nd reader phase.
> > > >    			 */
> > > >    			if (prev_owner_state != OWNER_READER) {
> > > > -				if (rwsem_test_oflags(sem, RWSEM_NONSPINNABLE))
> > > > +				if (rwsem_test_oflags(sem, nonspinnable))
> > > >    					break;
> > > >    				rspin_threshold = rwsem_rspin_threshold(sem);
> > > >    				loop = 0;
> > > > @@ -935,30 +1018,89 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> > > >    }
> > > >    /*
> > > > - * Clear the owner's RWSEM_NONSPINNABLE bit if it is set. This should
> > > > + * Clear the owner's RWSEM_WR_NONSPINNABLE bit if it is set. This should
> > > >     * only be called when the reader count reaches 0.
> > > > + *
> > > > + * This gives writers a better chance to acquire the rwsem before
> > > > + * readers when the rwsem has been held by readers for a relatively long
> > > > + * period of time. A race can happen in which an optimistic spinner has
> > > > + * just stolen the rwsem and set the owner, but clearing the
> > > > + * RWSEM_WR_NONSPINNABLE bit then does no harm anyway.
> > > >     */
> > > > -static inline void clear_nonspinnable(struct rw_semaphore *sem)
> > > > +static inline void clear_wr_nonspinnable(struct rw_semaphore *sem)
> > > >    {
> > > > -	if (unlikely(rwsem_test_oflags(sem, RWSEM_NONSPINNABLE)))
> > > > -		atomic_long_andnot(RWSEM_NONSPINNABLE, &sem->owner);
> > > > +	if (unlikely(rwsem_test_oflags(sem, RWSEM_WR_NONSPINNABLE)))
> > > > +		atomic_long_andnot(RWSEM_WR_NONSPINNABLE, &sem->owner);
> > > > +}
> > > > +
> > > > +/*
> > > > + * This function is called when the reader fails to acquire the lock via
> > > > + * optimistic spinning. In this case we will still attempt a trylock if
> > > > + * comparing the current rwsem state with the state at slowpath entry
> > > > + * indicates that the reader is still in a valid reader phase.
> > > > + * This happens when the following conditions are true:
> > > > + *
> > > > + * 1) The lock is currently reader owned, and
> > > > + * 2) The lock is previously not reader-owned or the last read owner changes.
> > > > + *
> > > > + * In the former case, we have transitioned from a writer phase to a
> > > > + * reader-phase while spinning. In the latter case, it means the reader
> > > > + * phase hasn't ended when we entered the optimistic spinning loop. In
> > > > + * both cases, the reader is eligible to acquire the lock. This is the
> > > > + * secondary path where a read lock is acquired optimistically.
> > > > + *
> > > > + * The reader nonspinnable bit wasn't set at the time of entry, or we
> > > > + * would not be here at all.
> > > > + */
> > > > +static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
> > > > +					      unsigned long last_rowner)
> > > > +{
> > > > +	unsigned long owner = atomic_long_read(&sem->owner);
> > > > +
> > > > +	if (!(owner & RWSEM_READER_OWNED))
> > > > +		return false;
> > > > +
> > > > +	if (((owner ^ last_rowner) & ~RWSEM_OWNER_FLAGS_MASK) &&
> > > > +	    rwsem_try_read_lock_unqueued(sem)) {
> > > > +		lockevent_inc(rwsem_opt_rlock2);
> > > > +		lockevent_add(rwsem_opt_fail, -1);
> > > > +		return true;
> > > > +	}
> > > > +	return false;
> > > > +}
> > > > +
> > > > +static inline bool rwsem_no_spinners(struct rw_semaphore *sem)
> > > > +{
> > > > +	return !osq_is_locked(&sem->osq);
> > > >    }
> > > >    #else
> > > > -static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > > +static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
> > > > +					   unsigned long nonspinnable)
> > > >    {
> > > >    	return false;
> > > >    }
> > > > -static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem)
> > > > +static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
> > > >    {
> > > >    	return false;
> > > >    }
> > > > -static inline void clear_nonspinnable(struct rw_semaphore *sem) { }
> > > > +static inline void clear_wr_nonspinnable(struct rw_semaphore *sem) { }
> > > > +
> > > > +static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
> > > > +					      unsigned long last_rowner)
> > > > +{
> > > > +	return false;
> > > > +}
> > > > +
> > > > +static inline bool rwsem_no_spinners(struct rw_semaphore *sem)
> > > > +{
> > > > +	return false;
> > > > +}
> > > >    static inline enum owner_state
> > > > -rwsem_spin_on_owner(struct rw_semaphore *sem)
> > > > +rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
> > > >    {
> > > >    	return OWNER_NONSPINNABLE;
> > > >    }
> > > > @@ -984,7 +1126,7 @@ static inline void rwsem_cond_wake_waiter(struct rw_semaphore *sem, long count,
> > > >    		wake_type = RWSEM_WAKE_READERS;
> > > >    	} else {
> > > >    		wake_type = RWSEM_WAKE_ANY;
> > > > -		clear_nonspinnable(sem);
> > > > +		clear_wr_nonspinnable(sem);
> > > >    	}
> > > >    	rwsem_mark_wake(sem, wake_type, wake_q);
> > > >    }
> > > > @@ -995,32 +1137,66 @@ static inline void rwsem_cond_wake_waiter(struct rw_semaphore *sem, long count,
> > > >    static struct rw_semaphore __sched *
> > > >    rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int state)
> > > >    {
> > > > -	long adjustment = -RWSEM_READER_BIAS;
> > > > +	long owner, adjustment = -RWSEM_READER_BIAS;
> > > >    	long rcnt = (count >> RWSEM_READER_SHIFT);
> > > >    	struct rwsem_waiter waiter;
> > > >    	DEFINE_WAKE_Q(wake_q);
> > > >    	/*
> > > >    	 * To prevent a constant stream of readers from starving a sleeping
> > > > -	 * waiter, don't attempt optimistic lock stealing if the lock is
> > > > -	 * currently owned by readers.
> > > > +	 * waiter, don't attempt optimistic spinning if the lock is currently
> > > > +	 * owned by readers.
> > > >    	 */
> > > > -	if ((atomic_long_read(&sem->owner) & RWSEM_READER_OWNED) &&
> > > > -	    (rcnt > 1) && !(count & RWSEM_WRITER_LOCKED))
> > > > +	owner = atomic_long_read(&sem->owner);
> > > > +	if ((owner & RWSEM_READER_OWNED) && (rcnt > 1) &&
> > > > +	   !(count & RWSEM_WRITER_LOCKED))
> > > >    		goto queue;
> > > >    	/*
> > > > -	 * Reader optimistic lock stealing.
> > > > +	 * Reader optimistic lock stealing
> > > > +	 *
> > > > +	 * We can take the read lock directly without doing
> > > > +	 * rwsem_optimistic_spin() if the conditions are right.
> > > > +	 * Also wake up other readers if it is the first reader.
> > > >    	 */
> > > > -	if (!(count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF))) {
> > > > +	if (!(count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF)) &&
> > > > +	    rwsem_no_spinners(sem)) {
> > > >    		rwsem_set_reader_owned(sem);
> > > >    		lockevent_inc(rwsem_rlock_steal);
> > > > +		if (rcnt == 1)
> > > > +			goto wake_readers;
> > > > +		return sem;
> > > > +	}
> > > > +
> > > > +#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
> > > > +	if (!rwsem_opt_rspin)
> > > > +		goto queue;
> > > > +#endif
> > > I would suggest changing that to

I will change it as suggested.
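
One detail I noticed while trying it (my understanding, please correct me if
I am wrong): IS_ENABLED() expands to a constant 0/1, but the rest of the
expression is still parsed, so rwsem_opt_rspin has to stay visible to the
compiler when CONFIG_RWSEM_SPIN_ON_OWNER is off. A small stub keeps the
combined check compilable and lets the branch be eliminated as dead code:

	#ifndef CONFIG_RWSEM_SPIN_ON_OWNER
	#define rwsem_opt_rspin	false
	#endif

	/* then, in rwsem_down_read_slowpath(): */
	if (!IS_ENABLED(CONFIG_RWSEM_SPIN_ON_OWNER) || !rwsem_opt_rspin)
		goto queue;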

> > > 
> > > if (!IS_ENABLED(CONFIG_RWSEM_SPIN_ON_OWNER) || !rwsem_opt_rspin)
> > >          goto queue;
> > > 
> > > > +	/*
> > > > +	 * Save the current read-owner of rwsem, if available, and the
> > > > +	 * reader nonspinnable bit.
> > > > +	 */
> > > > +	waiter.last_rowner = owner;
> > > > +	if (!(waiter.last_rowner & RWSEM_READER_OWNED))
> > > > +		waiter.last_rowner &= RWSEM_RD_NONSPINNABLE;
> > > > +
> > > > +	if (!rwsem_can_spin_on_owner(sem, RWSEM_RD_NONSPINNABLE))
> > > > +		goto queue;
> > > > +
> > > > +	/*
> > > > +	 * Undo read bias from down_read() and do optimistic spinning.
> > > > +	 */
> > > > +	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
> > > > +	adjustment = 0;
> > > > +	if (rwsem_optimistic_spin(sem, false)) {
> > > > +		/* rwsem_optimistic_spin() implies ACQUIRE on success */
> > > >    		/*
> > > > -		 * Wake up other readers in the wait queue if it is
> > > > -		 * the first reader.
> > > > +		 * Wake up other readers in the wait list if the front
> > > > +		 * waiter is a reader.
> > > >    		 */
> > > > -		if ((rcnt == 1) && (count & RWSEM_FLAG_WAITERS)) {
> > > > +wake_readers:
> > > > +		if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
> > > >    			raw_spin_lock_irq(&sem->wait_lock);
> > > >    			if (!list_empty(&sem->wait_list))
> > > >    				rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED,
> > > > @@ -1029,6 +1205,9 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    			wake_up_q(&wake_q);
> > > >    		}
> > > >    		return sem;
> > > > +	} else if (rwsem_reader_phase_trylock(sem, waiter.last_rowner)) {
> > > > +		/* rwsem_reader_phase_trylock() implies ACQUIRE on success */
> > > > +		return sem;
> > > >    	}
> > > >    queue:
> > > > @@ -1045,7 +1224,8 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    		 * immediately as its RWSEM_READER_BIAS has already been set
> > > >    		 * in the count.
> > > >    		 */
> > > > -		if (!(atomic_long_read(&sem->count) & RWSEM_WRITER_MASK)) {
> > > > +		if (adjustment && !(atomic_long_read(&sem->count) &
> > > > +		     RWSEM_WRITER_MASK)) {
> > > >    			/* Provide lock ACQUIRE */
> > > >    			smp_acquire__after_ctrl_dep();
> > > >    			raw_spin_unlock_irq(&sem->wait_lock);
> > > > @@ -1058,7 +1238,10 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    	rwsem_add_waiter(sem, &waiter);
> > > >    	/* we're now waiting on the lock, but no longer actively locking */
> > > > -	count = atomic_long_add_return(adjustment, &sem->count);
> > > > +	if (adjustment)
> > > > +		count = atomic_long_add_return(adjustment, &sem->count);
> > > > +	else
> > > > +		count = atomic_long_read(&sem->count);
> > > >    	rwsem_cond_wake_waiter(sem, count, &wake_q);
> > > >    	raw_spin_unlock_irq(&sem->wait_lock);
> > > > @@ -1100,21 +1283,43 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
> > > >    	return ERR_PTR(-EINTR);
> > > >    }
> > > > +/*
> > > > + * This function is called by the write lock owner. So the owner value
> > > > + * won't get changed by others.
> > > > + */
> > > > +static inline void rwsem_disable_reader_optspin(struct rw_semaphore *sem,
> > > > +						bool disable)
> > > > +{
> > > > +	if (unlikely(disable)) {
> > > > +		atomic_long_or(RWSEM_RD_NONSPINNABLE, &sem->owner);
> > > > +		lockevent_inc(rwsem_opt_norspin);
> > > > +	}
> > > > +}
> > > > +
> > > >    /*
> > > >     * Wait until we successfully acquire the write lock
> > > >     */
> > > >    static struct rw_semaphore __sched *
> > > >    rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> > > >    {
> > > > +	bool disable_rspin;
> > > >    	struct rwsem_waiter waiter;
> > > >    	DEFINE_WAKE_Q(wake_q);
> > > >    	/* do optimistic spinning and steal lock if possible */
> > > > -	if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
> > > > +	if (rwsem_can_spin_on_owner(sem, RWSEM_WR_NONSPINNABLE) &&
> > > > +	    rwsem_optimistic_spin(sem, true)) {
> > > >    		/* rwsem_optimistic_spin() implies ACQUIRE on success */
> > > >    		return sem;
> > > >    	}
> > > > +	/*
> > > > +	 * Disable reader optimistic spinning for this rwsem after
> > > > +	 * acquiring the write lock when the nonspinnable bits are
> > > > +	 * observed to be set.
> > > > +	 */
> > > > +	disable_rspin = atomic_long_read(&sem->owner) & RWSEM_NONSPINNABLE;
> > > > +
> > > >    	/*
> > > >    	 * Optimistic spinning failed, proceed to the slowpath
> > > >    	 * and block until we can acquire the sem.
> > > > @@ -1170,7 +1375,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> > > >    		if (waiter.handoff_set) {
> > > >    			enum owner_state owner_state;
> > > > -			owner_state = rwsem_spin_on_owner(sem);
> > > > +			owner_state = rwsem_spin_on_owner(sem, RWSEM_NONSPINNABLE);
> > > >    			if (owner_state == OWNER_NULL)
> > > >    				goto trylock_again;
> > > >    		}
> > > > @@ -1182,6 +1387,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> > > >    		raw_spin_lock_irq(&sem->wait_lock);
> > > >    	}
> > > >    	__set_current_state(TASK_RUNNING);
> > > > +	rwsem_disable_reader_optspin(sem, disable_rspin);
> > > >    	raw_spin_unlock_irq(&sem->wait_lock);
> > > >    	lockevent_inc(rwsem_wlock);
> > > >    	trace_contention_end(sem, 0);
> > > > @@ -1348,7 +1554,7 @@ static inline void __up_read(struct rw_semaphore *sem)
> > > >    	DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);
> > > >    	if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
> > > >    		      RWSEM_FLAG_WAITERS)) {
> > > > -		clear_nonspinnable(sem);
> > > > +		clear_wr_nonspinnable(sem);
> > > >    		rwsem_wake(sem);
> > > >    	}
> > > >    	preempt_enable();
> > > I don't have a strong feeling for or against it. It does provide a modest
> > > improvement in some use cases, but it does make the code a bit more complex
> > > and harder to understand.
> > > 
> > > Cheers,
> > > Longman
> > > 

On mobile devices, a ~4% difference in application launch time is not small;
we are working hard for every improvement of that size.
In my opinion, reader optimistic spinning is still helpful under light workloads.
Is this patch acceptable? If so, I will send a new version addressing your comments.
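
For the documentation wording, one way for users to judge whether a given
workload benefits (assuming I understand the lockevent infrastructure
correctly) is to build with CONFIG_LOCK_EVENT_COUNTS=y and compare the
rwsem_opt_* counters under /sys/kernel/debug/lock_event_counts/ with the
option on and off.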

Thanks,
Bongkyu

> > > 
> > > 
> 
> 


end of thread, other threads:[~2023-06-13  4:36 UTC | newest]

Thread overview: 5+ messages
-- links below jump to the message on this page --
     [not found] <CGME20230531003446epcas2p1fc55e0439a9c667685d495cd5f5b2e93@epcas2p1.samsung.com>
2023-05-31  0:34 ` [PATCH] locking/rwsem: Optionally re-enable reader optimistic spinning Bongkyu Kim
2023-05-31 16:45   ` kernel test robot
2023-06-01  7:10   ` kernel test robot
     [not found]   ` <d5b65c4b-7232-e5a7-27ae-b8efab037396@redhat.com>
     [not found]     ` <20230601012246.GA8364@KORCO045595.samsungds.net>
     [not found]       ` <8e11066a-017f-a190-39de-17f92128e42f@redhat.com>
2023-06-02  0:46         ` Bongkyu Kim
2023-06-13  4:36         ` Bongkyu Kim
