rcu.vger.kernel.org archive mirror
* [PATCH rcu 01/14] rcu: Simplify rcu_init_nohz() cpumask handling
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Zhen Lei, Joel Fernandes,
	Frederic Weisbecker, Paul E . McKenney

From: Zhen Lei <thunder.leizhen@huawei.com>

In kernels built with either CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y or
CONFIG_NO_HZ_FULL=y, additional CPUs must be added to rcu_nocb_mask.
Except that kernels booted without the rcu_nocbs= kernel parameter will
not have allocated rcu_nocb_mask.  And the current rcu_init_nohz()
function uses its need_rcu_nocb_mask and offload_all local variables to
track the rcu_nocb and nohz_full state.

But there is a much simpler approach, namely creating a cpumask pointer
to track the default and then using cpumask_available() to check the
rcu_nocb_mask state.  This commit takes this approach, thereby simplifying
and shortening the rcu_init_nohz() function.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_nocb.h | 34 +++++++++++-----------------------
 1 file changed, 11 insertions(+), 23 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 0a5f0ef414845..ce526cc2791ca 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1210,45 +1210,33 @@ EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
 void __init rcu_init_nohz(void)
 {
 	int cpu;
-	bool need_rcu_nocb_mask = false;
-	bool offload_all = false;
 	struct rcu_data *rdp;
-
-#if defined(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL)
-	if (!rcu_state.nocb_is_setup) {
-		need_rcu_nocb_mask = true;
-		offload_all = true;
-	}
-#endif /* #if defined(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) */
+	const struct cpumask *cpumask = NULL;
 
 #if defined(CONFIG_NO_HZ_FULL)
-	if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask)) {
-		need_rcu_nocb_mask = true;
-		offload_all = false; /* NO_HZ_FULL has its own mask. */
-	}
-#endif /* #if defined(CONFIG_NO_HZ_FULL) */
+	if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask))
+		cpumask = tick_nohz_full_mask;
+#endif
+
+	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) &&
+	    !rcu_state.nocb_is_setup && !cpumask)
+		cpumask = cpu_possible_mask;
 
-	if (need_rcu_nocb_mask) {
+	if (cpumask) {
 		if (!cpumask_available(rcu_nocb_mask)) {
 			if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
 				pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
 				return;
 			}
 		}
+
+		cpumask_or(rcu_nocb_mask, rcu_nocb_mask, cpumask);
 		rcu_state.nocb_is_setup = true;
 	}
 
 	if (!rcu_state.nocb_is_setup)
 		return;
 
-#if defined(CONFIG_NO_HZ_FULL)
-	if (tick_nohz_full_running)
-		cpumask_or(rcu_nocb_mask, rcu_nocb_mask, tick_nohz_full_mask);
-#endif /* #if defined(CONFIG_NO_HZ_FULL) */
-
-	if (offload_all)
-		cpumask_setall(rcu_nocb_mask);
-
 	if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
 		pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
 		cpumask_and(rcu_nocb_mask, cpu_possible_mask,
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 02/14] rcu: Fix late wakeup when flush of bypass cblist happens
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Frederic Weisbecker, Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

When the bypass cblist gets too big or its timeout has occurred, it is
flushed into the main cblist.  However, the bypass timer is still
running and will eventually expire and wake the GP thread.

Since we are going to use the bypass cblist for lazy CBs, do the wakeup
as soon as the flush of the "too big or too long" bypass list happens.
Otherwise, long delays can happen for callbacks which get promoted from
lazy to non-lazy.

This is a good thing to do anyway (regardless of future lazy patches),
since it makes the behavior consistent with the other code paths where
flushing into the ->cblist quickly brings the GP kthread out of its
sleeping state.

[ Frederic Weisbecker: Changes to avoid unnecessary GP-thread wakeups plus
		    comment changes. ]

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_nocb.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index ce526cc2791ca..f77a6d7e13564 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -433,8 +433,9 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
 	    ncbs >= qhimark) {
 		rcu_nocb_lock(rdp);
+		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+
 		if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
-			*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 			if (*was_alldone)
 				trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 						    TPS("FirstQ"));
@@ -447,7 +448,12 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 			rcu_advance_cbs_nowake(rdp->mynode, rdp);
 			rdp->nocb_gp_adv_time = j;
 		}
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+
+		// The flush succeeded and we moved CBs into the regular list.
+		// Don't wait for the wake up timer as it may be too far ahead.
+		// Wake up the GP thread now instead, if the cblist was empty.
+		__call_rcu_nocb_wake(rdp, *was_alldone, flags);
+
 		return true; // Callback already enqueued.
 	}
 
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 03/14] rcu: Fix missing nocb gp wake on rcu_barrier()
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Frederic Weisbecker,
	Joel Fernandes, Paul E . McKenney

From: Frederic Weisbecker <frederic@kernel.org>

In preparation for RCU lazy changes, wake up the RCU nocb gp thread if
needed after an entrain.  This change prevents the RCU barrier callback
from waiting in the queue for several seconds before the lazy callbacks
in front of it are serviced.

Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c      | 11 +++++++++++
 kernel/rcu/tree.h      |  1 +
 kernel/rcu/tree_nocb.h |  5 +++++
 3 files changed, 17 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 6bb8e72bc8151..fb7a1b95af71e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3894,6 +3894,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 {
 	unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
 	unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
+	bool wake_nocb = false;
+	bool was_alldone = false;
 
 	lockdep_assert_held(&rcu_state.barrier_lock);
 	if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
@@ -3902,7 +3904,14 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 	rdp->barrier_head.func = rcu_barrier_callback;
 	debug_rcu_head_queue(&rdp->barrier_head);
 	rcu_nocb_lock(rdp);
+	/*
+	 * Flush bypass and wakeup rcuog if we add callbacks to an empty regular
+	 * queue. This way we don't wait for bypass timer that can reach seconds
+	 * if it's fully lazy.
+	 */
+	was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
 	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
+	wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
 	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
 		atomic_inc(&rcu_state.barrier_cpu_count);
 	} else {
@@ -3910,6 +3919,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 		rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
 	}
 	rcu_nocb_unlock(rdp);
+	if (wake_nocb)
+		wake_nocb_gp(rdp, false);
 	smp_store_release(&rdp->barrier_seq_snap, gseq);
 }
 
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index d4a97e40ea9c3..925dd98f8b23b 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				  unsigned long j);
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index f77a6d7e13564..094fd454b6c38 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
 {
 }
 
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
+{
+	return false;
+}
+
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				  unsigned long j)
 {
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 04/14] rcu: Make call_rcu() lazy to save power
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul McKenney, Frederic Weisbecker

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

Implement timer-based RCU callback batching (also known as lazy
callbacks).  With this we save about 5-10% of the power consumed due
to RCU requests that happen when the system is lightly loaded or idle.

By default, all asynchronous callbacks (queued via call_rcu()) are
marked lazy.  An alternate API, call_rcu_flush(), is provided for the
few users, for example synchronize_rcu(), that need the old behavior.

The batch is flushed whenever a certain amount of time has passed, or
the batch on a particular CPU grows too big.  Memory pressure will also
flush it, as of a later patch in this series.

To handle several corner cases automagically (such as rcu_barrier() and
hotplug), we re-use the bypass lists, which were originally introduced
to address lock contention, to handle lazy CBs as well.  The bypass
list length includes the lazy CB length.  A separate lazy CB length
counter is also introduced to keep track of the number of lazy CBs.
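
As a usage illustration (not part of this patch; struct my_obj and
free_obj_cb() are invented for the example), a caller chooses between
the two APIs as follows:

	static void free_obj_cb(struct rcu_head *rhp)
	{
		struct my_obj *obj = container_of(rhp, struct my_obj, rh);

		kfree(obj);
	}

	/* Default: lazy, may be batched for up to LAZY_FLUSH_JIFFIES. */
	call_rcu(&obj->rh, free_obj_cb);

	/* Latency-sensitive paths that need the grace period soon. */
	call_rcu_flush(&obj->rh, free_obj_cb);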

[ paulmck: Fix formatting of inline call_rcu_lazy() definition. ]

Suggested-by: Paul McKenney <paulmck@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/rcupdate.h |   9 +++
 kernel/rcu/Kconfig       |   8 ++
 kernel/rcu/rcu.h         |   8 ++
 kernel/rcu/tiny.c        |   2 +-
 kernel/rcu/tree.c        | 129 ++++++++++++++++++++-----------
 kernel/rcu/tree.h        |  11 ++-
 kernel/rcu/tree_exp.h    |   2 +-
 kernel/rcu/tree_nocb.h   | 159 +++++++++++++++++++++++++++++++--------
 8 files changed, 246 insertions(+), 82 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 08605ce7379d7..f6288c1124425 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -108,6 +108,15 @@ static inline int rcu_preempt_depth(void)
 
 #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
 
+#ifdef CONFIG_RCU_LAZY
+void call_rcu_flush(struct rcu_head *head, rcu_callback_t func);
+#else
+static inline void call_rcu_flush(struct rcu_head *head, rcu_callback_t func)
+{
+	call_rcu(head, func);
+}
+#endif
+
 /* Internal to kernel */
 void rcu_init(void);
 extern int rcu_scheduler_active;
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index d471d22a5e21b..d78f6181c8aad 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -311,4 +311,12 @@ config TASKS_TRACE_RCU_READ_MB
 	  Say N here if you hate read-side memory barriers.
 	  Take the default if you are unsure.
 
+config RCU_LAZY
+	bool "RCU callback lazy invocation functionality"
+	depends on RCU_NOCB_CPU
+	default n
+	help
+	  To save power, batch RCU callbacks and flush after delay, memory
+	  pressure, or callback list growing too big.
+
 endmenu # "RCU Subsystem"
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index be5979da07f59..65704cbc9df7b 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -474,6 +474,14 @@ enum rcutorture_type {
 	INVALID_RCU_FLAVOR
 };
 
+#if defined(CONFIG_RCU_LAZY)
+unsigned long rcu_lazy_get_jiffies_till_flush(void);
+void rcu_lazy_set_jiffies_till_flush(unsigned long j);
+#else
+static inline unsigned long rcu_lazy_get_jiffies_till_flush(void) { return 0; }
+static inline void rcu_lazy_set_jiffies_till_flush(unsigned long j) { }
+#endif
+
 #if defined(CONFIG_TREE_RCU)
 void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
 			    unsigned long *gp_seq);
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index a33a8d4942c37..810479cf17bae 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -44,7 +44,7 @@ static struct rcu_ctrlblk rcu_ctrlblk = {
 
 void rcu_barrier(void)
 {
-	wait_rcu_gp(call_rcu);
+	wait_rcu_gp(call_rcu_flush);
 }
 EXPORT_SYMBOL(rcu_barrier);
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index fb7a1b95af71e..6eaa020a9d289 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2728,47 +2728,8 @@ static void check_cb_ovld(struct rcu_data *rdp)
 	raw_spin_unlock_rcu_node(rnp);
 }
 
-/**
- * call_rcu() - Queue an RCU callback for invocation after a grace period.
- * @head: structure to be used for queueing the RCU updates.
- * @func: actual callback function to be invoked after the grace period
- *
- * The callback function will be invoked some time after a full grace
- * period elapses, in other words after all pre-existing RCU read-side
- * critical sections have completed.  However, the callback function
- * might well execute concurrently with RCU read-side critical sections
- * that started after call_rcu() was invoked.
- *
- * RCU read-side critical sections are delimited by rcu_read_lock()
- * and rcu_read_unlock(), and may be nested.  In addition, but only in
- * v5.0 and later, regions of code across which interrupts, preemption,
- * or softirqs have been disabled also serve as RCU read-side critical
- * sections.  This includes hardware interrupt handlers, softirq handlers,
- * and NMI handlers.
- *
- * Note that all CPUs must agree that the grace period extended beyond
- * all pre-existing RCU read-side critical section.  On systems with more
- * than one CPU, this means that when "func()" is invoked, each CPU is
- * guaranteed to have executed a full memory barrier since the end of its
- * last RCU read-side critical section whose beginning preceded the call
- * to call_rcu().  It also means that each CPU executing an RCU read-side
- * critical section that continues beyond the start of "func()" must have
- * executed a memory barrier after the call_rcu() but before the beginning
- * of that RCU read-side critical section.  Note that these guarantees
- * include CPUs that are offline, idle, or executing in user mode, as
- * well as CPUs that are executing in the kernel.
- *
- * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
- * resulting RCU callback function "func()", then both CPU A and CPU B are
- * guaranteed to execute a full memory barrier during the time interval
- * between the call to call_rcu() and the invocation of "func()" -- even
- * if CPU A and CPU B are the same CPU (but again only if the system has
- * more than one CPU).
- *
- * Implementation of these memory-ordering guarantees is described here:
- * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.
- */
-void call_rcu(struct rcu_head *head, rcu_callback_t func)
+static void
+__call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy)
 {
 	static atomic_t doublefrees;
 	unsigned long flags;
@@ -2809,7 +2770,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 	}
 
 	check_cb_ovld(rdp);
-	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
+	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
 		return; // Enqueued onto ->nocb_bypass, so just leave.
 	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
 	rcu_segcblist_enqueue(&rdp->cblist, head);
@@ -2831,8 +2792,84 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 		local_irq_restore(flags);
 	}
 }
-EXPORT_SYMBOL_GPL(call_rcu);
 
+#ifdef CONFIG_RCU_LAZY
+/**
+ * call_rcu_flush() - Queue RCU callback for invocation after grace period, and
+ * flush all lazy callbacks (including the new one) to the main ->cblist while
+ * doing so.
+ *
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all pre-existing RCU read-side
+ * critical sections have completed.
+ *
+ * Use this API instead of call_rcu() if you don't want the callback to be
+ * invoked after very long periods of time, which can happen on systems without
+ * memory pressure and on systems which are lightly loaded or mostly idle.
+ * This function will cause callbacks to be invoked sooner than later at the
+ * expense of extra power. Other than that, this function is identical to, and
+ * reuses call_rcu()'s logic. Refer to call_rcu() for more details about memory
+ * ordering and other functionality.
+ */
+void call_rcu_flush(struct rcu_head *head, rcu_callback_t func)
+{
+	return __call_rcu_common(head, func, false);
+}
+EXPORT_SYMBOL_GPL(call_rcu_flush);
+#endif
+
+/**
+ * call_rcu() - Queue an RCU callback for invocation after a grace period.
+ * By default the callbacks are 'lazy' and are kept hidden from the main
+ * ->cblist to prevent starting of grace periods too soon.
+ * If you desire grace periods to start very soon, use call_rcu_flush().
+ *
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all pre-existing RCU read-side
+ * critical sections have completed.  However, the callback function
+ * might well execute concurrently with RCU read-side critical sections
+ * that started after call_rcu() was invoked.
+ *
+ * RCU read-side critical sections are delimited by rcu_read_lock()
+ * and rcu_read_unlock(), and may be nested.  In addition, but only in
+ * v5.0 and later, regions of code across which interrupts, preemption,
+ * or softirqs have been disabled also serve as RCU read-side critical
+ * sections.  This includes hardware interrupt handlers, softirq handlers,
+ * and NMI handlers.
+ *
+ * Note that all CPUs must agree that the grace period extended beyond
+ * all pre-existing RCU read-side critical section.  On systems with more
+ * than one CPU, this means that when "func()" is invoked, each CPU is
+ * guaranteed to have executed a full memory barrier since the end of its
+ * last RCU read-side critical section whose beginning preceded the call
+ * to call_rcu().  It also means that each CPU executing an RCU read-side
+ * critical section that continues beyond the start of "func()" must have
+ * executed a memory barrier after the call_rcu() but before the beginning
+ * of that RCU read-side critical section.  Note that these guarantees
+ * include CPUs that are offline, idle, or executing in user mode, as
+ * well as CPUs that are executing in the kernel.
+ *
+ * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
+ * resulting RCU callback function "func()", then both CPU A and CPU B are
+ * guaranteed to execute a full memory barrier during the time interval
+ * between the call to call_rcu() and the invocation of "func()" -- even
+ * if CPU A and CPU B are the same CPU (but again only if the system has
+ * more than one CPU).
+ *
+ * Implementation of these memory-ordering guarantees is described here:
+ * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.
+ */
+void call_rcu(struct rcu_head *head, rcu_callback_t func)
+{
+	return __call_rcu_common(head, func, true);
+}
+EXPORT_SYMBOL_GPL(call_rcu);
 
 /* Maximum number of jiffies to wait before draining a batch. */
 #define KFREE_DRAIN_JIFFIES (5 * HZ)
@@ -3507,7 +3544,7 @@ void synchronize_rcu(void)
 		if (rcu_gp_is_expedited())
 			synchronize_rcu_expedited();
 		else
-			wait_rcu_gp(call_rcu);
+			wait_rcu_gp(call_rcu_flush);
 		return;
 	}
 
@@ -3910,7 +3947,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 	 * if it's fully lazy.
 	 */
 	was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
-	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
 	wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
 	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
 		atomic_inc(&rcu_state.barrier_cpu_count);
@@ -4336,7 +4373,7 @@ void rcutree_migrate_callbacks(int cpu)
 	my_rdp = this_cpu_ptr(&rcu_data);
 	my_rnp = my_rdp->mynode;
 	rcu_nocb_lock(my_rdp); /* irqs already disabled. */
-	WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies));
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false));
 	raw_spin_lock_rcu_node(my_rnp); /* irqs already disabled. */
 	/* Leverage recent GPs and set GP for new callbacks. */
 	needwake = rcu_advance_cbs(my_rnp, rdp) ||
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 925dd98f8b23b..fcb5d696eb170 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -263,14 +263,16 @@ struct rcu_data {
 	unsigned long last_fqs_resched;	/* Time of last rcu_resched(). */
 	unsigned long last_sched_clock;	/* Jiffies of last rcu_sched_clock_irq(). */
 
+	long lazy_len;			/* Length of buffered lazy callbacks. */
 	int cpu;
 };
 
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOCB_WAKE_NOT	0
 #define RCU_NOCB_WAKE_BYPASS	1
-#define RCU_NOCB_WAKE		2
-#define RCU_NOCB_WAKE_FORCE	3
+#define RCU_NOCB_WAKE_LAZY	2
+#define RCU_NOCB_WAKE		3
+#define RCU_NOCB_WAKE_FORCE	4
 
 #define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500))
 					/* For jiffies_till_first_fqs and */
@@ -441,9 +443,10 @@ static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
 static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				  unsigned long j);
+				  unsigned long j, bool lazy);
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				bool *was_alldone, unsigned long flags);
+				bool *was_alldone, unsigned long flags,
+				bool lazy);
 static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
 				 unsigned long flags);
 static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level);
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 18e9b4cd78ef8..5cac056007982 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -937,7 +937,7 @@ void synchronize_rcu_expedited(void)
 
 	/* If expedited grace periods are prohibited, fall back to normal. */
 	if (rcu_gp_is_normal()) {
-		wait_rcu_gp(call_rcu);
+		wait_rcu_gp(call_rcu_flush);
 		return;
 	}
 
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 094fd454b6c38..d6e4c076b0515 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -256,6 +256,31 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
 	return __wake_nocb_gp(rdp_gp, rdp, force, flags);
 }
 
+/*
+ * LAZY_FLUSH_JIFFIES decides the maximum amount of time that
+ * can elapse before lazy callbacks are flushed. Lazy callbacks
+ * could be flushed much earlier for a number of other reasons
+ * however, LAZY_FLUSH_JIFFIES will ensure no lazy callbacks are
+ * left unsubmitted to RCU after those many jiffies.
+ */
+#define LAZY_FLUSH_JIFFIES (10 * HZ)
+static unsigned long jiffies_till_flush = LAZY_FLUSH_JIFFIES;
+
+#ifdef CONFIG_RCU_LAZY
+// To be called only from test code.
+void rcu_lazy_set_jiffies_till_flush(unsigned long jif)
+{
+	jiffies_till_flush = jif;
+}
+EXPORT_SYMBOL(rcu_lazy_set_jiffies_till_flush);
+
+unsigned long rcu_lazy_get_jiffies_till_flush(void)
+{
+	return jiffies_till_flush;
+}
+EXPORT_SYMBOL(rcu_lazy_get_jiffies_till_flush);
+#endif
+
 /*
  * Arrange to wake the GP kthread for this NOCB group at some future
  * time when it is safe to do so.
@@ -269,10 +294,14 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
 	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
 
 	/*
-	 * Bypass wakeup overrides previous deferments. In case
-	 * of callback storm, no need to wake up too early.
+	 * Bypass wakeup overrides previous deferments. In case of
+	 * callback storms, no need to wake up too early.
 	 */
-	if (waketype == RCU_NOCB_WAKE_BYPASS) {
+	if (waketype == RCU_NOCB_WAKE_LAZY &&
+	    rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT) {
+		mod_timer(&rdp_gp->nocb_timer, jiffies + jiffies_till_flush);
+		WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+	} else if (waketype == RCU_NOCB_WAKE_BYPASS) {
 		mod_timer(&rdp_gp->nocb_timer, jiffies + 2);
 		WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
 	} else {
@@ -293,10 +322,13 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
  * proves to be initially empty, just return false because the no-CB GP
  * kthread may need to be awakened in this case.
  *
+ * Return true if there was something to be flushed and it succeeded, otherwise
+ * false.
+ *
  * Note that this function always returns true if rhp is NULL.
  */
 static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				     unsigned long j)
+				     unsigned long j, bool lazy)
 {
 	struct rcu_cblist rcl;
 
@@ -310,7 +342,20 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	/* Note: ->cblist.len already accounts for ->nocb_bypass contents. */
 	if (rhp)
 		rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
-	rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
+
+	/*
+	 * If the new CB requested was a lazy one, queue it onto the main
+	 * ->cblist so we can take advantage of a sooner grade period.
+	 */
+	if (lazy && rhp) {
+		rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL);
+		rcu_cblist_enqueue(&rcl, rhp);
+		WRITE_ONCE(rdp->lazy_len, 0);
+	} else {
+		rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
+		WRITE_ONCE(rdp->lazy_len, 0);
+	}
+
 	rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
 	WRITE_ONCE(rdp->nocb_bypass_first, j);
 	rcu_nocb_bypass_unlock(rdp);
@@ -326,13 +371,13 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
  * Note that this function always returns true if rhp is NULL.
  */
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				  unsigned long j)
+				  unsigned long j, bool lazy)
 {
 	if (!rcu_rdp_is_offloaded(rdp))
 		return true;
 	rcu_lockdep_assert_cblist_protected(rdp);
 	rcu_nocb_bypass_lock(rdp);
-	return rcu_nocb_do_flush_bypass(rdp, rhp, j);
+	return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy);
 }
 
 /*
@@ -345,7 +390,7 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
 	if (!rcu_rdp_is_offloaded(rdp) ||
 	    !rcu_nocb_bypass_trylock(rdp))
 		return;
-	WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j));
+	WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false));
 }
 
 /*
@@ -367,12 +412,14 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
  * there is only one CPU in operation.
  */
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				bool *was_alldone, unsigned long flags)
+				bool *was_alldone, unsigned long flags,
+				bool lazy)
 {
 	unsigned long c;
 	unsigned long cur_gp_seq;
 	unsigned long j = jiffies;
 	long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+	bool bypass_is_lazy = (ncbs == READ_ONCE(rdp->lazy_len));
 
 	lockdep_assert_irqs_disabled();
 
@@ -417,25 +464,29 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	// If there hasn't yet been all that many ->cblist enqueues
 	// this jiffy, tell the caller to enqueue onto ->cblist.  But flush
 	// ->nocb_bypass first.
-	if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) {
+	// Lazy CBs throttle this back and do immediate bypass queuing.
+	if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy && !lazy) {
 		rcu_nocb_lock(rdp);
 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 		if (*was_alldone)
 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 					    TPS("FirstQ"));
-		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
+
+		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false));
 		WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
 		return false; // Caller must enqueue the callback.
 	}
 
 	// If ->nocb_bypass has been used too long or is too full,
 	// flush ->nocb_bypass to ->cblist.
-	if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
+	if ((ncbs && !bypass_is_lazy && j != READ_ONCE(rdp->nocb_bypass_first)) ||
+	    (ncbs &&  bypass_is_lazy &&
+	     (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush))) ||
 	    ncbs >= qhimark) {
 		rcu_nocb_lock(rdp);
 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 
-		if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
+		if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy)) {
 			if (*was_alldone)
 				trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 						    TPS("FirstQ"));
@@ -463,13 +514,24 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 	ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
 	rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
 	rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
+
+	if (lazy)
+		WRITE_ONCE(rdp->lazy_len, rdp->lazy_len + 1);
+
 	if (!ncbs) {
 		WRITE_ONCE(rdp->nocb_bypass_first, j);
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
 	}
 	rcu_nocb_bypass_unlock(rdp);
 	smp_mb(); /* Order enqueue before wake. */
-	if (ncbs) {
+	// A wake up of the grace period kthread or timer adjustment
+	// needs to be done only if:
+	// 1. Bypass list was fully empty before (this is the first
+	//    bypass list entry), or:
+	// 2. Both of these conditions are met:
+	//    a. The bypass list previously had only lazy CBs, and:
+	//    b. The new CB is non-lazy.
+	if (ncbs && (!bypass_is_lazy || lazy)) {
 		local_irq_restore(flags);
 	} else {
 		// No-CBs GP kthread might be indefinitely asleep, if so, wake.
@@ -497,8 +559,10 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 				 unsigned long flags)
 				 __releases(rdp->nocb_lock)
 {
+	long bypass_len;
 	unsigned long cur_gp_seq;
 	unsigned long j;
+	long lazy_len;
 	long len;
 	struct task_struct *t;
 
@@ -512,9 +576,16 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 	}
 	// Need to actually to a wakeup.
 	len = rcu_segcblist_n_cbs(&rdp->cblist);
+	bypass_len = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+	lazy_len = READ_ONCE(rdp->lazy_len);
 	if (was_alldone) {
 		rdp->qlen_last_fqs_check = len;
-		if (!irqs_disabled_flags(flags)) {
+		// Only lazy CBs in bypass list
+		if (lazy_len && bypass_len == lazy_len) {
+			rcu_nocb_unlock_irqrestore(rdp, flags);
+			wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_LAZY,
+					   TPS("WakeLazy"));
+		} else if (!irqs_disabled_flags(flags)) {
 			/* ... if queue was empty ... */
 			rcu_nocb_unlock_irqrestore(rdp, flags);
 			wake_nocb_gp(rdp, false);
@@ -605,12 +676,12 @@ static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu)
 static void nocb_gp_wait(struct rcu_data *my_rdp)
 {
 	bool bypass = false;
-	long bypass_ncbs;
 	int __maybe_unused cpu = my_rdp->cpu;
 	unsigned long cur_gp_seq;
 	unsigned long flags;
 	bool gotcbs = false;
 	unsigned long j = jiffies;
+	bool lazy = false;
 	bool needwait_gp = false; // This prevents actual uninitialized use.
 	bool needwake;
 	bool needwake_gp;
@@ -640,24 +711,43 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	 * won't be ignored for long.
 	 */
 	list_for_each_entry(rdp, &my_rdp->nocb_head_rdp, nocb_entry_rdp) {
+		long bypass_ncbs;
+		bool flush_bypass = false;
+		long lazy_ncbs;
+
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
 		rcu_nocb_lock_irqsave(rdp, flags);
 		lockdep_assert_held(&rdp->nocb_lock);
 		bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
-		if (bypass_ncbs &&
+		lazy_ncbs = READ_ONCE(rdp->lazy_len);
+
+		if (bypass_ncbs && (lazy_ncbs == bypass_ncbs) &&
+		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush) ||
+		     bypass_ncbs > 2 * qhimark)) {
+			flush_bypass = true;
+		} else if (bypass_ncbs && (lazy_ncbs != bypass_ncbs) &&
 		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
 		     bypass_ncbs > 2 * qhimark)) {
-			// Bypass full or old, so flush it.
-			(void)rcu_nocb_try_flush_bypass(rdp, j);
-			bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+			flush_bypass = true;
 		} else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
 			rcu_nocb_unlock_irqrestore(rdp, flags);
 			continue; /* No callbacks here, try next. */
 		}
+
+		if (flush_bypass) {
+			// Bypass full or old, so flush it.
+			(void)rcu_nocb_try_flush_bypass(rdp, j);
+			bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+			lazy_ncbs = READ_ONCE(rdp->lazy_len);
+		}
+
 		if (bypass_ncbs) {
 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
-					    TPS("Bypass"));
-			bypass = true;
+					    bypass_ncbs == lazy_ncbs ? TPS("Lazy") : TPS("Bypass"));
+			if (bypass_ncbs == lazy_ncbs)
+				lazy = true;
+			else
+				bypass = true;
 		}
 		rnp = rdp->mynode;
 
@@ -705,12 +795,20 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 	my_rdp->nocb_gp_gp = needwait_gp;
 	my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
 
-	if (bypass && !rcu_nocb_poll) {
-		// At least one child with non-empty ->nocb_bypass, so set
-		// timer in order to avoid stranding its callbacks.
-		wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
-				   TPS("WakeBypassIsDeferred"));
+	// At least one child with non-empty ->nocb_bypass, so set
+	// timer in order to avoid stranding its callbacks.
+	if (!rcu_nocb_poll) {
+		// If bypass list only has lazy CBs. Add a deferred lazy wake up.
+		if (lazy && !bypass) {
+			wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_LAZY,
+					TPS("WakeLazyIsDeferred"));
+		// Otherwise add a deferred bypass wake up.
+		} else if (bypass) {
+			wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
+					TPS("WakeBypassIsDeferred"));
+		}
 	}
+
 	if (rcu_nocb_poll) {
 		/* Polling, so trace if first poll in the series. */
 		if (gotcbs)
@@ -1036,7 +1134,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	 * return false, which means that future calls to rcu_nocb_try_bypass()
 	 * will refuse to put anything into the bypass.
 	 */
-	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
 	/*
 	 * Start with invoking rcu_core() early. This way if the current thread
 	 * happens to preempt an ongoing call to rcu_core() in the middle,
@@ -1278,6 +1376,7 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
 	raw_spin_lock_init(&rdp->nocb_gp_lock);
 	timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0);
 	rcu_cblist_init(&rdp->nocb_bypass);
+	WRITE_ONCE(rdp->lazy_len, 0);
 	mutex_init(&rdp->nocb_gp_kthread_mutex);
 }
 
@@ -1564,13 +1663,13 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
 }
 
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				  unsigned long j)
+				  unsigned long j, bool lazy)
 {
 	return true;
 }
 
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				bool *was_alldone, unsigned long flags)
+				bool *was_alldone, unsigned long flags, bool lazy)
 {
 	return false;
 }
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 05/14] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

This consolidates the code a bit and makes it cleaner. Functionally it
is the same.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_nocb.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index d6e4c076b0515..213daf81c057f 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -327,10 +327,11 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
  *
  * Note that this function always returns true if rhp is NULL.
  */
-static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
 				     unsigned long j, bool lazy)
 {
 	struct rcu_cblist rcl;
+	struct rcu_head *rhp = rhp_in;
 
 	WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
 	rcu_lockdep_assert_cblist_protected(rdp);
@@ -345,16 +346,16 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 
 	/*
 	 * If the new CB requested was a lazy one, queue it onto the main
-	 * ->cblist so we can take advantage of a sooner grade period.
+	 * ->cblist so that we can take advantage of the grace-period that will
+	 * happen regardless. But queue it onto the bypass list first so that
+	 * the lazy CB is ordered with the existing CBs in the bypass list.
 	 */
 	if (lazy && rhp) {
-		rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL);
-		rcu_cblist_enqueue(&rcl, rhp);
-		WRITE_ONCE(rdp->lazy_len, 0);
-	} else {
-		rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
-		WRITE_ONCE(rdp->lazy_len, 0);
+		rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
+		rhp = NULL;
 	}
+	rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
+	WRITE_ONCE(rdp->lazy_len, 0);
 
 	rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
 	WRITE_ONCE(rdp->nocb_bypass_first, j);
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 06/14] rcu: Shrinker for lazy rcu
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Vineeth Pillai,
	Joel Fernandes, Paul E . McKenney

From: Vineeth Pillai <vineeth@bitbyteword.org>

The shrinker is used to speed up the freeing of memory potentially held
by RCU lazy callbacks.  RCU kernel module test cases show this to be
effective.  The test is introduced in a later patch.
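
Usage note (not part of this patch): registered shrinkers, including
this "rcu-lazy" one, can be exercised by asking the kernel to reclaim
slab objects, for example (assuming CONFIG_RCU_LAZY=y and this patch
applied):

	echo 2 > /proc/sys/vm/drop_caches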

Signed-off-by: Vineeth Pillai <vineeth@bitbyteword.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_nocb.h | 52 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 213daf81c057f..9e1c8caec5ceb 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1312,6 +1312,55 @@ int rcu_nocb_cpu_offload(int cpu)
 }
 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
 
+static unsigned long
+lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu;
+	unsigned long count = 0;
+
+	/* Snapshot count of all CPUs */
+	for_each_possible_cpu(cpu) {
+		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+
+		count +=  READ_ONCE(rdp->lazy_len);
+	}
+
+	return count ? count : SHRINK_EMPTY;
+}
+
+static unsigned long
+lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu;
+	unsigned long flags;
+	unsigned long count = 0;
+
+	/* Snapshot count of all CPUs */
+	for_each_possible_cpu(cpu) {
+		struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+		int _count = READ_ONCE(rdp->lazy_len);
+
+		if (_count == 0)
+			continue;
+		rcu_nocb_lock_irqsave(rdp, flags);
+		WRITE_ONCE(rdp->lazy_len, 0);
+		rcu_nocb_unlock_irqrestore(rdp, flags);
+		wake_nocb_gp(rdp, false);
+		sc->nr_to_scan -= _count;
+		count += _count;
+		if (sc->nr_to_scan <= 0)
+			break;
+	}
+	return count ? count : SHRINK_STOP;
+}
+
+static struct shrinker lazy_rcu_shrinker = {
+	.count_objects = lazy_rcu_shrink_count,
+	.scan_objects = lazy_rcu_shrink_scan,
+	.batch = 0,
+	.seeks = DEFAULT_SEEKS,
+};
+
 void __init rcu_init_nohz(void)
 {
 	int cpu;
@@ -1342,6 +1391,9 @@ void __init rcu_init_nohz(void)
 	if (!rcu_state.nocb_is_setup)
 		return;
 
+	if (register_shrinker(&lazy_rcu_shrinker, "rcu-lazy"))
+		pr_err("Failed to register lazy_rcu shrinker!\n");
+
 	if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
 		pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
 		cpumask_and(rcu_nocb_mask, cpu_possible_mask,
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 07/14] rcuscale: Add laziness and kfree tests
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

This commit adds two tests to rcuscale.  The first is a startup test
that checks that callbacks are neither too lazy nor too eagerly
invoked.  The second causes the kfree_rcu() test itself to use
call_rcu() and checks memory pressure.  Testing indicates that the new
call_rcu() keeps memory pressure under control roughly as well as does
kfree_rcu().
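
Usage note (not part of this patch): assuming rcuscale is built as a
module, one possible invocation of the new test is:

	modprobe rcuscale kfree_rcu_test=1 kfree_by_call_rcu=1

Here kfree_rcu_test is the pre-existing switch for the kfree path and
kfree_by_call_rcu is the module parameter added below.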

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/rcuscale.c | 68 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 3ef02d4a81085..bbdcac1804ec8 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -95,6 +95,7 @@ torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
 torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
 torture_param(int, kfree_rcu_test, 0, "Do we run a kfree_rcu() scale test?");
 torture_param(int, kfree_mult, 1, "Multiple of kfree_obj size to allocate.");
+torture_param(int, kfree_by_call_rcu, 0, "Use call_rcu() to emulate kfree_rcu()?");
 
 static char *scale_type = "rcu";
 module_param(scale_type, charp, 0444);
@@ -659,6 +660,14 @@ struct kfree_obj {
 	struct rcu_head rh;
 };
 
+/* Used if doing RCU-kfree'ing via call_rcu(). */
+static void kfree_call_rcu(struct rcu_head *rh)
+{
+	struct kfree_obj *obj = container_of(rh, struct kfree_obj, rh);
+
+	kfree(obj);
+}
+
 static int
 kfree_scale_thread(void *arg)
 {
@@ -696,6 +705,11 @@ kfree_scale_thread(void *arg)
 			if (!alloc_ptr)
 				return -ENOMEM;
 
+			if (kfree_by_call_rcu) {
+				call_rcu(&(alloc_ptr->rh), kfree_call_rcu);
+				continue;
+			}
+
 			// By default kfree_rcu_test_single and kfree_rcu_test_double are
 			// initialized to false. If both have the same value (false or true)
 			// both are randomly tested, otherwise only the one with value true
@@ -767,11 +781,59 @@ kfree_scale_shutdown(void *arg)
 	return -EINVAL;
 }
 
+// Used if doing RCU-kfree'ing via call_rcu().
+static unsigned long jiffies_at_lazy_cb;
+static struct rcu_head lazy_test1_rh;
+static int rcu_lazy_test1_cb_called;
+static void call_rcu_lazy_test1(struct rcu_head *rh)
+{
+	jiffies_at_lazy_cb = jiffies;
+	WRITE_ONCE(rcu_lazy_test1_cb_called, 1);
+}
+
 static int __init
 kfree_scale_init(void)
 {
-	long i;
 	int firsterr = 0;
+	long i;
+	unsigned long jif_start;
+	unsigned long orig_jif;
+
+	// Also, do a quick self-test to ensure laziness is as much as
+	// expected.
+	if (kfree_by_call_rcu && !IS_ENABLED(CONFIG_RCU_LAZY)) {
+		pr_alert("CONFIG_RCU_LAZY is disabled, falling back to kfree_rcu() "
+			 "for delayed RCU kfree'ing\n");
+		kfree_by_call_rcu = 0;
+	}
+
+	if (kfree_by_call_rcu) {
+		/* do a test to check the timeout. */
+		orig_jif = rcu_lazy_get_jiffies_till_flush();
+
+		rcu_lazy_set_jiffies_till_flush(2 * HZ);
+		rcu_barrier();
+
+		jif_start = jiffies;
+		jiffies_at_lazy_cb = 0;
+		call_rcu(&lazy_test1_rh, call_rcu_lazy_test1);
+
+		smp_cond_load_relaxed(&rcu_lazy_test1_cb_called, VAL == 1);
+
+		rcu_lazy_set_jiffies_till_flush(orig_jif);
+
+		if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start < 2 * HZ)) {
+			pr_alert("ERROR: call_rcu() CBs are not being lazy as expected!\n");
+			WARN_ON_ONCE(1);
+			return -1;
+		}
+
+		if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start > 3 * HZ)) {
+			pr_alert("ERROR: call_rcu() CBs are being too lazy!\n");
+			WARN_ON_ONCE(1);
+			return -1;
+		}
+	}
 
 	kfree_nrealthreads = compute_real(kfree_nthreads);
 	/* Start up the kthreads. */
@@ -784,7 +846,9 @@ kfree_scale_init(void)
 		schedule_timeout_uninterruptible(1);
 	}
 
-	pr_alert("kfree object size=%zu\n", kfree_mult * sizeof(struct kfree_obj));
+	pr_alert("kfree object size=%zu, kfree_by_call_rcu=%d\n",
+			kfree_mult * sizeof(struct kfree_obj),
+			kfree_by_call_rcu);
 
 	kfree_reader_tasks = kcalloc(kfree_nrealthreads, sizeof(kfree_reader_tasks[0]),
 			       GFP_KERNEL);
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 0/14] Lazy call_rcu() updates for v6.2
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu; +Cc: linux-kernel, kernel-team, rostedt

Hello!

This series provides energy efficiency for nearly-idle systems by making
call_rcu() more lazy.  Several NOCB changes come along for the ride:

1.	Simplify rcu_init_nohz() cpumask handling, courtesy of Zhen Lei.

2.	Fix late wakeup when flush of bypass cblist happens, courtesy of
	"Joel Fernandes (Google)".

3.	Fix missing nocb gp wake on rcu_barrier(), courtesy of Frederic
	Weisbecker.

4.	Make call_rcu() lazy to save power, courtesy of "Joel Fernandes
	(Google)".

5.	Refactor code a bit in rcu_nocb_do_flush_bypass(), courtesy of
	"Joel Fernandes (Google)".

6.	Shrinker for lazy rcu, courtesy of Vineeth Pillai.

7.	Add laziness and kfree tests, courtesy of "Joel Fernandes
	(Google)".

8.	percpu-refcount: Use call_rcu_flush() for atomic switch, courtesy
	of "Joel Fernandes (Google)".

9.	Use call_rcu_flush() instead of call_rcu, courtesy of "Joel
	Fernandes (Google)".

10.	Use call_rcu_flush() for async reader test, courtesy of "Joel
	Fernandes (Google)".

11.	Use call_rcu_flush() where needed, courtesy of "Joel Fernandes
	(Google)".

12.	scsi/scsi_error: Use call_rcu_flush() instead of call_rcu(),
	courtesy of Uladzislau Rezki.

13.	Make queue_rcu_work() use call_rcu_flush(), courtesy of Uladzislau
	Rezki.

14.	Use call_rcu_flush() instead of call_rcu(), courtesy of "Joel
	Fernandes (Google)".

						Thanx, Paul

------------------------------------------------------------------------

 b/drivers/scsi/scsi_error.c |    2 
 b/include/linux/rcupdate.h  |    9 +
 b/kernel/rcu/Kconfig        |    8 +
 b/kernel/rcu/rcu.h          |    8 +
 b/kernel/rcu/rcuscale.c     |   68 +++++++++++-
 b/kernel/rcu/rcutorture.c   |   16 +-
 b/kernel/rcu/sync.c         |    2 
 b/kernel/rcu/tiny.c         |    2 
 b/kernel/rcu/tree.c         |   11 +
 b/kernel/rcu/tree.h         |    1 
 b/kernel/rcu/tree_exp.h     |    2 
 b/kernel/rcu/tree_nocb.h    |   34 +-----
 b/kernel/workqueue.c        |    2 
 b/lib/percpu-refcount.c     |    3 
 b/net/rxrpc/conn_object.c   |    2 
 kernel/rcu/rcuscale.c       |    2 
 kernel/rcu/tree.c           |  129 +++++++++++++++--------
 kernel/rcu/tree.h           |   11 +
 kernel/rcu/tree_nocb.h      |  243 ++++++++++++++++++++++++++++++++++++--------
 19 files changed, 424 insertions(+), 131 deletions(-)


* [PATCH rcu 08/14] percpu-refcount: Use call_rcu_flush() for atomic switch
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

call_rcu() changes to save power will slow down the percpu refcounter's
"per-CPU to atomic switch" path.  The primitive uses RCU when switching
to atomic mode, and the enqueued async callback wakes up waiters waiting
in percpu_ref_switch_waitq.  Due to this, per-CPU refcount users such as
blk_pre_runtime_suspend() will slow down.

Use the call_rcu_flush() API instead which reverts to the old behavior.
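
For context, a minimal sketch of the waiting side (with a hypothetical
ref; percpu_ref_switch_to_atomic_sync() is the existing synchronous
wrapper around this path, and is what blk_pre_runtime_suspend() ends up
calling):

	/*
	 * Queues the mode switch via call_rcu() and then blocks in
	 * percpu_ref_switch_waitq until the callback has run, so a lazy
	 * call_rcu() could add seconds of delay here.
	 */
	percpu_ref_switch_to_atomic_sync(&ref);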

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 lib/percpu-refcount.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index e5c5315da2741..65c58a029297d 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -230,7 +230,8 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
 		percpu_ref_noop_confirm_switch;
 
 	percpu_ref_get(ref);	/* put after confirmation */
-	call_rcu(&ref->data->rcu, percpu_ref_switch_to_atomic_rcu);
+	call_rcu_flush(&ref->data->rcu,
+		       percpu_ref_switch_to_atomic_rcu);
 }
 
 static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 09/14] rcu/sync: Use call_rcu_flush() instead of call_rcu
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

call_rcu() changes to save power will slow down the rcu_sync machinery.
Use the call_rcu_flush() API instead, which reverts to the old behavior.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/sync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 5cefc702158fe..bdce3b5d7f714 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -44,7 +44,7 @@ static void rcu_sync_func(struct rcu_head *rhp);
 
 static void rcu_sync_call(struct rcu_sync *rsp)
 {
-	call_rcu(&rsp->cb_head, rcu_sync_func);
+	call_rcu_flush(&rsp->cb_head, rcu_sync_func);
 }
 
 /**
-- 
2.31.1.189.g2e36527f23



* [PATCH rcu 10/14] rcu/rcuscale: Use call_rcu_flush() for async reader test
From: Paul E. McKenney @ 2022-10-19 22:51 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

rcuscale uses call_rcu() to queue async readers. With recent changes to
save power, the test will have fewer async readers in flight. Use the
call_rcu_flush() API instead to revert to the old behavior.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/rcuscale.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index bbdcac1804ec8..0385e9b123998 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -176,7 +176,7 @@ static struct rcu_scale_ops rcu_ops = {
 	.get_gp_seq	= rcu_get_gp_seq,
 	.gp_diff	= rcu_seq_diff,
 	.exp_completed	= rcu_exp_batches_completed,
-	.async		= call_rcu,
+	.async		= call_rcu_flush,
 	.gp_barrier	= rcu_barrier,
 	.sync		= synchronize_rcu,
 	.exp_sync	= synchronize_rcu_expedited,
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH rcu 11/14] rcu/rcutorture: Use call_rcu_flush() where needed
  2022-10-19 22:51 [PATCH rcu 0/14] Lazy call_rcu() updates for v6.2 Paul E. McKenney
                   ` (9 preceding siblings ...)
  2022-10-19 22:51 ` [PATCH rcu 10/14] rcu/rcuscale: Use call_rcu_flush() for async reader test Paul E. McKenney
@ 2022-10-19 22:51 ` Paul E. McKenney
  2022-10-19 22:51 ` [PATCH rcu 12/14] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu() Paul E. McKenney
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-19 22:51 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

The call_rcu() changes to save power will change the behavior of the
rcutorture tests. Use the call_rcu_flush() API instead, which reverts to
the old behavior.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/rcutorture.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 503c2aa845a4a..c8ddb4b635b77 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -510,7 +510,7 @@ static unsigned long rcu_no_completed(void)
 
 static void rcu_torture_deferred_free(struct rcu_torture *p)
 {
-	call_rcu(&p->rtort_rcu, rcu_torture_cb);
+	call_rcu_flush(&p->rtort_rcu, rcu_torture_cb);
 }
 
 static void rcu_sync_torture_init(void)
@@ -551,7 +551,7 @@ static struct rcu_torture_ops rcu_ops = {
 	.start_gp_poll_exp_full	= start_poll_synchronize_rcu_expedited_full,
 	.poll_gp_state_exp	= poll_state_synchronize_rcu,
 	.cond_sync_exp		= cond_synchronize_rcu_expedited,
-	.call			= call_rcu,
+	.call			= call_rcu_flush,
 	.cb_barrier		= rcu_barrier,
 	.fqs			= rcu_force_quiescent_state,
 	.stats			= NULL,
@@ -848,7 +848,7 @@ static void rcu_tasks_torture_deferred_free(struct rcu_torture *p)
 
 static void synchronize_rcu_mult_test(void)
 {
-	synchronize_rcu_mult(call_rcu_tasks, call_rcu);
+	synchronize_rcu_mult(call_rcu_tasks, call_rcu_flush);
 }
 
 static struct rcu_torture_ops tasks_ops = {
@@ -3388,13 +3388,13 @@ static void rcu_test_debug_objects(void)
 	/* Try to queue the rh2 pair of callbacks for the same grace period. */
 	preempt_disable(); /* Prevent preemption from interrupting test. */
 	rcu_read_lock(); /* Make it impossible to finish a grace period. */
-	call_rcu(&rh1, rcu_torture_leak_cb); /* Start grace period. */
+	call_rcu_flush(&rh1, rcu_torture_leak_cb); /* Start grace period. */
 	local_irq_disable(); /* Make it harder to start a new grace period. */
-	call_rcu(&rh2, rcu_torture_leak_cb);
-	call_rcu(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
+	call_rcu_flush(&rh2, rcu_torture_leak_cb);
+	call_rcu_flush(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
 	if (rhp) {
-		call_rcu(rhp, rcu_torture_leak_cb);
-		call_rcu(rhp, rcu_torture_err_cb); /* Another duplicate callback. */
+		call_rcu_flush(rhp, rcu_torture_leak_cb);
+		call_rcu_flush(rhp, rcu_torture_err_cb); /* Another duplicate callback. */
 	}
 	local_irq_enable();
 	rcu_read_unlock();
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH rcu 12/14] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()
  2022-10-19 22:51 [PATCH rcu 0/14] Lazy call_rcu() updates for v6.2 Paul E. McKenney
                   ` (10 preceding siblings ...)
  2022-10-19 22:51 ` [PATCH rcu 11/14] rcu/rcutorture: Use call_rcu_flush() where needed Paul E. McKenney
@ 2022-10-19 22:51 ` Paul E. McKenney
  2022-10-19 22:51 ` [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush() Paul E. McKenney
  2022-10-19 22:51 ` [PATCH rcu 14/14] rxrpc: Use call_rcu_flush() instead of call_rcu() Paul E. McKenney
  13 siblings, 0 replies; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-19 22:51 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Uladzislau Rezki,
	Joel Fernandes, Paul E . McKenney

From: Uladzislau Rezki <urezki@gmail.com>

Slow boot times are seen on KVM guests running typical Linux
distributions due to the SCSI layer calling call_rcu(). Recent changes
to save power may be causing this slowness. Using call_rcu_flush() fixes
the issue and brings the boot time back to what it originally was.
Convert it.

Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 drivers/scsi/scsi_error.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 6995c89792300..634672e67c81f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 	 * Ensure that all tasks observe the host state change before the
 	 * host_failed change.
 	 */
-	call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
+	call_rcu_flush(&scmd->rcu, scsi_eh_inc_host_failed);
 }
 
 /**
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-19 22:51 [PATCH rcu 0/14] Lazy call_rcu() updates for v6.2 Paul E. McKenney
                   ` (11 preceding siblings ...)
  2022-10-19 22:51 ` [PATCH rcu 12/14] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu() Paul E. McKenney
@ 2022-10-19 22:51 ` Paul E. McKenney
  2022-10-24  0:36   ` Joel Fernandes
  2022-10-19 22:51 ` [PATCH rcu 14/14] rxrpc: Use call_rcu_flush() instead of call_rcu() Paul E. McKenney
  13 siblings, 1 reply; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-19 22:51 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Uladzislau Rezki,
	Joel Fernandes, Paul E . McKenney

From: Uladzislau Rezki <urezki@gmail.com>

The call_rcu() changes to save power will slow down RCU workqueue items
queued via queue_rcu_work(). This may not be an issue; however, we
cannot assume that workqueue users are OK with long delays. Use the
call_rcu_flush() API instead, which reverts to the old behavior.

Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7cd5f5e7e0a1b..b4b0e828b529e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
 
 	if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
 		rwork->wq = wq;
-		call_rcu(&rwork->rcu, rcu_work_rcufn);
+		call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
 		return true;
 	}
 
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH rcu 14/14] rxrpc: Use call_rcu_flush() instead of call_rcu()
  2022-10-19 22:51 [PATCH rcu 0/14] Lazy call_rcu() updates for v6.2 Paul E. McKenney
                   ` (12 preceding siblings ...)
  2022-10-19 22:51 ` [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush() Paul E. McKenney
@ 2022-10-19 22:51 ` Paul E. McKenney
  13 siblings, 0 replies; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-19 22:51 UTC (permalink / raw)
  To: rcu
  Cc: linux-kernel, kernel-team, rostedt, Joel Fernandes (Google),
	Paul E . McKenney

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>

The call_rcu() changes to save power may cause slowness. Use the
call_rcu_flush() API instead, which reverts to the old behavior.

We found via inspection that the RCU callback does a wakeup of a thread,
which usually indicates that something is waiting on it. To be safe, let
us use call_rcu_flush() here instead.
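
(A rough sketch of the pattern being guarded against -- "struct foo" and
the function names below are invented for illustration, not actual rxrpc
code:)

<snip>
struct foo {
	struct rcu_head rh;
	struct completion done;		/* Someone waits on this. */
};

static void foo_reclaim_cb(struct rcu_head *rhp)
{
	struct foo *p = container_of(rhp, struct foo, rh);

	complete(&p->done);		/* The wakeup hidden in the callback. */
}

static void foo_kill(struct foo *p)
{
	/* A lazy grace period would add seconds of latency to the
	 * waiter, hence the flush variant. */
	call_rcu_flush(&p->rh, foo_reclaim_cb);
}
<snip>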

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 net/rxrpc/conn_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c
index 22089e37e97f0..fdcfb509cc443 100644
--- a/net/rxrpc/conn_object.c
+++ b/net/rxrpc/conn_object.c
@@ -253,7 +253,7 @@ void rxrpc_kill_connection(struct rxrpc_connection *conn)
 	 * must carry a ref on the connection to prevent us getting here whilst
 	 * it is queued or running.
 	 */
-	call_rcu(&conn->rcu, rxrpc_destroy_connection);
+	call_rcu_flush(&conn->rcu, rxrpc_destroy_connection);
 }
 
 /*
-- 
2.31.1.189.g2e36527f23


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-19 22:51 ` [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush() Paul E. McKenney
@ 2022-10-24  0:36   ` Joel Fernandes
  2022-10-24  3:15     ` Paul E. McKenney
  0 siblings, 1 reply; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24  0:36 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: rcu, linux-kernel, kernel-team, rostedt, Uladzislau Rezki

Hello,

On Wed, Oct 19, 2022 at 6:51 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> From: Uladzislau Rezki <urezki@gmail.com>
>
> call_rcu() changes to save power will slow down RCU workqueue items
> queued via queue_rcu_work(). This may not be an issue, however we cannot
> assume that workqueue users are OK with long delays. Use
> call_rcu_flush() API instead which reverts to the old behavior.

On ChromeOS, I can see that queue_rcu_work() is pretty noisy and the
batching is much better if we can just keep it as call_rcu() instead
of call_rcu_flush().

Is there really any reason to keep it as call_rcu_flush()?  If I
recall, the real reason Vlad's system was slowing down was because of
SCSI, and the queue_rcu_work() conversion was really a red herring.

Vlad, any thoughts?

thanks,

 - Joel

.
>
> Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>  kernel/workqueue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 7cd5f5e7e0a1b..b4b0e828b529e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
>
>         if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
>                 rwork->wq = wq;
> -               call_rcu(&rwork->rcu, rcu_work_rcufn);
> +               call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
>                 return true;
>         }
>
> --
> 2.31.1.189.g2e36527f23
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24  0:36   ` Joel Fernandes
@ 2022-10-24  3:15     ` Paul E. McKenney
  2022-10-24 10:49       ` Uladzislau Rezki
  0 siblings, 1 reply; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-24  3:15 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: rcu, linux-kernel, kernel-team, rostedt, Uladzislau Rezki

On Sun, Oct 23, 2022 at 08:36:00PM -0400, Joel Fernandes wrote:
> Hello,
> 
> On Wed, Oct 19, 2022 at 6:51 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > From: Uladzislau Rezki <urezki@gmail.com>
> >
> > call_rcu() changes to save power will slow down RCU workqueue items
> > queued via queue_rcu_work(). This may not be an issue, however we cannot
> > assume that workqueue users are OK with long delays. Use
> > call_rcu_flush() API instead which reverts to the old behavior.
> 
> On ChromeOS, I can see that queue_rcu_work() is pretty noisy and the
> batching is much better if we can just keep it as call_rcu() instead
> of call_rcu_flush().
> 
> Is there really any reason to keep it as call_rcu_flush() ?  If I
> recall, the real reason Vlad's system was slowing down was because of
> scsi and the queue_rcu_work() conversion was really a red herring.

There are less than 20 invocations of queue_rcu_work(), so it should
be possible to look through each.  The low-risk approach is of course to
have queue_rcu_work() use call_rcu_flush().

The next approach might be to have a Kconfig option and/or kernel
boot parameter that allowed a per-system choice.
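
(The boot-parameter approach might look roughly like the sketch below;
the "rcu_wq_flush" name and its default are hypothetical, not an actual
proposal:)

<snip>
/* Hypothetical knob: default to the low-risk flush behavior. */
static bool rcu_wq_flush = true;
module_param(rcu_wq_flush, bool, 0444);

bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
{
	struct work_struct *work = &rwork->work;

	if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
		rwork->wq = wq;
		/* Let each system choose between laziness and promptness. */
		if (rcu_wq_flush)
			call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
		else
			call_rcu(&rwork->rcu, rcu_work_rcufn);
		return true;
	}

	return false;
}
<snip>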

But it would not hurt to double-check on Android.

							Thanx, Paul

> Vlad, any thoughts?
> 
> thanks,
> 
>  - Joel
> 
> .
> >
> > Signed-off-by: Uladzislau Rezki <urezki@gmail.com>
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> >  kernel/workqueue.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 7cd5f5e7e0a1b..b4b0e828b529e 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
> >
> >         if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
> >                 rwork->wq = wq;
> > -               call_rcu(&rwork->rcu, rcu_work_rcufn);
> > +               call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
> >                 return true;
> >         }
> >
> > --
> > 2.31.1.189.g2e36527f23
> >

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24  3:15     ` Paul E. McKenney
@ 2022-10-24 10:49       ` Uladzislau Rezki
  2022-10-24 12:23         ` Uladzislau Rezki
  0 siblings, 1 reply; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-24 10:49 UTC (permalink / raw)
  To: Paul E. McKenney, Joel Fernandes
  Cc: Joel Fernandes, rcu, linux-kernel, kernel-team, rostedt,
	Uladzislau Rezki

> On Sun, Oct 23, 2022 at 08:36:00PM -0400, Joel Fernandes wrote:
> > Hello,
> > 
> > On Wed, Oct 19, 2022 at 6:51 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > From: Uladzislau Rezki <urezki@gmail.com>
> > >
> > > call_rcu() changes to save power will slow down RCU workqueue items
> > > queued via queue_rcu_work(). This may not be an issue, however we cannot
> > > assume that workqueue users are OK with long delays. Use
> > > call_rcu_flush() API instead which reverts to the old behavior.
> > 
> > On ChromeOS, I can see that queue_rcu_work() is pretty noisy and the
> > batching is much better if we can just keep it as call_rcu() instead
> > of call_rcu_flush().
> > 
> > Is there really any reason to keep it as call_rcu_flush() ?  If I
> > recall, the real reason Vlad's system was slowing down was because of
> > scsi and the queue_rcu_work() conversion was really a red herring.
> 
<snip>
*** drivers/acpi/osl.c:
acpi_os_drop_map_ref[401]      queue_rcu_work(system_wq, &map->track.rwork);

*** drivers/gpu/drm/i915/gt/intel_execlists_submission.c:
virtual_context_destroy[3653]  queue_rcu_work(system_wq, &ve->rcu);

*** fs/aio.c:
free_ioctx_reqs[632]           queue_rcu_work(system_wq, &ctx->free_rwork);

*** fs/fs-writeback.c:
inode_switch_wbs[604]          queue_rcu_work(isw_wq, &isw->work);
cleanup_offline_cgwb[676]      queue_rcu_work(isw_wq, &isw->work);

*** include/linux/workqueue.h:
__printf[446]                  extern bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork);

*** kernel/cgroup/cgroup.c:
css_release_work_fn[5253]      queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
css_create[5384]               queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);

*** kernel/rcu/tree.c:
kfree_rcu_monitor[3192]        queue_rcu_work(system_wq, &krwp->rcu_work);

*** net/core/skmsg.c:
sk_psock_drop[852]             queue_rcu_work(system_wq, &psock->rwork);

*** net/sched/act_ct.c:
tcf_ct_flow_table_put[355]     queue_rcu_work(act_ct_wq, &ct_ft->rwork);

*** net/sched/cls_api.c:
tcf_queue_work[225]            return queue_rcu_work(tc_filter_wq, rwork);
<snip>
There are 9 users of the queue_rcu_work() function. I think there can be
a side effect if we keep it as the lazy variant. Please note that I have
not checked all of those users.

> There are less than 20 invocations of queue_rcu_work(), so it should
> be possible to look through each.  The low-risk approach is of course to
> have queue_rcu_work() use call_rcu_flush().
> 
> The next approach might be to have a Kconfig option and/or kernel
> boot parameter that allowed a per-system choice.
> 
> But it would not hurt to double-check on Android.
> 
I did not see such noise, but I will come back with some data on a 5.10
kernel today.

> 
> > Vlad, any thoughts?
> > 
At least for kvfree_rcu() I would like to keep the sync variant, because
we have the below patch that improves batching:

<snip>
commit 51824b780b719c53113dc39e027fbf670dc66028
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date:   Thu Jun 30 18:33:35 2022 +0200

    rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval

    Currently the monitor work is scheduled with a fixed interval of HZ/20,
    which is roughly 50 milliseconds. The drawback of this approach is
    low utilization of the 512 page slots in scenarios with infrequent
    kvfree_rcu() calls.  For example on an Android system:
<snip>

Apparently I see it in the "dev" branch only.

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 10:49       ` Uladzislau Rezki
@ 2022-10-24 12:23         ` Uladzislau Rezki
  2022-10-24 14:31           ` Joel Fernandes
  0 siblings, 1 reply; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-24 12:23 UTC (permalink / raw)
  To: Paul E. McKenney, Joel Fernandes
  Cc: Paul E. McKenney, Joel Fernandes, rcu, linux-kernel, kernel-team,
	rostedt

> > On Sun, Oct 23, 2022 at 08:36:00PM -0400, Joel Fernandes wrote:
> > > Hello,
> > > 
> > > On Wed, Oct 19, 2022 at 6:51 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > From: Uladzislau Rezki <urezki@gmail.com>
> > > >
> > > > call_rcu() changes to save power will slow down RCU workqueue items
> > > > queued via queue_rcu_work(). This may not be an issue, however we cannot
> > > > assume that workqueue users are OK with long delays. Use
> > > > call_rcu_flush() API instead which reverts to the old behavior.
> > > 
> > > On ChromeOS, I can see that queue_rcu_work() is pretty noisy and the
> > > batching is much better if we can just keep it as call_rcu() instead
> > > of call_rcu_flush().
> > > 
> > > Is there really any reason to keep it as call_rcu_flush() ?  If I
> > > recall, the real reason Vlad's system was slowing down was because of
> > > scsi and the queue_rcu_work() conversion was really a red herring.
> > 
> <snip>
> *** drivers/acpi/osl.c:
> acpi_os_drop_map_ref[401]      queue_rcu_work(system_wq, &map->track.rwork);
> 
> *** drivers/gpu/drm/i915/gt/intel_execlists_submission.c:
> virtual_context_destroy[3653]  queue_rcu_work(system_wq, &ve->rcu);
> 
> *** fs/aio.c:
> free_ioctx_reqs[632]           queue_rcu_work(system_wq, &ctx->free_rwork);
> 
> *** fs/fs-writeback.c:
> inode_switch_wbs[604]          queue_rcu_work(isw_wq, &isw->work);
> cleanup_offline_cgwb[676]      queue_rcu_work(isw_wq, &isw->work);
> 
> *** include/linux/workqueue.h:
> __printf[446]                  extern bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork);
> 
> *** kernel/cgroup/cgroup.c:
> css_release_work_fn[5253]      queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
> css_create[5384]               queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
> 
> *** kernel/rcu/tree.c:
> kfree_rcu_monitor[3192]        queue_rcu_work(system_wq, &krwp->rcu_work);
> 
> *** net/core/skmsg.c:
> sk_psock_drop[852]             queue_rcu_work(system_wq, &psock->rwork);
> 
> *** net/sched/act_ct.c:
> tcf_ct_flow_table_put[355]     queue_rcu_work(act_ct_wq, &ct_ft->rwork);
> 
> *** net/sched/cls_api.c:
> tcf_queue_work[225]            return queue_rcu_work(tc_filter_wq, rwork);
> <snip>
> There are 9 users of the queue_rcu_work() function. I think there can be
> a side effect if we keep it as the lazy variant. Please note that I have
> not checked all of those users.
> 
> > There are less than 20 invocations of queue_rcu_work(), so it should
> > be possible to look through each.  The low-risk approach is of course to
> > have queue_rcu_work() use call_rcu_flush().
> > 
> > The next approach might be to have a Kconfig option and/or kernel
> > boot parameter that allowed a per-system choice.
> > 
> > But it would not hurt to double-check on Android.
> > 
> I did not see such noise, but I will come back with some data on a 5.10
> kernel today.
> 
Home screen swipe:
<snip>
       <...>-15      [003] d..1   202.142205: rcu_batch_start: rcu_preempt CBs=105 bl=10
       <...>-55      [001] d..1   202.166174: rcu_batch_start: rcu_preempt CBs=135 bl=10
       <...>-26      [001] d..1   202.402182: rcu_batch_start: rcu_preempt CBs=221 bl=10
     rcuop/3-40      [003] d..1   202.650323: rcu_batch_start: rcu_preempt CBs=213 bl=10
     rcuop/3-40      [000] d..1   203.210537: rcu_batch_start: rcu_preempt CBs=90 bl=10
     rcuop/5-55      [001] d..1   204.675671: rcu_batch_start: rcu_preempt CBs=14 bl=10
     rcuop/2-33      [002] d..1   205.162229: rcu_batch_start: rcu_preempt CBs=649 bl=10
     rcuop/3-40      [000] d..1   205.418214: rcu_batch_start: rcu_preempt CBs=291 bl=10
     rcuop/3-40      [003] d..1   206.134204: rcu_batch_start: rcu_preempt CBs=174 bl=10
     rcuop/0-15      [003] d..1   206.726311: rcu_batch_start: rcu_preempt CBs=738 bl=10
     rcuop/1-26      [001] d..1   206.814168: rcu_batch_start: rcu_preempt CBs=865 bl=10
     rcuop/3-40      [003] d..1   207.278178: rcu_batch_start: rcu_preempt CBs=287 bl=10
     rcuop/1-26      [001] d..1   208.826279: rcu_batch_start: rcu_preempt CBs=506 bl=10
<snip>

An app launch:
<snip>
         rcuop/3-40      [002] d..1   322.118620: rcu_batch_start: rcu_preempt CBs=99 bl=10
         rcuop/4-48      [005] dn.1   322.454052: rcu_batch_start: rcu_preempt CBs=270 bl=10
         rcuop/5-55      [005] d..1   322.454109: rcu_batch_start: rcu_preempt CBs=91 bl=10
         rcuop/5-55      [007] d..1   322.470054: rcu_batch_start: rcu_preempt CBs=106 bl=10
         rcuop/6-62      [005] d..1   322.482120: rcu_batch_start: rcu_preempt CBs=231 bl=10
         rcuop/4-48      [001] d..1   322.494150: rcu_batch_start: rcu_preempt CBs=227 bl=10
           <...>-69      [002] d..1   322.502442: rcu_batch_start: rcu_preempt CBs=3350 bl=26
         rcuop/1-26      [001] d..1   322.646099: rcu_batch_start: rcu_preempt CBs=1685 bl=13
         rcuop/2-33      [001] d..1   322.670071: rcu_batch_start: rcu_preempt CBs=438 bl=10
         rcuop/1-26      [001] d..1   322.674120: rcu_batch_start: rcu_preempt CBs=18 bl=10
         rcuop/2-33      [003] d..1   322.690152: rcu_batch_start: rcu_preempt CBs=10 bl=10
         rcuop/1-26      [002] d..1   322.698104: rcu_batch_start: rcu_preempt CBs=10 bl=10
         rcuop/3-40      [002] d..1   322.706167: rcu_batch_start: rcu_preempt CBs=313 bl=10
         rcuop/2-33      [003] d..1   322.710075: rcu_batch_start: rcu_preempt CBs=15 bl=10
         rcuop/3-40      [002] d..1   322.742137: rcu_batch_start: rcu_preempt CBs=13 bl=10
         rcuop/5-55      [000] d..1   322.754270: rcu_batch_start: rcu_preempt CBs=157 bl=10
         rcuop/3-40      [000] d..1   322.762182: rcu_batch_start: rcu_preempt CBs=17 bl=10
         rcuop/2-33      [003] d..1   322.774088: rcu_batch_start: rcu_preempt CBs=38 bl=10
         rcuop/3-40      [000] d..1   322.778131: rcu_batch_start: rcu_preempt CBs=23 bl=10
         rcuop/1-26      [002] d..1   322.790105: rcu_batch_start: rcu_preempt CBs=33 bl=10
         rcuop/4-48      [001] d..1   322.798074: rcu_batch_start: rcu_preempt CBs=340 bl=10
         rcuop/2-33      [002] d..1   322.806158: rcu_batch_start: rcu_preempt CBs=18 bl=10
         rcuop/1-26      [002] d..1   322.814057: rcu_batch_start: rcu_preempt CBs=18 bl=10
         rcuop/0-15      [001] d..1   322.822476: rcu_batch_start: rcu_preempt CBs=333 bl=10
         rcuop/4-48      [003] d..1   322.830102: rcu_batch_start: rcu_preempt CBs=11 bl=10
         rcuop/2-33      [001] d..1   322.846109: rcu_batch_start: rcu_preempt CBs=80 bl=10
         rcuop/3-40      [001] d..1   322.854162: rcu_batch_start: rcu_preempt CBs=145 bl=10
         rcuop/4-48      [003] d..1   322.874129: rcu_batch_start: rcu_preempt CBs=21 bl=10
         rcuop/3-40      [001] d..1   322.878149: rcu_batch_start: rcu_preempt CBs=43 bl=10
         rcuop/3-40      [001] d..1   322.906273: rcu_batch_start: rcu_preempt CBs=10 bl=10
         rcuop/4-48      [001] d..1   322.918201: rcu_batch_start: rcu_preempt CBs=23 bl=10
         rcuop/2-33      [001] d..1   322.926212: rcu_batch_start: rcu_preempt CBs=86 bl=10
         rcuop/2-33      [001] d..1   322.946251: rcu_batch_start: rcu_preempt CBs=12 bl=10
         rcuop/5-55      [003] d..1   322.954482: rcu_batch_start: rcu_preempt CBs=70 bl=10
         rcuop/2-33      [003] d..1   322.978146: rcu_batch_start: rcu_preempt CBs=20 bl=10
         rcuop/1-26      [002] d..1   323.014290: rcu_batch_start: rcu_preempt CBs=230 bl=10
         rcuop/4-48      [001] d..1   323.026119: rcu_batch_start: rcu_preempt CBs=73 bl=10
         rcuop/5-55      [003] d..1   323.026175: rcu_batch_start: rcu_preempt CBs=94 bl=10
         rcuop/3-40      [001] d..1   323.035310: rcu_batch_start: rcu_preempt CBs=70 bl=10
         rcuop/0-15      [001] d..1   323.046231: rcu_batch_start: rcu_preempt CBs=165 bl=10
         rcuop/6-62      [005] d..1   323.066132: rcu_batch_start: rcu_preempt CBs=179 bl=10
         rcuop/1-26      [002] d..1   323.174202: rcu_batch_start: rcu_preempt CBs=61 bl=10
         rcuop/2-33      [003] d..1   323.190203: rcu_batch_start: rcu_preempt CBs=80 bl=10
         rcuop/3-40      [003] d..1   323.206210: rcu_batch_start: rcu_preempt CBs=84 bl=10
         rcuop/2-33      [003] d..1   323.226880: rcu_batch_start: rcu_preempt CBs=5 bl=10
<snip>

This is on Android with a 5.10 kernel running. I do not see that
queue_rcu_work() makes any noise.

Joel, could you please post your rcu_batch_start tracepoint output where you see the noise?

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 12:23         ` Uladzislau Rezki
@ 2022-10-24 14:31           ` Joel Fernandes
  2022-10-24 15:39             ` Paul E. McKenney
  0 siblings, 1 reply; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 14:31 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 02:23:39PM +0200, Uladzislau Rezki wrote:
> > > On Sun, Oct 23, 2022 at 08:36:00PM -0400, Joel Fernandes wrote:
> > > > On Wed, Oct 19, 2022 at 6:51 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > >
> > > > > From: Uladzislau Rezki <urezki@gmail.com>
> > > > >
> > > > > call_rcu() changes to save power will slow down RCU workqueue items
> > > > > queued via queue_rcu_work(). This may not be an issue, however we cannot
> > > > > assume that workqueue users are OK with long delays. Use
> > > > > call_rcu_flush() API instead which reverts to the old behavior.
> > > > 
> > > > On ChromeOS, I can see that queue_rcu_work() is pretty noisy and the
> > > > batching is much better if we can just keep it as call_rcu() instead
> > > > of call_rcu_flush().
> > > > 
> > > > Is there really any reason to keep it as call_rcu_flush() ?  If I
> > > > recall, the real reason Vlad's system was slowing down was because of
> > > > scsi and the queue_rcu_work() conversion was really a red herring.
> > > 
> > <snip>
> > *** drivers/acpi/osl.c:
> > acpi_os_drop_map_ref[401]      queue_rcu_work(system_wq, &map->track.rwork);
> > 
> > *** drivers/gpu/drm/i915/gt/intel_execlists_submission.c:
> > virtual_context_destroy[3653]  queue_rcu_work(system_wq, &ve->rcu);
> > 
> > *** fs/aio.c:
> > free_ioctx_reqs[632]           queue_rcu_work(system_wq, &ctx->free_rwork);
> > 
> > *** fs/fs-writeback.c:
> > inode_switch_wbs[604]          queue_rcu_work(isw_wq, &isw->work);
> > cleanup_offline_cgwb[676]      queue_rcu_work(isw_wq, &isw->work);
> > 
> > *** include/linux/workqueue.h:
> > __printf[446]                  extern bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork);
> > 
> > *** kernel/cgroup/cgroup.c:
> > css_release_work_fn[5253]      queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
> > css_create[5384]               queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
> > 
> > *** kernel/rcu/tree.c:
> > kfree_rcu_monitor[3192]        queue_rcu_work(system_wq, &krwp->rcu_work);
> > 
> > *** net/core/skmsg.c:
> > sk_psock_drop[852]             queue_rcu_work(system_wq, &psock->rwork);
> > 
> > *** net/sched/act_ct.c:
> > tcf_ct_flow_table_put[355]     queue_rcu_work(act_ct_wq, &ct_ft->rwork);
> > 
> > *** net/sched/cls_api.c:
> > tcf_queue_work[225]            return queue_rcu_work(tc_filter_wq, rwork);
> > <snip>
> > There are 9 users of the queue_rcu_work() function. I think there can be
> > a side effect if we keep it as the lazy variant. Please note that I have
> > not checked all of those users.
> > 
> > > There are less than 20 invocations of queue_rcu_work(), so it should
> > > be possible to look through each.  The low-risk approach is of course to
> > > have queue_rcu_work() use call_rcu_flush().

Yes, once I get to a device (tomorrow), I'll look more. Last I checked it was
kvfree_rcu() -- this was a few weeks/months ago though.

> > > The next approach might be to have a Kconfig option and/or kernel
> > > boot parameter that allowed a per-system choice.
> > > 
> > > But it would not hurt to double-check on Android.
> > > 
> > I did not see such noise, but I will come back with some data on a 5.10
> > kernel today.
> > 
> Home screen swipe:
> <snip>
>        <...>-15      [003] d..1   202.142205: rcu_batch_start: rcu_preempt CBs=105 bl=10
>        <...>-55      [001] d..1   202.166174: rcu_batch_start: rcu_preempt CBs=135 bl=10
>        <...>-26      [001] d..1   202.402182: rcu_batch_start: rcu_preempt CBs=221 bl=10
>      rcuop/3-40      [003] d..1   202.650323: rcu_batch_start: rcu_preempt CBs=213 bl=10
>      rcuop/3-40      [000] d..1   203.210537: rcu_batch_start: rcu_preempt CBs=90 bl=10
>      rcuop/5-55      [001] d..1   204.675671: rcu_batch_start: rcu_preempt CBs=14 bl=10
>      rcuop/2-33      [002] d..1   205.162229: rcu_batch_start: rcu_preempt CBs=649 bl=10
>      rcuop/3-40      [000] d..1   205.418214: rcu_batch_start: rcu_preempt CBs=291 bl=10
>      rcuop/3-40      [003] d..1   206.134204: rcu_batch_start: rcu_preempt CBs=174 bl=10
>      rcuop/0-15      [003] d..1   206.726311: rcu_batch_start: rcu_preempt CBs=738 bl=10
>      rcuop/1-26      [001] d..1   206.814168: rcu_batch_start: rcu_preempt CBs=865 bl=10
>      rcuop/3-40      [003] d..1   207.278178: rcu_batch_start: rcu_preempt CBs=287 bl=10
>      rcuop/1-26      [001] d..1   208.826279: rcu_batch_start: rcu_preempt CBs=506 bl=10
> <snip>

This looks fine to me, but..

> An app launch:
> <snip>
>          rcuop/3-40      [002] d..1   322.118620: rcu_batch_start: rcu_preempt CBs=99 bl=10
>          rcuop/4-48      [005] dn.1   322.454052: rcu_batch_start: rcu_preempt CBs=270 bl=10
>          rcuop/5-55      [005] d..1   322.454109: rcu_batch_start: rcu_preempt CBs=91 bl=10
>          rcuop/5-55      [007] d..1   322.470054: rcu_batch_start: rcu_preempt CBs=106 bl=10
>          rcuop/6-62      [005] d..1   322.482120: rcu_batch_start: rcu_preempt CBs=231 bl=10
>          rcuop/4-48      [001] d..1   322.494150: rcu_batch_start: rcu_preempt CBs=227 bl=10
>            <...>-69      [002] d..1   322.502442: rcu_batch_start: rcu_preempt CBs=3350 bl=26
>          rcuop/1-26      [001] d..1   322.646099: rcu_batch_start: rcu_preempt CBs=1685 bl=13
>          rcuop/2-33      [001] d..1   322.670071: rcu_batch_start: rcu_preempt CBs=438 bl=10
>          rcuop/1-26      [001] d..1   322.674120: rcu_batch_start: rcu_preempt CBs=18 bl=10
>          rcuop/2-33      [003] d..1   322.690152: rcu_batch_start: rcu_preempt CBs=10 bl=10
>          rcuop/1-26      [002] d..1   322.698104: rcu_batch_start: rcu_preempt CBs=10 bl=10
>          rcuop/3-40      [002] d..1   322.706167: rcu_batch_start: rcu_preempt CBs=313 bl=10
>          rcuop/2-33      [003] d..1   322.710075: rcu_batch_start: rcu_preempt CBs=15 bl=10

The above does not look fine to me (^^^) from a Lazy-RCU PoV.  Here, RCU
callbacks are being invoked every 10-20ms.  The batching I seek is of the
order of seconds, when the system is relatively idle.

Why is Lazy-RCU not in effect for app launch? IOW, which callback is causing
Lazy-RCU to not be lazy here?  Could it be queue_rcu_work()?  Whenever a
non-lazy callback is queued, all the lazy ones are 'promoted' to non-lazy.
That's why I am asking. Even if you queue one non-lazy callback at a high
enough frequency, the lazy ones will no longer give you batching or the
benefits of laziness.
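
(A rough sketch of that interaction, with a made-up "my_cb" callback and
made-up objects "a" and "b":)

<snip>
call_rcu(&a->rh, my_cb);	/* Lazy: may batch for seconds. */
call_rcu_flush(&b->rh, my_cb);	/* Non-lazy: also promotes the lazy
				 * callback above, defeating the batching. */
<snip>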

>          rcuop/3-40      [002] d..1   322.742137: rcu_batch_start: rcu_preempt CBs=13 bl=10
>          rcuop/5-55      [000] d..1   322.754270: rcu_batch_start: rcu_preempt CBs=157 bl=10
>          rcuop/3-40      [000] d..1   322.762182: rcu_batch_start: rcu_preempt CBs=17 bl=10
>          rcuop/2-33      [003] d..1   322.774088: rcu_batch_start: rcu_preempt CBs=38 bl=10
>          rcuop/3-40      [000] d..1   322.778131: rcu_batch_start: rcu_preempt CBs=23 bl=10
>          rcuop/1-26      [002] d..1   322.790105: rcu_batch_start: rcu_preempt CBs=33 bl=10
>          rcuop/4-48      [001] d..1   322.798074: rcu_batch_start: rcu_preempt CBs=340 bl=10
>          rcuop/2-33      [002] d..1   322.806158: rcu_batch_start: rcu_preempt CBs=18 bl=10
>          rcuop/1-26      [002] d..1   322.814057: rcu_batch_start: rcu_preempt CBs=18 bl=10
>          rcuop/0-15      [001] d..1   322.822476: rcu_batch_start: rcu_preempt CBs=333 bl=10
>          rcuop/4-48      [003] d..1   322.830102: rcu_batch_start: rcu_preempt CBs=11 bl=10
>          rcuop/2-33      [001] d..1   322.846109: rcu_batch_start: rcu_preempt CBs=80 bl=10
>          rcuop/3-40      [001] d..1   322.854162: rcu_batch_start: rcu_preempt CBs=145 bl=10
>          rcuop/4-48      [003] d..1   322.874129: rcu_batch_start: rcu_preempt CBs=21 bl=10
>          rcuop/3-40      [001] d..1   322.878149: rcu_batch_start: rcu_preempt CBs=43 bl=10
>          rcuop/3-40      [001] d..1   322.906273: rcu_batch_start: rcu_preempt CBs=10 bl=10
>          rcuop/4-48      [001] d..1   322.918201: rcu_batch_start: rcu_preempt CBs=23 bl=10
>          rcuop/2-33      [001] d..1   322.926212: rcu_batch_start: rcu_preempt CBs=86 bl=10
>          rcuop/2-33      [001] d..1   322.946251: rcu_batch_start: rcu_preempt CBs=12 bl=10
>          rcuop/5-55      [003] d..1   322.954482: rcu_batch_start: rcu_preempt CBs=70 bl=10
>          rcuop/2-33      [003] d..1   322.978146: rcu_batch_start: rcu_preempt CBs=20 bl=10
>          rcuop/1-26      [002] d..1   323.014290: rcu_batch_start: rcu_preempt CBs=230 bl=10
>          rcuop/4-48      [001] d..1   323.026119: rcu_batch_start: rcu_preempt CBs=73 bl=10
>          rcuop/5-55      [003] d..1   323.026175: rcu_batch_start: rcu_preempt CBs=94 bl=10
>          rcuop/3-40      [001] d..1   323.035310: rcu_batch_start: rcu_preempt CBs=70 bl=10
>          rcuop/0-15      [001] d..1   323.046231: rcu_batch_start: rcu_preempt CBs=165 bl=10
>          rcuop/6-62      [005] d..1   323.066132: rcu_batch_start: rcu_preempt CBs=179 bl=10
>          rcuop/1-26      [002] d..1   323.174202: rcu_batch_start: rcu_preempt CBs=61 bl=10
>          rcuop/2-33      [003] d..1   323.190203: rcu_batch_start: rcu_preempt CBs=80 bl=10
>          rcuop/3-40      [003] d..1   323.206210: rcu_batch_start: rcu_preempt CBs=84 bl=10
>          rcuop/2-33      [003] d..1   323.226880: rcu_batch_start: rcu_preempt CBs=5 bl=10

And for several seconds you have the same thing going ^^^.

> <snip>
> 
> > This is on Android with a 5.10 kernel running. I do not see that
> > queue_rcu_work() makes any noise.

Your rcu_batch_start tracepoint output above does not really reveal much
information about which callbacks are lazy and which are not.
rcu_invoke_callback is better in the sense that at least you have the name of the
callback and can take a guess.

> Joel, could you please post your rcu_batch_start tracepoint output where you see the noise?

Sure, I'll do that once I get to a device.

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 14:31           ` Joel Fernandes
@ 2022-10-24 15:39             ` Paul E. McKenney
  2022-10-24 16:25               ` Uladzislau Rezki
  0 siblings, 1 reply; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-24 15:39 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 02:31:15PM +0000, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 02:23:39PM +0200, Uladzislau Rezki wrote:
> > > > On Sun, Oct 23, 2022 at 08:36:00PM -0400, Joel Fernandes wrote:
> > > > > On Wed, Oct 19, 2022 at 6:51 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > >
> > > > > > From: Uladzislau Rezki <urezki@gmail.com>
> > > > > >
> > > > > > call_rcu() changes to save power will slow down RCU workqueue items
> > > > > > queued via queue_rcu_work(). This may not be an issue, however we cannot
> > > > > > assume that workqueue users are OK with long delays. Use
> > > > > > call_rcu_flush() API instead which reverts to the old behavior.
> > > > > 
> > > > > On ChromeOS, I can see that queue_rcu_work() is pretty noisy and the
> > > > > batching is much better if we can just keep it as call_rcu() instead
> > > > > of call_rcu_flush().
> > > > > 
> > > > > Is there really any reason to keep it as call_rcu_flush() ?  If I
> > > > > recall, the real reason Vlad's system was slowing down was because of
> > > > > scsi and the queue_rcu_work() conversion was really a red herring.
> > > > 
> > > <snip>
> > > *** drivers/acpi/osl.c:
> > > acpi_os_drop_map_ref[401]      queue_rcu_work(system_wq, &map->track.rwork);
> > > 
> > > *** drivers/gpu/drm/i915/gt/intel_execlists_submission.c:
> > > virtual_context_destroy[3653]  queue_rcu_work(system_wq, &ve->rcu);
> > > 
> > > *** fs/aio.c:
> > > free_ioctx_reqs[632]           queue_rcu_work(system_wq, &ctx->free_rwork);
> > > 
> > > *** fs/fs-writeback.c:
> > > inode_switch_wbs[604]          queue_rcu_work(isw_wq, &isw->work);
> > > cleanup_offline_cgwb[676]      queue_rcu_work(isw_wq, &isw->work);
> > > 
> > > *** include/linux/workqueue.h:
> > > __printf[446]                  extern bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork);
> > > 
> > > *** kernel/cgroup/cgroup.c:
> > > css_release_work_fn[5253]      queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
> > > css_create[5384]               queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
> > > 
> > > *** kernel/rcu/tree.c:
> > > kfree_rcu_monitor[3192]        queue_rcu_work(system_wq, &krwp->rcu_work);
> > > 
> > > *** net/core/skmsg.c:
> > > sk_psock_drop[852]             queue_rcu_work(system_wq, &psock->rwork);
> > > 
> > > *** net/sched/act_ct.c:
> > > tcf_ct_flow_table_put[355]     queue_rcu_work(act_ct_wq, &ct_ft->rwork);
> > > 
> > > *** net/sched/cls_api.c:
> > > tcf_queue_work[225]            return queue_rcu_work(tc_filter_wq, rwork);
> > > <snip>
> > > There are 9 users of the queue_rcu_work() function. I think there can be
> > > a side effect if we keep it as the lazy variant. Please note that I have
> > > not checked all of those users.
> > > 
> > > > There are less than 20 invocations of queue_rcu_work(), so it should
> > > > be possible to look through each.  The low-risk approach is of course to
> > > > have queue_rcu_work() use call_rcu_flush().
> 
> Yes, once I get to a device (tomorrow), I'll look more. Last I checked it was
> kvfree_rcu() -- this was a few weeks/months ago though.
> 
> > > > The next approach might be to have a Kconfig option and/or kernel
> > > > boot parameter that allowed a per-system choice.
> > > > 
> > > > But it would not hurt to double-check on Android.
> > > > 
> > > I did not see such noise, but I will come back with some data on a 5.10
> > > kernel today.
> > > 
> > Home screen swipe:
> > <snip>
> >        <...>-15      [003] d..1   202.142205: rcu_batch_start: rcu_preempt CBs=105 bl=10
> >        <...>-55      [001] d..1   202.166174: rcu_batch_start: rcu_preempt CBs=135 bl=10
> >        <...>-26      [001] d..1   202.402182: rcu_batch_start: rcu_preempt CBs=221 bl=10
> >      rcuop/3-40      [003] d..1   202.650323: rcu_batch_start: rcu_preempt CBs=213 bl=10
> >      rcuop/3-40      [000] d..1   203.210537: rcu_batch_start: rcu_preempt CBs=90 bl=10
> >      rcuop/5-55      [001] d..1   204.675671: rcu_batch_start: rcu_preempt CBs=14 bl=10
> >      rcuop/2-33      [002] d..1   205.162229: rcu_batch_start: rcu_preempt CBs=649 bl=10
> >      rcuop/3-40      [000] d..1   205.418214: rcu_batch_start: rcu_preempt CBs=291 bl=10
> >      rcuop/3-40      [003] d..1   206.134204: rcu_batch_start: rcu_preempt CBs=174 bl=10
> >      rcuop/0-15      [003] d..1   206.726311: rcu_batch_start: rcu_preempt CBs=738 bl=10
> >      rcuop/1-26      [001] d..1   206.814168: rcu_batch_start: rcu_preempt CBs=865 bl=10
> >      rcuop/3-40      [003] d..1   207.278178: rcu_batch_start: rcu_preempt CBs=287 bl=10
> >      rcuop/1-26      [001] d..1   208.826279: rcu_batch_start: rcu_preempt CBs=506 bl=10
> > <snip>
> 
> This looks fine to me, but..
> 
> > An app launch:
> > <snip>
> >          rcuop/3-40      [002] d..1   322.118620: rcu_batch_start: rcu_preempt CBs=99 bl=10
> >          rcuop/4-48      [005] dn.1   322.454052: rcu_batch_start: rcu_preempt CBs=270 bl=10
> >          rcuop/5-55      [005] d..1   322.454109: rcu_batch_start: rcu_preempt CBs=91 bl=10
> >          rcuop/5-55      [007] d..1   322.470054: rcu_batch_start: rcu_preempt CBs=106 bl=10
> >          rcuop/6-62      [005] d..1   322.482120: rcu_batch_start: rcu_preempt CBs=231 bl=10
> >          rcuop/4-48      [001] d..1   322.494150: rcu_batch_start: rcu_preempt CBs=227 bl=10
> >            <...>-69      [002] d..1   322.502442: rcu_batch_start: rcu_preempt CBs=3350 bl=26
> >          rcuop/1-26      [001] d..1   322.646099: rcu_batch_start: rcu_preempt CBs=1685 bl=13
> >          rcuop/2-33      [001] d..1   322.670071: rcu_batch_start: rcu_preempt CBs=438 bl=10
> >          rcuop/1-26      [001] d..1   322.674120: rcu_batch_start: rcu_preempt CBs=18 bl=10
> >          rcuop/2-33      [003] d..1   322.690152: rcu_batch_start: rcu_preempt CBs=10 bl=10
> >          rcuop/1-26      [002] d..1   322.698104: rcu_batch_start: rcu_preempt CBs=10 bl=10
> >          rcuop/3-40      [002] d..1   322.706167: rcu_batch_start: rcu_preempt CBs=313 bl=10
> >          rcuop/2-33      [003] d..1   322.710075: rcu_batch_start: rcu_preempt CBs=15 bl=10
> 
> The above does not look fine to me (^^^) from a Lazy-RCU PoV.  Here, RCU
> callbacks are being invoked every 10-20ms.  The batching I seek is of the
> order of seconds, when the system is relatively idle.
> 
> Why is Lazy-RCU not in effect for app launch? IOW, which callback is causing
> Lazy-RCU to not be lazy here?  Could it be queue_rcu_work()?  Whenever a
> non-lazy callback is queued, all the lazy ones are 'promoted' to non-lazy.
> That's why I am asking. Even if you queue one non-lazy callback at a high
> enough frequency, the lazy ones will no longer give you batching or the
> benefits of laziness.
> 
> >          rcuop/3-40      [002] d..1   322.742137: rcu_batch_start: rcu_preempt CBs=13 bl=10
> >          rcuop/5-55      [000] d..1   322.754270: rcu_batch_start: rcu_preempt CBs=157 bl=10
> >          rcuop/3-40      [000] d..1   322.762182: rcu_batch_start: rcu_preempt CBs=17 bl=10
> >          rcuop/2-33      [003] d..1   322.774088: rcu_batch_start: rcu_preempt CBs=38 bl=10
> >          rcuop/3-40      [000] d..1   322.778131: rcu_batch_start: rcu_preempt CBs=23 bl=10
> >          rcuop/1-26      [002] d..1   322.790105: rcu_batch_start: rcu_preempt CBs=33 bl=10
> >          rcuop/4-48      [001] d..1   322.798074: rcu_batch_start: rcu_preempt CBs=340 bl=10
> >          rcuop/2-33      [002] d..1   322.806158: rcu_batch_start: rcu_preempt CBs=18 bl=10
> >          rcuop/1-26      [002] d..1   322.814057: rcu_batch_start: rcu_preempt CBs=18 bl=10
> >          rcuop/0-15      [001] d..1   322.822476: rcu_batch_start: rcu_preempt CBs=333 bl=10
> >          rcuop/4-48      [003] d..1   322.830102: rcu_batch_start: rcu_preempt CBs=11 bl=10
> >          rcuop/2-33      [001] d..1   322.846109: rcu_batch_start: rcu_preempt CBs=80 bl=10
> >          rcuop/3-40      [001] d..1   322.854162: rcu_batch_start: rcu_preempt CBs=145 bl=10
> >          rcuop/4-48      [003] d..1   322.874129: rcu_batch_start: rcu_preempt CBs=21 bl=10
> >          rcuop/3-40      [001] d..1   322.878149: rcu_batch_start: rcu_preempt CBs=43 bl=10
> >          rcuop/3-40      [001] d..1   322.906273: rcu_batch_start: rcu_preempt CBs=10 bl=10
> >          rcuop/4-48      [001] d..1   322.918201: rcu_batch_start: rcu_preempt CBs=23 bl=10
> >          rcuop/2-33      [001] d..1   322.926212: rcu_batch_start: rcu_preempt CBs=86 bl=10
> >          rcuop/2-33      [001] d..1   322.946251: rcu_batch_start: rcu_preempt CBs=12 bl=10
> >          rcuop/5-55      [003] d..1   322.954482: rcu_batch_start: rcu_preempt CBs=70 bl=10
> >          rcuop/2-33      [003] d..1   322.978146: rcu_batch_start: rcu_preempt CBs=20 bl=10
> >          rcuop/1-26      [002] d..1   323.014290: rcu_batch_start: rcu_preempt CBs=230 bl=10
> >          rcuop/4-48      [001] d..1   323.026119: rcu_batch_start: rcu_preempt CBs=73 bl=10
> >          rcuop/5-55      [003] d..1   323.026175: rcu_batch_start: rcu_preempt CBs=94 bl=10
> >          rcuop/3-40      [001] d..1   323.035310: rcu_batch_start: rcu_preempt CBs=70 bl=10
> >          rcuop/0-15      [001] d..1   323.046231: rcu_batch_start: rcu_preempt CBs=165 bl=10
> >          rcuop/6-62      [005] d..1   323.066132: rcu_batch_start: rcu_preempt CBs=179 bl=10
> >          rcuop/1-26      [002] d..1   323.174202: rcu_batch_start: rcu_preempt CBs=61 bl=10
> >          rcuop/2-33      [003] d..1   323.190203: rcu_batch_start: rcu_preempt CBs=80 bl=10
> >          rcuop/3-40      [003] d..1   323.206210: rcu_batch_start: rcu_preempt CBs=84 bl=10
> >          rcuop/2-33      [003] d..1   323.226880: rcu_batch_start: rcu_preempt CBs=5 bl=10
> 
> And for several seconds you have the same thing going ^^^.
> 
> > <snip>
> > 
> > This is on Android with a 5.10 kernel running. I do not see that
> > queue_rcu_work() makes any noise.
> 
> Your rcu_batch_start tracepoint output above does not really reveal much
> information about which callbacks are lazy and which are not.
> rcu_invoke_callback is better in the sense that at least you have the name of the
> callback and can take a guess.
> 
> > Joel, could you please post your rcu_batch_start tracepoint output where you see the noise?
> 
> Sure, I'll do that once I get to a device.

You guys might need to agree on the definition of "good" here.  Or maybe
understand the differences in your respective platforms' definitions of
"good".  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 15:39             ` Paul E. McKenney
@ 2022-10-24 16:25               ` Uladzislau Rezki
  2022-10-24 16:48                 ` Paul E. McKenney
  2022-10-24 16:54                 ` Joel Fernandes
  0 siblings, 2 replies; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-24 16:25 UTC (permalink / raw)
  To: Paul E. McKenney, Joel Fernandes
  Cc: Joel Fernandes, Uladzislau Rezki, rcu, linux-kernel, kernel-team,
	rostedt

>
> You guys might need to agree on the definition of "good" here.  Or maybe
> understand the differences in your respective platforms' definitions of
> "good".  ;-)
>
Indeed. Bad is when it happens once per millisecond indefinitely :) At least
in such a workload I can detect a power delta and a power gain. Anyway, below
is a new trace where I do not use the "flush" variant for kvfree_rcu():

<snip>
1. Home screen swipe:
         rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
         rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
         rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
         rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
         rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
         rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
         rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
         rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
         rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
2. App launches:
         rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
         rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
         rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
         rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
         rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
         rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
         rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
         rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
         rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
           <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
         rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
           <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
           <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
         rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
           <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
         rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
         rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
         rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
         rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
         rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
         rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
         rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
         rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
         rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
         rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
         rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
<snip>

That is much better. But, as I wrote earlier, there is a patch that I
submitted some time ago improving kvfree_rcu() batching:

<snip>
commit 51824b780b719c53113dc39e027fbf670dc66028
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date:   Thu Jun 30 18:33:35 2022 +0200

    rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval

    Currently the monitor work is scheduled with a fixed interval of HZ/20,
    which is roughly 50 milliseconds. The drawback of this approach is
    low utilization of the 512 page slots in scenarios with infrequent
    kvfree_rcu() calls.  For example on an Android system:
<snip>

The trace that I posted was taken without it.

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 16:25               ` Uladzislau Rezki
@ 2022-10-24 16:48                 ` Paul E. McKenney
  2022-10-24 16:55                   ` Uladzislau Rezki
  2022-10-28 21:23                   ` Joel Fernandes
  2022-10-24 16:54                 ` Joel Fernandes
  1 sibling, 2 replies; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-24 16:48 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: Joel Fernandes, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> >
> > You guys might need to agree on the definition of "good" here.  Or maybe
> > understand the differences in your respective platforms' definitions of
> > "good".  ;-)
> >
> Indeed. Bad is when it happens once per millisecond indefinitely :) At least
> in such a workload I can detect a power delta and a power gain. Anyway, below
> is a new trace where I do not use the "flush" variant for kvfree_rcu():
> 
> <snip>
> 1. Home screen swipe:
>          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
>          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
>          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
>          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
>          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
>          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
>          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
>          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
>          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> 2. App launches:
>          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
>          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
>          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
>          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
>          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
>          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
>          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
>          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
>          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
>            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
>          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
>            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
>            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
>          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
>            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
>          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
>          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
>          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
>          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
>          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
>          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
>          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
>          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
>          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
>          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
>          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> <snip>
> 
> It is much better. But, as I wrote earlier, there is a patch that I submitted
> some time ago improving kvfree_rcu() batching:
> 
> <snip>
> commit 51824b780b719c53113dc39e027fbf670dc66028
> Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Date:   Thu Jun 30 18:33:35 2022 +0200
> 
>     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> 
>     Currently the monitor work is scheduled with a fixed interval of HZ/20,
>     which is roughly 50 milliseconds. The drawback of this approach is
>     low utilization of the 512 page slots in scenarios with infrequent
>     kvfree_rcu() calls.  For example on an Android system:
> <snip>
> 
> The trace that I posted was taken without it.

And if I am not getting too confused, that patch is now in mainline.
So it does make sense to rely on it, then.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 16:25               ` Uladzislau Rezki
  2022-10-24 16:48                 ` Paul E. McKenney
@ 2022-10-24 16:54                 ` Joel Fernandes
  1 sibling, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 16:54 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> >
> > You guys might need to agree on the definition of "good" here.  Or maybe
> > understand the differences in your respective platforms' definitions of
> > "good".  ;-)
> >
> Indeed. Bad is when it fires once per millisecond indefinitely :) At least

To me, once per ms is really bad, and once per 20 ms indefinitely is also
not ideal ;-). Just to give you a sense of why I feel this way: the
periodic RCU thread wakeups can disturb CPUidle.

The act of queuing a callback + the grace-period delay + the RCU threads
running is enough to disrupt the overlap between CPUidle time and the
grace-period delay. Further, the idle governor will refrain from entering
deeper CPUidle states because it will see timers queued in the near future
to wake up the RCU grace-period kthreads.
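
Roughly, the governor's choice works like the sketch below (purely
illustrative, not the actual menu-governor code; the function and
parameter names are made up):

<snip>
/*
 * Illustrative only: a timer due in the near future caps the predicted
 * idle duration, which rules out any idle state whose target residency
 * exceeds that bound.
 */
static int pick_idle_state(u64 ns_until_next_timer,
                           const u64 *target_residency_ns, int nr_states)
{
        int i, best = 0;

        for (i = 1; i < nr_states; i++) {
                /* A deeper state only pays off if the CPU is expected
                 * to stay idle for at least its target residency. */
                if (target_residency_ns[i] <= ns_until_next_timer)
                        best = i;
        }

        return best;
}
<snip>

So a timer armed a few milliseconds out to wake an RCU kthread keeps the
CPU in shallow idle states even when it could otherwise have slept deeply.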

> in such a workload I can detect a power delta and power gain. Anyway, below
> is a new trace where I do not use the "flush" variant for kvfree_rcu():
> 
> <snip>
> 1. Home screen swipe:
>          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
>          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
>          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
>          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
>          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
>          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
>          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
>          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
>          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> 2. App launches:
>          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
>          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
>          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
>          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
>          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
>          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
>          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
>          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
>          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
>            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
>          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
>            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
>            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
>          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
>            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
>          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
>          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
>          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
>          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
>          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
>          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
>          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
>          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
>          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
>          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
>          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> <snip>
> 
> It is much better. But, as I wrote earlier, there is a patch that I submitted
> some time ago improving kvfree_rcu() batching:

Yes, it seems much better than your last traces! I'd propose to drop this
patch because, as you show, it affects not only your platform but ChromeOS
as well. kvfree_rcu()'s use of queue_rcu_work() appears to be a perfect
candidate for call_rcu() batching because it is purely driven by memory
pressure. And we have a shrinker for lazy RCU as well.

For non-kvfree uses, we can introduce a queue_rcu_work_flush() if need be.
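
Something like the sketch below, say. This is hypothetical (no such
function exists today); the body simply mirrors the mainline
queue_rcu_work() with call_rcu() swapped for this series' call_rcu_flush():

<snip>
/*
 * Hypothetical variant of queue_rcu_work(): identical, except that the
 * grace period is requested with call_rcu_flush() and is therefore not
 * subject to lazy callback batching.
 */
bool queue_rcu_work_flush(struct workqueue_struct *wq,
                          struct rcu_work *rwork)
{
        struct work_struct *work = &rwork->work;

        if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
                rwork->wq = wq;
                call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
                return true;
        }

        return false;
}
<snip>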

What do you think?

thanks,

 - Joel


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 16:48                 ` Paul E. McKenney
@ 2022-10-24 16:55                   ` Uladzislau Rezki
  2022-10-24 17:08                     ` Uladzislau Rezki
  2022-10-28 21:23                   ` Joel Fernandes
  1 sibling, 1 reply; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-24 16:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Joel Fernandes, rcu, linux-kernel, kernel-team,
	rostedt

On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > >
> > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > understand the differences in your respective platforms' definitions of
> > > "good".  ;-)
> > >
> > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > in such a workload I can detect a power delta and power gain. Anyway, below
> > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > 
> > <snip>
> > 1. Home screen swipe:
> >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > 2. App launches:
> >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > <snip>
> > 
> > It is much better. But, as I wrote earlier, there is a patch that I submitted
> > some time ago improving kvfree_rcu() batching:
> > 
> > <snip>
> > commit 51824b780b719c53113dc39e027fbf670dc66028
> > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > Date:   Thu Jun 30 18:33:35 2022 +0200
> > 
> >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > 
> >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> >     which is roughly 50 milliseconds. The drawback of this approach is
> >     low utilization of the 512 page slots in scenarios with infrequent
> >     kvfree_rcu() calls.  For example on an Android system:
> > <snip>
> > 
> > The trace that I posted was taken without it.
> 
> And if I am not getting too confused, that patch is now in mainline.
> So it does make sense to rely on it, then.  ;-)
> 
Right.

urezki@pc638:~/data/raid0/coding/linux.git$ git tag --contains
51824b780b719c53113dc39e027fbf670dc66028
v6.1-rc1
v6.1-rc2
urezki@pc638:~/data/raid0/coding/linux.git$

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 16:55                   ` Uladzislau Rezki
@ 2022-10-24 17:08                     ` Uladzislau Rezki
  2022-10-24 17:20                       ` Joel Fernandes
  0 siblings, 1 reply; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-24 17:08 UTC (permalink / raw)
  To: Paul E. McKenney, Joel Fernandes
  Cc: Paul E. McKenney, Joel Fernandes, rcu, linux-kernel, kernel-team,
	rostedt

On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > >
> > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > understand the differences in your respective platforms' definitions of
> > > > "good".  ;-)
> > > >
> > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > 
> > > <snip>
> > > 1. Home screen swipe:
> > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > > 2. App launches:
> > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > > <snip>
> > > 
> > > It is much better. But, as I wrote earlier, there is a patch that I submitted
> > > some time ago improving kvfree_rcu() batching:
> > > 
> > > <snip>
> > > commit 51824b780b719c53113dc39e027fbf670dc66028
> > > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > Date:   Thu Jun 30 18:33:35 2022 +0200
> > > 
> > >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > > 
> > >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> > >     which is roughly 50 milliseconds. The drawback of this approach is
> > >     low utilization of the 512 page slots in scenarios with infrequent
> > >     kvfree_rcu() calls.  For example on an Android system:
> > > <snip>
> > > 
> > > The trace that I posted was taken without it.
> > 
> > And if I am not getting too confused, that patch is now in mainline.
> > So it does make sense to rely on it, then.  ;-)
> > 
> Right.
> 
> urezki@pc638:~/data/raid0/coding/linux.git$ git tag --contains
> 51824b780b719c53113dc39e027fbf670dc66028
> v6.1-rc1
> v6.1-rc2
> urezki@pc638:~/data/raid0/coding/linux.git$
> 
Just in case: 5.10 + "rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval":

<snip>
1. Home screen swipe:
         rcuop/3-40      [003] d..1    94.202849: rcu_batch_start: rcu_preempt CBs=664 bl=10
         rcuop/4-48      [001] d..1    95.999352: rcu_batch_start: rcu_preempt CBs=252 bl=10
         rcuop/6-62      [002] d..1    97.534875: rcu_batch_start: rcu_preempt CBs=152 bl=10
         rcuop/5-55      [003] d..1    98.042912: rcu_batch_start: rcu_preempt CBs=189 bl=10
         rcuop/0-15      [002] d..1    98.306769: rcu_batch_start: rcu_preempt CBs=1457 bl=11
         rcuop/1-26      [000] d..1    99.582931: rcu_batch_start: rcu_preempt CBs=2115 bl=16
         rcuop/2-33      [003] d..1    99.582935: rcu_batch_start: rcu_preempt CBs=2019 bl=15
         rcuop/3-40      [001] d..1    99.838885: rcu_batch_start: rcu_preempt CBs=1168 bl=10
         rcuop/1-26      [000] d..1   100.603496: rcu_batch_start: rcu_preempt CBs=168 bl=10
2. Apps launches:
         rcuop/4-48      [007] d..1   102.910580: rcu_batch_start: rcu_preempt CBs=1150 bl=10
         rcuop/6-62      [007] d..1   102.910682: rcu_batch_start: rcu_preempt CBs=1001 bl=10
         rcuop/5-55      [007] d..1   103.166607: rcu_batch_start: rcu_preempt CBs=939 bl=10
         rcuop/0-15      [007] d..1   104.450598: rcu_batch_start: rcu_preempt CBs=1694 bl=13
         rcuop/5-55      [006] d..1   104.478640: rcu_batch_start: rcu_preempt CBs=3125 bl=24
         rcuop/3-40      [007] d..1   104.958565: rcu_batch_start: rcu_preempt CBs=1108 bl=10
         rcuop/7-69      [007] d..1   106.238634: rcu_batch_start: rcu_preempt CBs=10275 bl=80
         rcuop/4-48      [007] d..1   107.258586: rcu_batch_start: rcu_preempt CBs=8142 bl=63
         rcuop/7-69      [007] d..1   107.260769: rcu_batch_start: rcu_preempt CBs=1880 bl=14
         rcuop/2-33      [007] d..1   107.526638: rcu_batch_start: rcu_preempt CBs=1968 bl=15
         rcuop/1-26      [007] d..1   107.542612: rcu_batch_start: rcu_preempt CBs=1796 bl=14
         rcuop/5-55      [007] d..1   108.286588: rcu_batch_start: rcu_preempt CBs=3547 bl=27
         rcuop/6-62      [007] d..1   108.287639: rcu_batch_start: rcu_preempt CBs=5820 bl=45
         rcuop/7-69      [007] d..1   108.290548: rcu_batch_start: rcu_preempt CBs=2430 bl=18
         rcuop/0-15      [000] d..1   109.826843: rcu_batch_start: rcu_preempt CBs=2282 bl=17
         rcuop/3-40      [002] d..1   110.595455: rcu_batch_start: rcu_preempt CBs=1960 bl=15
           <...>-48      [005] d..1   112.390702: rcu_batch_start: rcu_preempt CBs=5143 bl=40
         rcuop/7-69      [004] d..1   112.402607: rcu_batch_start: rcu_preempt CBs=3379 bl=26
         rcuop/2-33      [005] d..1   112.638614: rcu_batch_start: rcu_preempt CBs=3223 bl=25
         rcuop/1-26      [004] d..1   112.638617: rcu_batch_start: rcu_preempt CBs=3026 bl=23
         rcuop/5-55      [007] d..1   113.402581: rcu_batch_start: rcu_preempt CBs=7251 bl=56
         rcuop/6-62      [007] d..1   113.658582: rcu_batch_start: rcu_preempt CBs=7035 bl=54
<snip>

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 17:08                     ` Uladzislau Rezki
@ 2022-10-24 17:20                       ` Joel Fernandes
  2022-10-24 17:35                         ` Paul E. McKenney
  2022-10-24 17:40                         ` Uladzislau Rezki
  0 siblings, 2 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 17:20 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
>
> On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > >
> > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > understand the differences in your respective platforms' definitions of
> > > > > "good".  ;-)
> > > > >
> > > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > >
> > > > <snip>
> > > > 1. Home screen swipe:
> > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23

> > > > 2. App launches:
> > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10

Definitely better, but I'd still ask why not just rely on the lazy
batching that we now have, since this is a memory-pressure-related
use case. Or another approach could be: for CONFIG_RCU_LAZY, don't
disturb the lazy-RCU batching by queuing these "free memory" CBs, and
instead keep your improved kvfree_rcu() batching only for
!CONFIG_RCU_LAZY.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 17:20                       ` Joel Fernandes
@ 2022-10-24 17:35                         ` Paul E. McKenney
  2022-10-24 20:12                           ` Joel Fernandes
  2022-10-24 17:40                         ` Uladzislau Rezki
  1 sibling, 1 reply; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-24 17:35 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > >
> > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > understand the differences in your respective platforms' definitions of
> > > > > > "good".  ;-)
> > > > > >
> > > > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > > >
> > > > > <snip>
> > > > > 1. Home screen swipe:
> > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> 
> > > > > 2. App launches:
> > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> 
> Definitely better, but I'd still ask why not just rely on the lazy
> batching that we now have, since this is a memory-pressure-related
> use case. Or another approach could be: for CONFIG_RCU_LAZY, don't
> disturb the lazy-RCU batching by queuing these "free memory" CBs, and
> instead keep your improved kvfree_rcu() batching only for
> !CONFIG_RCU_LAZY.

Given that making the kvfree_rcu()-level batching conditional on
CONFIG_RCU_LAZY would complicate the code, what bad thing happens if
the kvfree_rcu()-level batching is kept unconditionally?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 17:20                       ` Joel Fernandes
  2022-10-24 17:35                         ` Paul E. McKenney
@ 2022-10-24 17:40                         ` Uladzislau Rezki
  2022-10-24 20:08                           ` Joel Fernandes
  1 sibling, 1 reply; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-24 17:40 UTC (permalink / raw)
  To: Joel Fernandes, Paul E. McKenney
  Cc: Uladzislau Rezki, Paul E. McKenney, rcu, linux-kernel,
	kernel-team, rostedt

On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > >
> > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > understand the differences in your respective platforms' definitions of
> > > > > > "good".  ;-)
> > > > > >
> > > > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > > >
> > > > > <snip>
> > > > > 1. Home screen swipe:
> > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> 
> > > > > 2. App launches:
> > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> 
> Definitely better, but I'd still ask why not just rely on the lazy
> batching that we now have, since this is a memory-pressure-related
> use case. Or another approach could be: for CONFIG_RCU_LAZY, don't
> disturb the lazy-RCU batching by queuing these "free memory" CBs, and
> instead keep your improved kvfree_rcu() batching only for
> !CONFIG_RCU_LAZY.
>

1. Double-batching?

The kvfree_rcu() interface itself keeps track of when to reclaim:
  a) when a page is full;
  b) when there is a high storm of freeing over RCU;
  c) when there is a low-memory condition.

Such control stays inside kvfree_rcu(). Converting it to the lazy
variant:
  a) loses that control, which will become a problem;
  b) improves nothing.

2. Converting queue_rcu_work() to the lazy variant breaks people's
intuition about when a queued work item is supposed to run. People do
not expect delays of seconds when they queue work. The same holds for
kvfree_rcu(): we did not expect it either; we even used a high-priority
queue in the beginning.

There are ~10 users who queue work this way, and they did not expect it
to run 10 seconds later when they wrote the code.

3. With "rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval" applied there
is no point in doing it. The data are the same in the active and idle
use cases.
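
To recap the idea of that patch, a simplified sketch (not the exact
mainline code; the names and thresholds here only approximate it):

<snip>
/*
 * Keep a long default drain interval so that infrequent kvfree_rcu()
 * callers fill up the 512-slot pages, but drain much sooner when many
 * objects are already queued.
 */
#define KFREE_DRAIN_JIFFIES             (5 * HZ)
#define KFREE_DRAIN_JIFFIES_MAX         (HZ)

static void schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
{
        long delay;

        /* Drain quickly under load, lazily otherwise. */
        delay = READ_ONCE(krcp->count) >= KVFREE_BULK_MAX_ENTR ?
                        KFREE_DRAIN_JIFFIES_MAX : KFREE_DRAIN_JIFFIES;

        queue_delayed_work(system_wq, &krcp->monitor_work, delay);
}
<snip>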

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 17:40                         ` Uladzislau Rezki
@ 2022-10-24 20:08                           ` Joel Fernandes
  2022-10-25 10:47                             ` Uladzislau Rezki
  0 siblings, 1 reply; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 20:08 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 1:40 PM Uladzislau Rezki <urezki@gmail.com> wrote:
>
> On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > >
> > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > >
> > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > "good".  ;-)
> > > > > > >
> > > > > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > > > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > > > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > >
> > > > > > <snip>
> > > > > > 1. Home screen swipe:
> > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> >
> > > > > > 2. App launches:
> > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> >
> > Definitely better, but I'd still ask why not just rely on the lazy
> > batching that we now have, since this is a memory-pressure-related
> > use case. Or another approach could be: for CONFIG_RCU_LAZY, don't
> > disturb the lazy-RCU batching by queuing these "free memory" CBs, and
> > instead keep your improved kvfree_rcu() batching only for
> > !CONFIG_RCU_LAZY.
> >
>
> 1. Double-batching?
>
> The kvfree_rcu() interface itself keeps track of when to reclaim:
>   a) when a page is full;
>   b) when there is a high storm of freeing over RCU;
>   c) when there is a low-memory condition.
>
> Such control stays inside kvfree_rcu(). Converting it to the lazy
> variant:
>   a) loses that control, which will become a problem;
>   b) improves nothing.

AFAICS, the only thing being changed is when you give memory back to
the system, so you will be holding on to memory a bit longer. And
there are shrinkers that already flush those caches. I don't think
the users of kvfree_rcu() want to free memory instantly. If there is
such a use case, please share it.
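
For reference, the kvfree_rcu() shrinker hook is registered roughly like
this (a sketch; mainline does provide kfree_rcu_shrink_count() and
kfree_rcu_shrink_scan(), but the init-function name below is made up):

<snip>
/*
 * Under memory pressure the MM core calls these hooks, which report
 * how many objects kvfree_rcu() is caching and force a drain of them.
 */
static struct shrinker kfree_rcu_shrinker = {
        .count_objects  = kfree_rcu_shrink_count,
        .scan_objects   = kfree_rcu_shrink_scan,
        .batch          = 0,
        .seeks          = DEFAULT_SEEKS,
};

static int __init kfree_rcu_shrinker_init(void)
{
        /* The name argument follows the post-v6.0 register_shrinker() API. */
        return register_shrinker(&kfree_rcu_shrinker, "rcu-kfree");
}
<snip>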

> 2. Converting queue_rcu_work() to the lazy variant breaks people's
> intuition about when a queued work item is supposed to run. People do
> not expect delays of seconds when they queue work.

Which people? ;)

> The same holds for kvfree_rcu():
> we did not expect it either; we even used a high-priority queue in the
> beginning. There are ~10 users who queue work this way, and they did
> not expect it to run 10 seconds later when they wrote the code.

That's a bit of a misinterpretation of what I'm saying. A variant
queue_rcu_work_flush() can be added for those users (such as those that
are not freeing memory).

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 17:35                         ` Paul E. McKenney
@ 2022-10-24 20:12                           ` Joel Fernandes
  2022-10-24 20:16                             ` Joel Fernandes
  2022-10-24 20:19                             ` Paul E. McKenney
  0 siblings, 2 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 20:12 UTC (permalink / raw)
  To: paulmck; +Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > >
> > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > >
> > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > "good".  ;-)
> > > > > > >
> > > > > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > > > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > > > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > >
> > > > > > <snip>
> > > > > > 1. Home screen swipe:
> > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> >
> > > > > > 2. App launches:
> > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> >
> > Definitely better, but I'd still ask why not just rely on the lazy
> > batching that we now have, since this is a memory-pressure-related
> > use case. Or another approach could be: for CONFIG_RCU_LAZY, don't
> > disturb the lazy-RCU batching by queuing these "free memory" CBs, and
> > instead keep your improved kvfree_rcu() batching only for
> > !CONFIG_RCU_LAZY.
>
> Given that making the kvfree_rcu()-level batching conditional on
> CONFIG_RCU_LAZY would complicate the code, what bad thing happens if
> the kvfree_rcu()-level batching is kept unconditionally?

The bad thing that happens is the power impact. There is a noticeable
impact in our testing, and when we dropped this particular patch, we
got much better results.

I also ran rcutop, and without the patch I see several seconds of
laziness at a time, unlike with the patch.

Even in the beginning, when I came up with an implementation for
call_rcu_lazy(), I had to mark queue_rcu_work() as lazy as well, since
it was quite frequent (on ChromeOS). But when we introduced the
flush() API, I forgot not to use flush() on it.  Unfortunately, this
patch slipped into my last series when Vlad and I were debugging the
SCSI issue, and it did not really help for the SCSI issue itself.

Thanks,

 - Joel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 20:12                           ` Joel Fernandes
@ 2022-10-24 20:16                             ` Joel Fernandes
  2022-10-25 10:48                               ` Uladzislau Rezki
  2022-10-24 20:19                             ` Paul E. McKenney
  1 sibling, 1 reply; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 20:16 UTC (permalink / raw)
  To: paulmck; +Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 4:12 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > >
> > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > "good".  ;-)
> > > > > > > >
> > > > > > > Indeed. Bad is when it fires once per millisecond indefinitely :) At least
> > > > > > > in such a workload I can detect a power delta and power gain. Anyway, below
> > > > > > > is a new trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > >
> > > > > > > <snip>
> > > > > > > 1. Home screen swipe:
> > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > >
> > > > > > > 2. App launches:
> > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > >
> > > Definitely better, but I'd still ask why not just rely on the lazy
> > > batching that we now have, since it is a memory pressure related
> > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > instead keep your improved kvfree_rcu() batching only for
> > > !CONFIG_RCU_LAZY.
> >
> > Given that making the kvfree_rcu()-level batching conditional on
> > CONFIG_RCU_LAZY would complicate the code, what bad thing happens when
> > keeping the kvfree_rcu-level batching unconditionally?
>
> The bad thing happening is power impact. There is a noticeable impact
> in our testing, and when we dropped this particular patch, we got much
> better results.
>
> I also ran rcutop, and without the patch I see several seconds of
> laziness at a time, unlike with the patch.
>
> Even in the beginning, when I came up with an implementation for
> call_rcu_lazy(), I had to mark queue_rcu_work() as lazy as well, since
> it was quite frequent (on ChromeOS). But when we introduced the
> flush() API, I forgot to stop using flush() on it.  Unfortunately,
> this patch slipped into my last series while Vlad and I were debugging
> the SCSI issue, and it did not really help with the SCSI issue itself.

I could try to run Vlad's other mainline patch itself and measure
power, I'll get back on that. Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 20:12                           ` Joel Fernandes
  2022-10-24 20:16                             ` Joel Fernandes
@ 2022-10-24 20:19                             ` Paul E. McKenney
  2022-10-24 20:26                               ` Joel Fernandes
  1 sibling, 1 reply; 44+ messages in thread
From: Paul E. McKenney @ 2022-10-24 20:19 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 04:12:59PM -0400, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > >
> > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > "good".  ;-)
> > > > > > > >
> > > > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > > > > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > >
> > > > > > > <snip>
> > > > > > > 1. Home screen swipe:
> > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > >
> > > > > > > 2. App launches:
> > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > >
> > > Definitely better, but I'd still ask why not just rely on the lazy
> > > batching that we now have, since it is a memory pressure related
> > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > instead keep your improved kvfree_rcu() batching only for
> > > !CONFIG_RCU_LAZY.
> >
> > Given that making the kvfree_rcu()-level batching conditional on
> > CONFIG_RCU_LAZY would complicate the code, what bad thing happens when
> > keeping the kvfree_rcu-level batching unconditionally?
> 
> The bad thing happening is power impact. There is a noticeable impact
> in our testing, and when we dropped this particular patch, we got much
> better results.
>
> I also ran rcutop, and without the patch I see several seconds of
> laziness at a time, unlike with the patch.

Fair point, but is this visible at the power meter?

							Thanx, Paul

> Even in the beginning, when I came up with an implementation for
> call_rcu_lazy(), I had to mark queue_rcu_work() as lazy as well, since
> it was quite frequent (on ChromeOS). But when we introduced the
> flush() API, I forgot to stop using flush() on it.  Unfortunately,
> this patch slipped into my last series while Vlad and I were debugging
> the SCSI issue, and it did not really help with the SCSI issue itself.
> 
> Thanks,
> 
>  - Joel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 20:19                             ` Paul E. McKenney
@ 2022-10-24 20:26                               ` Joel Fernandes
  0 siblings, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-24 20:26 UTC (permalink / raw)
  To: paulmck; +Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 4:19 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Oct 24, 2022 at 04:12:59PM -0400, Joel Fernandes wrote:
> > On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > >
> > > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > > >
> > > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > > "good".  ;-)
> > > > > > > > >
> > > > > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > > > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > > > > > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > > >
> > > > > > > > <snip>
> > > > > > > > 1. Home screen swipe:
> > > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > > >
> > > > > > > > 2. App launches:
> > > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > > >
> > > > Definitely better, but I'd still ask why not just rely on the lazy
> > > > batching that we now have, since it is a memory pressure related
> > > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > > instead keep your improved kvfree_rcu() batching only for
> > > > !CONFIG_RCU_LAZY.
> > >
> > > Given that making the kvfree_rcu()-level batching conditional on
> > > CONFIG_RCU_LAZY would complicate the code, what bad thing happens when
> > > keeping the kvfree_rcu-level batching unconditionally?
> >
> > The bad thing happening is power impact. There is a noticeable impact
> > in our testing, and when we dropped this particular patch, we got much
> > better results.
> >
> > I also ran rcutop, and without the patch I see several seconds of
> > laziness at a time, unlike with the patch.
>
> Fair point, but is this visible at the power meter?

Yes, it is, and it came up as part of the debugging I did; I am not making
it up ;-) The delta in power is 10%. As you saw in Vlad's traces as
well, kvfree_rcu() can be called quite frequently.
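
For anyone reading the bl numbers in those traces, here is a sketch of how
rcu_do_batch() appears to choose the batch limit, going by my reading of the
mainline code (blimit defaults to 10 and rcu_divisor to 7); this is an
illustration only, not a verbatim copy of the kernel code:

<snip>
/* Batch limit scales with the callback backlog. */
static long rcu_batch_limit_sketch(long blimit, long pending, int divisor)
{
	long bl = pending >> divisor;	/* backlog-scaled limit */

	return bl > blimit ? bl : blimit;
}
/* Example: CBs=16007 -> 16007 >> 7 = 125, matching "bl=125" above. */
<snip>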

 - Joel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 20:08                           ` Joel Fernandes
@ 2022-10-25 10:47                             ` Uladzislau Rezki
  0 siblings, 0 replies; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-25 10:47 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Uladzislau Rezki, Paul E. McKenney, rcu, linux-kernel,
	kernel-team, rostedt

On Mon, Oct 24, 2022 at 04:08:17PM -0400, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 1:40 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > >
> > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > >
> > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > "good".  ;-)
> > > > > > > >
> > > > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > > > > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > >
> > > > > > > <snip>
> > > > > > > 1. Home screen swipe:
> > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > >
> > > > > > > 2. App launches:
> > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > >
> > > Definitely better, but I'd still ask why not just rely on the lazy
> > > batching that we now have, since it is a memory pressure related
> > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > instead keep your improved kvfree_rcu() batching only for
> > > !CONFIG_RCU_LAZY.
> > >
> >
> > 1. Double-batching?
> >
> > The kvfree_rcu() interface itself keeps track of when to reclaim:
> >   a) when a page is full;
> >   b) when there is a high storm of freeing over RCU;
> >   c) when there is a low-memory condition.
> >
> > Such control stays inside kvfree_rcu(). Converting it to the lazy
> > variant:
> >   a) loses that control, which will become a problem;
> >   b) improves nothing.
> 
> AFAICS, the only thing being changed is when you are giving memory
> back to the system. So you will be holding on to memory a bit longer.
> And there are shrinkers that are already flushing those. I don't think
> the users of kvfree_rcu() want to free memory instantly. If there is
> such a use case, please share it.
> 
Actually, ideally we want to free memory as soon as possible. The problem
with an extra 10 seconds is the large amount of unreclaimed memory, which
usually leads to more frequent memory pressure and shrinking. Ideally we
do not want to trigger any shrinker at all, because for us that is a big
slowdown in device behaviour.
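
To make the points above concrete, below is a minimal sketch of the kind of
drain policy I am describing; the structure, names and thresholds are
illustrative only, not the mainline implementation:

<snip>
struct krc_sketch {
	unsigned int nr_objects;	/* objects queued in the page of slots */
	unsigned int frees_last_sec;	/* recent kvfree_rcu() call rate */
	bool low_memory;		/* set from a low-memory notifier */
};

#define KRC_PAGE_SLOTS	512	/* slots per page, as in the commit log */
#define KRC_HIGH_RATE	1000	/* "high storm" threshold, made up */

static bool krc_should_drain(const struct krc_sketch *krc)
{
	return krc->nr_objects >= KRC_PAGE_SLOTS ||	/* a) page is full */
	       krc->frees_last_sec > KRC_HIGH_RATE ||	/* b) freeing storm */
	       krc->low_memory;				/* c) low memory */
}
<snip>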

> > 2. Converting queue_rcu_work() to the lazy variant breaks the human
> > interpretation of when queued work is supposed to run. People do not
> > expect delays of seconds when they queue work.
> 
> Which people? ;)
> 
Who wrote the code :)

> > Same as in kvfree_rcu(): we do not expect it; we even used a
> > high-prio queue in the beginning. There are ~10 users who queue the
> > work, and they did not expect it to run 10 seconds later when they
> > wrote the code.
> 
> That's a bit of a misinterpretation of what I'm saying. A variant
> queue_rcu_work_flush() can be added for those users (such as ones that
> are not freeing memory).
> 
If it is added for kvfree_rcu(), that is totally fine, because there is
batching in place.
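
For the record, such a variant could simply mirror queue_rcu_work() but use
the call_rcu_flush() API from this series. This is a sketch only, not a
posted patch (rcu_work_rcufn is the existing callback in kernel/workqueue.c):

<snip>
/* Sketch: a flush variant of queue_rcu_work(), bypassing lazy batching. */
bool queue_rcu_work_flush(struct workqueue_struct *wq, struct rcu_work *rwork)
{
	struct work_struct *work = &rwork->work;

	if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
		rwork->wq = wq;
		call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
		return true;
	}

	return false;
}
<snip>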

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 20:16                             ` Joel Fernandes
@ 2022-10-25 10:48                               ` Uladzislau Rezki
  2022-10-25 15:05                                 ` Joel Fernandes
  0 siblings, 1 reply; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-25 10:48 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: paulmck, Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 04:16:20PM -0400, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 4:12 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > >
> > > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > > >
> > > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > > "good".  ;-)
> > > > > > > > >
> > > > > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > > > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > > > > > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > > >
> > > > > > > > <snip>
> > > > > > > > 1. Home screen swipe:
> > > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > > >
> > > > > > > > 2. App launches:
> > > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > > >
> > > > Definitely better, but I'd still ask why not just rely on the lazy
> > > > batching that we now have, since it is a memory pressure related
> > > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > > instead keep your improved kvfree_rcu() batching only for
> > > > !CONFIG_RCU_LAZY.
> > >
> > > Given that making the kvfree_rcu()-level batching conditional on
> > > CONFIG_RCU_LAZY would complicate the code, what bad thing happens when
> > > keeping the kvfree_rcu-level batching unconditionally?
> >
> > The bad thing happening is power impact. There is a noticeable impact
> > in our testing, and when we dropped this particular patch, we got much
> > better results.
> >
> > I also ran rcutop, and without the patch I see several seconds of
> > laziness at a time, unlike with the patch.
> >
> > Even in the beginning, when I came up with an implementation for
> > call_rcu_lazy(), I had to mark queue_rcu_work() as lazy as well, since
> > it was quite frequent (on ChromeOS). But when we introduced the
> > flush() API, I forgot to stop using flush() on it.  Unfortunately,
> > this patch slipped into my last series while Vlad and I were debugging
> > the SCSI issue, and it did not really help with the SCSI issue itself.
> 
> I could try to run Vlad's other mainline patch itself and measure
> power, I'll get back on that. Thanks!
>
That makes sense. It would be good to have a look at your power figures
and traces.

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-25 10:48                               ` Uladzislau Rezki
@ 2022-10-25 15:05                                 ` Joel Fernandes
  2022-10-26 20:35                                   ` Uladzislau Rezki
  0 siblings, 1 reply; 44+ messages in thread
From: Joel Fernandes @ 2022-10-25 15:05 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: paulmck, rcu, linux-kernel, kernel-team, rostedt

On Tue, Oct 25, 2022 at 6:48 AM Uladzislau Rezki <urezki@gmail.com> wrote:
>
> On Mon, Oct 24, 2022 at 04:16:20PM -0400, Joel Fernandes wrote:
> > On Mon, Oct 24, 2022 at 4:12 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >
> > > On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > > > >
> > > > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > > > "good".  ;-)
> > > > > > > > > >
> > > > > > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > > > > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > > > > > > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > > > >
> > > > > > > > > <snip>
> > > > > > > > > 1. Home screen swipe:
> > > > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > > > >
> > > > > > > > > 2. App launches:
> > > > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > > > >
> > > > > Definitely better, but I'd still ask why not just rely on the lazy
> > > > > batching that we now have, since it is a memory pressure related
> > > > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > > > instead keep your improved kvfree_rcu() batching only for
> > > > > !CONFIG_RCU_LAZY.
> > > >
> > > > Given that making the kvfree_rcu()-level batching conditional on
> > > > CONFIG_RCU_LAZY would complicate the code, what bad thing happens when
> > > > keeping the kvfree_rcu-level batching unconditionally?
> > >
> > > The bad thing happening is power impact. There is a noticeable impact
> > > in our testing, and when we dropped this particular patch, we got much
> > > better results.
> > >
> > > I also ran rcutop, and without the patch I see several seconds of
> > > laziness at a time, unlike with the patch.
> > >
> > > Even in the beginning, when I came up with an implementation for
> > > call_rcu_lazy(), I had to mark queue_rcu_work() as lazy as well, since
> > > it was quite frequent (on ChromeOS). But when we introduced the
> > > flush() API, I forgot to stop using flush() on it.  Unfortunately,
> > > this patch slipped into my last series while Vlad and I were debugging
> > > the SCSI issue, and it did not really help with the SCSI issue itself.
> >
> > I could try to run Vlad's other mainline patch itself and measure
> > power, I'll get back on that. Thanks!
> >
> That makes sense. It would be good to have a look at your power figures
> and traces.

If you don't mind, could you backport that patch to 5.10?
Here is my 5.10 tree for reference (without the patch)
https://github.com/joelagnel/linux-kernel.git    (branch
5.10-v9-minus-queuework-plus-kfreebatch)

and I am getting conflicts if I cherry-pick:
51824b780b71 ("rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval")

I am assuming you have already done the backport, which is why you got
the traces above. If so, I would appreciate a link to your branch so I
don't mess the backport up!

 - Joel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-25 15:05                                 ` Joel Fernandes
@ 2022-10-26 20:35                                   ` Uladzislau Rezki
  0 siblings, 0 replies; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-26 20:35 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Uladzislau Rezki, paulmck, rcu, linux-kernel, kernel-team, rostedt

On Tue, Oct 25, 2022 at 11:05:57AM -0400, Joel Fernandes wrote:
> On Tue, Oct 25, 2022 at 6:48 AM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Mon, Oct 24, 2022 at 04:16:20PM -0400, Joel Fernandes wrote:
> > > On Mon, Oct 24, 2022 at 4:12 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > >
> > > > On Mon, Oct 24, 2022 at 1:36 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > >
> > > > > On Mon, Oct 24, 2022 at 01:20:26PM -0400, Joel Fernandes wrote:
> > > > > > On Mon, Oct 24, 2022 at 1:08 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > > > >
> > > > > > > On Mon, Oct 24, 2022 at 06:55:16PM +0200, Uladzislau Rezki wrote:
> > > > > > > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > > > > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > > > > > > >
> > > > > > > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > > > > > > understand the differences in your respective platforms' definitions of
> > > > > > > > > > > "good".  ;-)
> > > > > > > > > > >
> > > > > > > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > > > > > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > > > > > > > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > > > > > > > >
> > > > > > > > > > <snip>
> > > > > > > > > > 1. Home screen swipe:
> > > > > > > > > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > > > > > > > > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > > > > > > > > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > > > > > > > > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > > > > > > > > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > > > > > > > > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > > > > > > > > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > > > > > > > > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > > > > > > > > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > > > > >
> > > > > > > > > > 2. App launches:
> > > > > > > > > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > > > > > > > > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > > > > > > > > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > > > > > > > > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > > > > > > > > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > > > > > > > > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > > > > > > > > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > > > > > > > > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > > > > > > > > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > > > > > > > > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > > > > > > > > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > > > > > > > > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > > > > > > > > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > > > > > > > > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > > > > > > > > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > > > > > > > > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > > > > > > > > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > > > > > > > > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > > > > > > > > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > > > > > > > > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > > > > > > > > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > > > > > > > > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > > > > > > > > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > > > > > > > > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > > > > > > > > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > > > > > > > > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > > > > >
> > > > > > Definitely better, but I'd still ask why not just rely on the lazy
> > > > > > batching that we now have, since it is a memory pressure related
> > > > > > usecase. Or another approach could be, for CONFIG_RCU_LAZY, don't
> > > > > > disturb the lazy-RCU batching by queuing these "free memory" CBs; and
> > > > > > instead keep your improved kvfree_rcu() batching only for
> > > > > > !CONFIG_RCU_LAZY.
> > > > >
> > > > > Given that making the kvfree_rcu()-level batching conditional on
> > > > > CONFIG_RCU_LAZY would complicate the code, what bad thing happens when
> > > > > keeping the kvfree_rcu-level batching unconditionally?
> > > >
> > > > The bad thing happening is power impact. There is a noticeable impact
> > > > in our testing, and when we dropped this particular patch, we got much
> > > > better results.
> > > >
> > > > I also ran rcutop, and without the patch I see several seconds of
> > > > laziness at a time, unlike with the patch.
> > > >
> > > > Even in the beginning, when I came up with an implementation for
> > > > call_rcu_lazy(), I had to mark queue_rcu_work() as lazy as well, since
> > > > it was quite frequent (on ChromeOS). But when we introduced the
> > > > flush() API, I forgot to stop using flush() on it.  Unfortunately,
> > > > this patch slipped into my last series while Vlad and I were debugging
> > > > the SCSI issue, and it did not really help with the SCSI issue itself.
> > >
> > > I could try to run Vlad's other mainline patch itself and measure
> > > power, I'll get back on that. Thanks!
> > >
> > That makes sense. It would be good to have a look at your power figures
> > and traces.
> 
> If you don't mind, could you backport that patch to 5.10?
> Here is my 5.10 tree for reference (without the patch)
> https://github.com/joelagnel/linux-kernel.git    (branch
> 5.10-v9-minus-queuework-plus-kfreebatch)
> 
> and I am getting conflicts if I cherry-pick:
> 51824b780b71 ("rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval")
> 
> I am assuming you have already done the backport, which is why you got
> the traces above. If so, I would appreciate a link to your branch so I
> don't mess the backport up!
> 
Sure. I sent you the patches privately, since I do not want to paste a
lot of code here and create extra line noise.

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-24 16:48                 ` Paul E. McKenney
  2022-10-24 16:55                   ` Uladzislau Rezki
@ 2022-10-28 21:23                   ` Joel Fernandes
  2022-10-28 21:42                     ` Joel Fernandes
  2022-10-31 13:21                     ` Uladzislau Rezki
  1 sibling, 2 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-28 21:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > >
> > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > understand the differences in your respective platforms' definitions of
> > > "good".  ;-)
> > >
> > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > trace where I do not use the "flush" variant for kvfree_rcu():
> > 
> > <snip>
> > 1. Home screen swipe:
> >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > 2. App launches:
> >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > <snip>
> > 
> > it is much better. But, as I wrote earlier, there is a patch that I submitted
> > some time ago improving kvfree_rcu() batching:
> > 
> > <snip>
> > commit 51824b780b719c53113dc39e027fbf670dc66028
> > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > Date:   Thu Jun 30 18:33:35 2022 +0200
> > 
> >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > 
> >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> >     which is roughly 50 milliseconds. The drawback of this approach is
> >     low utilization of the 512 page slots in scenarios with infrequent
> >     kvfree_rcu() calls.  For example on an Android system:
> > <snip>
> > 
> > The trace that I posted was taken without it.
> 
> And if I am not getting too confused, that patch is now in mainline.
> So it does make sense to rely on it, then.  ;-)

Vlad's patch to change KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES?

One at 5 seconds and the other at 10 seems odd, especially because the former
seems to negate the effects of the latter, and anyone tweaking them in the
future (say, via new command-line options) should probably tweak the two
together to increase batching.
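
Concretely, I am thinking of something like the sketch below, where
RCU_LAZY_BATCH_JIFFIES is a made-up unified name and 10 seconds is just
one possible value:

<snip>
/* One shared batching knob; both existing intervals derive from it. */
#define RCU_LAZY_BATCH_JIFFIES	(10 * HZ)
#define LAZY_FLUSH_JIFFIES	RCU_LAZY_BATCH_JIFFIES	/* lazy call_rcu() flush */
#define KFREE_DRAIN_JIFFIES	RCU_LAZY_BATCH_JIFFIES	/* kvfree_rcu() drain */
<snip>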

Testing shows significantly better batching with Vlad's updates; however, I am
wondering why the rcu_callback tracepoint fires in pairs like that from
separate kworkers:

     kworker/6:1-157     [006] d..1   288.861610: rcu_callback: rcu_preempt  rhp=0000000079b895f9 func=rcu_work_rcufn 1214
     kworker/4:2-158     [004]  d..1   288.861612: rcu_callback: rcu_preempt rhp=00000000d83fcc90 func=rcu_work_rcufn 798

I wonder if the queued work is accidentally executing twice, or something.
This kernel does have the additional trace patch below, fyi.

Another thought I have is whether we can just keep kvfree_rcu() mapped to
call_rcu() via a config option, say CONFIG_RCU_LAZY_KFREE, or something.
I am personally not much of a fan of the additional timer wakeups and
kworker queue+wakeup induced by kfree_rcu(), which we don't need per se if
we are already batching with the lazyfied call_rcu(). Too many moving parts
might hurt power.
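
Very roughly, the idea is something like the sketch below; CONFIG_RCU_LAZY_KFREE
and the helper name are made up, and the real kvfree_rcu() path encodes an
offset rather than a plain callback, so this only illustrates the config idea:

<snip>
/*
 * Sketch: let kvfree_rcu() objects ride the lazy call_rcu() batching
 * instead of kvfree_rcu()'s own timer + kworker machinery.
 */
static void kvfree_queue_sketch(struct rcu_head *head, rcu_callback_t func)
{
	if (IS_ENABLED(CONFIG_RCU_LAZY_KFREE))
		call_rcu(head, func);		/* ride the lazy batching */
	else
		call_rcu_flush(head, func);	/* today's non-lazy behavior */
}
<snip>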

---8<-----------------------

From: Joel Fernandes <joelaf@google.com>
Subject: [PATCH] debug: reorder trace_rcu_callback

Signed-off-by: Joel Fernandes <joelaf@google.com>
---
 kernel/rcu/tree.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 73feb09fd51b..a7c175e9533a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2978,10 +2978,6 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy)
 	}
 
 	check_cb_ovld(rdp);
-	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
-		return; // Enqueued onto ->nocb_bypass, so just leave.
-	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
-	rcu_segcblist_enqueue(&rdp->cblist, head);
 	if (__is_kvfree_rcu_offset((unsigned long)func))
 		trace_rcu_kvfree_callback(rcu_state.name, head,
 					 (unsigned long)func,
@@ -2990,6 +2986,11 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy)
 		trace_rcu_callback(rcu_state.name, head,
 				   rcu_segcblist_n_cbs(&rdp->cblist));
 
+	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
+		return; // Enqueued onto ->nocb_bypass, so just leave.
+	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
+	rcu_segcblist_enqueue(&rdp->cblist, head);
+
 	/* Go handle any RCU core processing required. */
 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
 	    unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
-- 
2.38.1.273.g43a17bfeac-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-28 21:23                   ` Joel Fernandes
@ 2022-10-28 21:42                     ` Joel Fernandes
  2022-10-31 13:21                     ` Uladzislau Rezki
  1 sibling, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-28 21:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, rcu, linux-kernel, kernel-team, rostedt

On Fri, Oct 28, 2022 at 09:23:47PM +0000, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > >
> > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > understand the differences in your respective platforms' definitions of
> > > > "good".  ;-)
> > > >
> > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > such a workload I can detect a power delta and power gain. Anyway, below is a new
> > > trace where I do not use the "flush" variant for kvfree_rcu():
> > > 
> > > <snip>
> > > 1. Home screen swipe:
> > >          rcuop/0-15      [003] d..1  1792.767750: rcu_batch_start: rcu_preempt CBs=1003 bl=10
> > >          rcuop/2-33      [002] d..1  1792.771717: rcu_batch_start: rcu_preempt CBs=934 bl=10
> > >          rcuop/3-40      [001] d..1  1794.811816: rcu_batch_start: rcu_preempt CBs=1508 bl=11
> > >          rcuop/1-26      [003] d..1  1797.116382: rcu_batch_start: rcu_preempt CBs=2127 bl=16
> > >          rcuop/4-48      [001] d..1  1797.124422: rcu_batch_start: rcu_preempt CBs=95 bl=10
> > >          rcuop/5-55      [002] d..1  1797.124731: rcu_batch_start: rcu_preempt CBs=143 bl=10
> > >          rcuop/6-62      [005] d..1  1798.911719: rcu_batch_start: rcu_preempt CBs=132 bl=10
> > >          rcuop/2-33      [002] d..1  1803.003966: rcu_batch_start: rcu_preempt CBs=3797 bl=29
> > >          rcuop/0-15      [003] d..1  1803.004707: rcu_batch_start: rcu_preempt CBs=2969 bl=23
> > > 2. App launches:
> > >          rcuop/4-48      [005] d..1  1831.087612: rcu_batch_start: rcu_preempt CBs=6141 bl=47
> > >          rcuop/7-69      [007] d..1  1831.095578: rcu_batch_start: rcu_preempt CBs=5464 bl=42
> > >          rcuop/5-55      [004] d..1  1832.703571: rcu_batch_start: rcu_preempt CBs=8461 bl=66
> > >          rcuop/0-15      [004] d..1  1833.731603: rcu_batch_start: rcu_preempt CBs=2548 bl=19
> > >          rcuop/1-26      [006] d..1  1833.743691: rcu_batch_start: rcu_preempt CBs=2567 bl=20
> > >          rcuop/2-33      [006] d..1  1833.744005: rcu_batch_start: rcu_preempt CBs=2359 bl=18
> > >          rcuop/3-40      [006] d..1  1833.744286: rcu_batch_start: rcu_preempt CBs=3681 bl=28
> > >          rcuop/4-48      [002] d..1  1838.079777: rcu_batch_start: rcu_preempt CBs=10444 bl=81
> > >          rcuop/7-69      [001] d..1  1838.080375: rcu_batch_start: rcu_preempt CBs=12572 bl=98
> > >            <...>-62      [002] d..1  1838.080646: rcu_batch_start: rcu_preempt CBs=14135 bl=110
> > >          rcuop/6-62      [000] d..1  1838.087722: rcu_batch_start: rcu_preempt CBs=10839 bl=84
> > >            <...>-62      [003] d..1  1839.227022: rcu_batch_start: rcu_preempt CBs=1834 bl=14
> > >            <...>-26      [001] d..1  1839.963315: rcu_batch_start: rcu_preempt CBs=5769 bl=45
> > >          rcuop/2-33      [001] d..1  1839.966485: rcu_batch_start: rcu_preempt CBs=3789 bl=29
> > >            <...>-40      [001] d..1  1839.966596: rcu_batch_start: rcu_preempt CBs=6425 bl=50
> > >          rcuop/2-33      [005] d..1  1840.541272: rcu_batch_start: rcu_preempt CBs=825 bl=10
> > >          rcuop/2-33      [005] d..1  1840.547724: rcu_batch_start: rcu_preempt CBs=44 bl=10
> > >          rcuop/2-33      [005] d..1  1841.075759: rcu_batch_start: rcu_preempt CBs=516 bl=10
> > >          rcuop/0-15      [002] d..1  1841.695716: rcu_batch_start: rcu_preempt CBs=6312 bl=49
> > >          rcuop/0-15      [003] d..1  1841.709714: rcu_batch_start: rcu_preempt CBs=39 bl=10
> > >          rcuop/5-55      [004] d..1  1843.112442: rcu_batch_start: rcu_preempt CBs=16007 bl=125
> > >          rcuop/5-55      [004] d..1  1843.115444: rcu_batch_start: rcu_preempt CBs=7901 bl=61
> > >          rcuop/6-62      [001] dn.1  1843.123983: rcu_batch_start: rcu_preempt CBs=8427 bl=65
> > >          rcuop/6-62      [006] d..1  1843.412383: rcu_batch_start: rcu_preempt CBs=981 bl=10
> > >          rcuop/0-15      [003] d..1  1844.659812: rcu_batch_start: rcu_preempt CBs=1851 bl=14
> > >          rcuop/0-15      [003] d..1  1844.667790: rcu_batch_start: rcu_preempt CBs=135 bl=10
> > > <snip>
> > > 
> > > it is much better. But, as I wrote earlier, there is a patch that I submitted
> > > some time ago improving kvfree_rcu() batching:
> > > 
> > > <snip>
> > > commit 51824b780b719c53113dc39e027fbf670dc66028
> > > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > Date:   Thu Jun 30 18:33:35 2022 +0200
> > > 
> > >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > > 
> > >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> > >     which is roughly 50 milliseconds. The drawback of this approach is
> > >     low utilization of the 512 page slots in scenarios with infrequent
> > >     kvfree_rcu() calls.  For example on an Android system:
> > > <snip>
> > > 
> > > The trace that I posted was taken without it.
> > 
> > And if I am not getting too confused, that patch is now in mainline.
> > So it does make sense to rely on it, then.  ;-)
> 
> Vlad's patch to change the KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
> to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES?
> 
> One at 5 and the other at 10 seems odd, especially because the former seems to
> negate the effects of the latter, and anyone tweaking them in the future (say
> via new command-line options) should probably tweak them together to increase
> batching.
> 
> Testing shows significantly better batching with Vlad's updates; however, I am
> wondering why the rcu_callback fires in pairs like that from separate
> kworkers:
> 
>      kworker/6:1-157     [006] d..1   288.861610: rcu_callback: rcu_preempt  rhp=0000000079b895f9 func=rcu_work_rcufn 1214
>      kworker/4:2-158     [004]  d..1   288.861612: rcu_callback: rcu_preempt rhp=00000000d83fcc90 func=rcu_work_rcufn 798

I think this is just two kvfree_call_rcu() calls happening on two different CPUs,
with the draining then done by two different kworkers, so this appears normal.
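
For reference, a simplified sketch of why that is expected, modeled on
kernel/rcu/tree.c and kernel/workqueue.c (details elided and names from
memory, so treat this as approximate rather than verbatim):

	/* Each CPU has its own kvfree_rcu() batching state... */
	static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);

	/* ...and each CPU's monitor work drains its own batch by queueing
	 * its own rcu_work.  queue_rcu_work() internally does
	 * call_rcu(&rwork->rcu, rcu_work_rcufn), which is exactly the
	 * func= shown in the trace, once per draining CPU. */
	static void kfree_rcu_monitor(struct work_struct *work)
	{
		struct kfree_rcu_cpu *krcp = container_of(work,
				struct kfree_rcu_cpu, monitor_work.work);

		queue_rcu_work(system_wq, &krcp->krw_arr[0].rcu_work);
	}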

Here is also some more evidence from the user side of kvfree_call_rcu()
noisiness on ChromeOS. So we definitely need the batching to happen on
ChromeOS:

   kworker/u24:6-1448  [005]    77.290344: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
 ThreadPoolForeg-5130  [011]    77.301101: function:             kvfree_call_rcu <-- cgroup_migrate_finish
 irq/144-iwlwifi-1010  [004]    77.314367: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
 ThreadPoolSingl-2050  [004]    77.330359: function:             kvfree_call_rcu <-- cgroup_migrate_finish
 ThreadPoolSingl-2050  [004]    77.330362: function:             kvfree_call_rcu <-- cgroup_migrate_finish
 ThreadPoolForeg-5130  [011]    77.331513: function:             kvfree_call_rcu <-- cgroup_migrate_finish
     patchpaneld-2195  [009]    77.337726: function:             kvfree_call_rcu <-- neigh_flush_dev
     patchpaneld-2195  [009]    77.337737: function:             kvfree_call_rcu <-- __hw_addr_del_entry
     patchpaneld-2195  [009]    77.337744: function:             kvfree_call_rcu <-- addrconf_ifdown
     patchpaneld-2195  [009]    77.337744: function:             kvfree_call_rcu <-- __hw_addr_del_entry
 irq/144-iwlwifi-1010  [004]    77.633595: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
 irq/144-iwlwifi-1010  [004]    77.633609: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
 irq/144-iwlwifi-1010  [004]    77.769844: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
   kworker/u24:1-9     [008]    77.769858: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
 irq/144-iwlwifi-1010  [004]    77.880114: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
   kworker/u24:6-1448  [005]    77.880129: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
 irq/144-iwlwifi-1010  [004]    77.880131: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
   kworker/u24:6-1448  [005]    77.880133: function:             kvfree_call_rcu <-- cfg80211_update_known_bss
      virtio_gpu-5882  [010]    78.337302: function:             kvfree_call_rcu <-- context_close
      virtio_gpu-5882  [010]    78.337303: function:             kvfree_call_rcu <-- i915_driver_postclose
      virtio_gpu-5882  [010]    78.346058: function:             kvfree_call_rcu <-- fence_notify
      virtio_gpu-5882  [010]    78.346070: function:             kvfree_call_rcu <-- fence_notify
      virtio_gpu-5882  [010]    78.346079: function:             kvfree_call_rcu <-- fence_notify
      virtio_gpu-5882  [010]    78.346086: function:             kvfree_call_rcu <-- fence_notify
      virtio_gpu-5882  [010]    78.346184: function:             kvfree_call_rcu <-- fence_notify
      virtio_gpu-5882  [010]    78.346196: function:             kvfree_call_rcu <-- fence_notify
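
For context, the pattern behind all of those call sites is roughly the
following ("struct foo" and its release helper are of course made up):

	struct foo {
		int data;
		struct rcu_head rcu;
	};

	static void foo_release(struct foo *f)
	{
		/* Free f after a grace period.  Internally batched, which
		 * is where the timer and kworker wakeups above come from. */
		kvfree_rcu(f, rcu);
	}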

thanks,

 - Joel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-28 21:23                   ` Joel Fernandes
  2022-10-28 21:42                     ` Joel Fernandes
@ 2022-10-31 13:21                     ` Uladzislau Rezki
  2022-10-31 13:37                       ` Joel Fernandes
  2022-10-31 18:15                       ` Joel Fernandes
  1 sibling, 2 replies; 44+ messages in thread
From: Uladzislau Rezki @ 2022-10-31 13:21 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Paul E. McKenney, Uladzislau Rezki, rcu, linux-kernel,
	kernel-team, rostedt

On Fri, Oct 28, 2022 at 09:23:47PM +0000, Joel Fernandes wrote:
> On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > >
> > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > understand the differences in your respective platforms' definitions of
> > > > "good".  ;-)
> > > >
> > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > such a workload I can detect a power delta and power gain. Anyway, below is a
> > > new trace where I do not use the "flush" variant for kvfree_rcu():
> > > 
> > > <snip>
> > > 1. Home screen swipe:
[...]
> > > 2. App launches:
[...]
> > > <snip>
> > > 
> > > it is much better. But, as I wrote earlier, there is a patch that I submitted
> > > some time ago improving kvfree_rcu() batching:
> > > 
> > > <snip>
> > > commit 51824b780b719c53113dc39e027fbf670dc66028
> > > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > Date:   Thu Jun 30 18:33:35 2022 +0200
> > > 
> > >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > > 
> > >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> > >     which is roughly 50 milliseconds. The drawback of this approach is
> > >     low utilization of the 512 page slots in scenarios with infrequent
> > >     kvfree_rcu() calls.  For example on an Android system:
> > > <snip>
> > > 
> > > The trace that I posted was taken without it.
> > 
> > And if I am not getting too confused, that patch is now in mainline.
> > So it does make sense to rely on it, then.  ;-)
> 
> > Vlad's patch to change the KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
> > to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES?
> 
This is very good.

Below is a plot that I took during one use case: three apps used in
parallel. It was produced by running a "monkey" test:

wget ftp://vps418301.ovh.net/incoming/monkey_3_apps_slab_usage_5_minutes.png

I set up three apps as the usage scenario: Google Chrome, YouTube, and Camera.
I logged the Slab metric from /proc/meminfo; the sampling rate is 0.1 seconds.

Please have a look at the results. They reflect what I am saying: the
non-flush kvfree_rcu() variant makes memory usage higher, which is not
acceptable for our mobile devices and workloads.

>
> One at 5 and the other at 10 seems odd, especially because the former seems to
> negate the effects of the latter, and anyone tweaking them in the future (say
> via new command-line options) should probably tweak them together to increase
> batching.
> 
Well, convert 5 seconds to 10? What will that solve for you? We can do it,
and from a kvfree_rcu() perspective nothing really changes.

> Testing shows significantly better batching with Vlad's updates; however, I am
> wondering why the rcu_callback fires in pairs like that from separate
> kworkers:
> 
>      kworker/6:1-157     [006] d..1   288.861610: rcu_callback: rcu_preempt  rhp=0000000079b895f9 func=rcu_work_rcufn 1214
>      kworker/4:2-158     [004]  d..1   288.861612: rcu_callback: rcu_preempt rhp=00000000d83fcc90 func=rcu_work_rcufn 798
> 
> I wonder if the queued rcu_work is accidentally executing twice, or something.
>
Because kfree_rcu_cpu is a per-CPU thing.

> This kernel does have the additional trace patch below, fyi.
> 
> Another thought I have is that we could keep kvfree_rcu() mapped to
> call_rcu() via a config option, say CONFIG_RCU_LAZY_KFREE or something.
>
I am not sure you really need it. Whether you wake up "rcuop" (or whatever)
at a 0.5-second interval or at a 5-second interval, you will not notice
any difference in power between the two.

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-31 13:21                     ` Uladzislau Rezki
@ 2022-10-31 13:37                       ` Joel Fernandes
  2022-10-31 18:15                       ` Joel Fernandes
  1 sibling, 0 replies; 44+ messages in thread
From: Joel Fernandes @ 2022-10-31 13:37 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, rcu, linux-kernel, kernel-team, rostedt


> On Oct 31, 2022, at 9:21 AM, Uladzislau Rezki <urezki@gmail.com> wrote:
> 
> On Fri, Oct 28, 2022 at 09:23:47PM +0000, Joel Fernandes wrote:
>>> On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
>>> On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
>>>>> 
>>>>> You guys might need to agree on the definition of "good" here.  Or maybe
>>>>> understand the differences in your respective platforms' definitions of
>>>>> "good".  ;-)
>>>>> 
>>>> Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
>>>> such a workload I can detect a power delta and power gain. Anyway, below is a
>>>> new trace where I do not use the "flush" variant for kvfree_rcu():
>>>> 
>>>> <snip>
>>>> 1. Home screen swipe:
[...]
>>>> 2. App launches:
[...]

Please let us try to trim emails. That goes for me too.

[...]
>>>> <snip>
>>>> 
>>>> it is much better. But, as I wrote earlier, there is a patch that I submitted
>>>> some time ago improving kvfree_rcu() batching:
>>>> 
>>>> <snip>
>>>> commit 51824b780b719c53113dc39e027fbf670dc66028
>>>> Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
>>>> Date:   Thu Jun 30 18:33:35 2022 +0200
>>>> 
>>>>    rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
>>>> 
>>>>    Currently the monitor work is scheduled with a fixed interval of HZ/20,
>>>>    which is roughly 50 milliseconds. The drawback of this approach is
>>>>    low utilization of the 512 page slots in scenarios with infrequent
>>>>    kvfree_rcu() calls.  For example on an Android system:
>>>> <snip>
>>>> 
>>>> The trace that I posted was taken without it.
>>> 
>>> And if I am not getting too confused, that patch is now in mainline.
>>> So it does make sense to rely on it, then.  ;-)
>> 
>> Vlad's patch to change the KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
>> to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES?
>> 
> This is very good.
> 
> Below is a plot that I took during one use case: three apps used in
> parallel. It was produced by running a "monkey" test:
> 
> wget ftp://vps418301.ovh.net/incoming/monkey_3_apps_slab_usage_5_minutes.png
> 
> I set up three apps as the usage scenario: Google Chrome, YouTube, and Camera.
> I logged the Slab metric from /proc/meminfo; the sampling rate is 0.1 seconds.
> 
> Please have a look at the results. They reflect what I am saying: the
> non-flush kvfree_rcu() variant makes memory usage higher, which is not
> acceptable for our mobile devices and workloads.

Thanks, I'll take a closer look at the data (currently commuting), but here's a quick reply:

I am curious: with the 5-second timer, you are delaying RCU anyway. Are you saying that adding another 10 seconds on top (due to lazifying) seems to be causing issues? (That stacks to roughly 15 seconds worst case before an object is freed.) I find it hard to believe that you cannot give the shrinker enough work within 5 seconds such that it also triggers the issues you are seeing. However, the workload and data speak for themselves.

>> One at 5 and the other at 10 seems odd, especially because the former seems to
>> negate the effects of the latter, and anyone tweaking them in the future (say
>> via new command-line options) should probably tweak them together to increase
>> batching.
>> 
> Well, convert 5 seconds to 10? What will that solve for you? We can do it,
> and from a kvfree_rcu() perspective nothing really changes.

True. In fact, with my last patch I almost never even see the need to go to RCU. However, my point with unification is just to keep things simple for the user instead of having two knobs that do the same thing. Granted, these are compile-time knobs, but that might change in the future. We already have enough knobs in RCU, and as you guys know, I'm a fan of not letting the user mess things up too much.
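
Concretely, all I have in mind is something like the following, purely
illustrative sketch (the unified name is made up):

	/* Hypothetical: one batching interval shared by both mechanisms. */
	#define RCU_BATCH_JIFFIES	(10 * HZ)
	#define KFREE_DRAIN_JIFFIES	RCU_BATCH_JIFFIES /* kvfree_rcu() drain */
	#define LAZY_FLUSH_JIFFIES	RCU_BATCH_JIFFIES /* lazy call_rcu() flush */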

>> Testing shows significantly better batching with Vlad's updates; however, I am
>> wondering why the rcu_callback fires in pairs like that from separate
>> kworkers:
>> 
>>     kworker/6:1-157     [006] d..1   288.861610: rcu_callback: rcu_preempt  rhp=0000000079b895f9 func=rcu_work_rcufn 1214
>>     kworker/4:2-158     [004]  d..1   288.861612: rcu_callback: rcu_preempt rhp=00000000d83fcc90 func=rcu_work_rcufn 798
>> 
>> I wonder if the queued rcu_work is accidentally executing twice, or something.
>> 
> Because kfree_rcu_cpu is a per-CPU thing.

Right, got it.

>> This kernel does have the additional trace patch below, fyi.
>> 
>> Another thought I have is that we could keep kvfree_rcu() mapped to
>> call_rcu() via a config option, say CONFIG_RCU_LAZY_KFREE or something.
>> 
> I am not sure you really need it. Whether you wake up "rcuop" (or whatever)
> at a 0.5-second interval or at a 5-second interval, you will not notice
> any difference in power between the two.

Yes, you are right. This is not needed considering the improvements you recently made.

Cheers,

 - Joel 


> 
> --
> Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-31 13:21                     ` Uladzislau Rezki
  2022-10-31 13:37                       ` Joel Fernandes
@ 2022-10-31 18:15                       ` Joel Fernandes
  2022-11-01  4:49                         ` Uladzislau Rezki
  1 sibling, 1 reply; 44+ messages in thread
From: Joel Fernandes @ 2022-10-31 18:15 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, rcu, linux-kernel, kernel-team, rostedt

On Mon, Oct 31, 2022 at 9:21 AM Uladzislau Rezki <urezki@gmail.com> wrote:
>
> On Fri, Oct 28, 2022 at 09:23:47PM +0000, Joel Fernandes wrote:
> > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > >
> > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > understand the differences in your respective platforms' definitions of
> > > > > "good".  ;-)
> > > > >
> > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > such a workload I can detect a power delta and power gain. Anyway, below is a
> > > > new trace where I do not use the "flush" variant for kvfree_rcu():
> > > >
> > > > <snip>
> > > > 1. Home screen swipe:
[...]
> > > > 2. App launches:
[...]
> > > > <snip>
> > > >
> > > > it is much better. But, as I wrote earlier, there is a patch that I submitted
> > > > some time ago improving kvfree_rcu() batching:
> > > >
> > > > <snip>
> > > > commit 51824b780b719c53113dc39e027fbf670dc66028
> > > > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > > Date:   Thu Jun 30 18:33:35 2022 +0200
> > > >
> > > >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > > >
> > > >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> > > >     which is roughly 50 milliseconds. The drawback of this approach is
> > > >     low utilization of the 512 page slots in scenarios with infrequent
> > > >     kvfree_rcu() calls.  For example on an Android system:
> > > > <snip>
> > > >
> > > > The trace that I posted was taken without it.
> > >
> > > And if I am not getting too confused, that patch is now in mainline.
> > > So it does make sense to rely on it, then.  ;-)
> >
> > Vlad's patch to change the KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
> > to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES?
> >
> This is very good.
>
> Below is a plot that I took during one use case: three apps used in
> parallel. It was produced by running a "monkey" test:
> 
> wget ftp://vps418301.ovh.net/incoming/monkey_3_apps_slab_usage_5_minutes.png
> 
> I set up three apps as the usage scenario: Google Chrome, YouTube, and Camera.
> I logged the Slab metric from /proc/meminfo; the sampling rate is 0.1 seconds.
> 
> Please have a look at the results. They reflect what I am saying: the
> non-flush kvfree_rcu() variant makes memory usage higher, which is not
> acceptable for our mobile devices and workloads.

That does look higher, though honestly only by about 5%. But that's just
the effect of more "laziness". The graph itself does not show a higher
number of shrinker invocations; in fact, I think shrinker invocations
are not happening much, which is why the slab holds more memory. Perhaps
the system is not under memory pressure?
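
(For the record, the shrinker hook I mean is the kvfree_rcu() one,
which from memory looks roughly like this; simplified, so the exact
fields and bookkeeping are assumptions:)

	static unsigned long
	kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
	{
		unsigned long count = 0;
		int cpu;

		/* Report how many batched objects could be flushed now. */
		for_each_possible_cpu(cpu)
			count += READ_ONCE(per_cpu_ptr(&krc, cpu)->count);

		return count;
	}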

Anyway, I agree with your point of view and I think my concern does
not even occur with the latest patch on avoiding RCU that I posted
[1], so I come in peace.

[1] https://lore.kernel.org/rcu/20221029132856.3752018-1-joel@joelfernandes.org/

I am going to start merging all the lazy patches into ChromeOS 5.10 now,
including your kfree updates, except for [1] while we discuss it.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush()
  2022-10-31 18:15                       ` Joel Fernandes
@ 2022-11-01  4:49                         ` Uladzislau Rezki
  0 siblings, 0 replies; 44+ messages in thread
From: Uladzislau Rezki @ 2022-11-01  4:49 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Uladzislau Rezki, Paul E. McKenney, rcu, linux-kernel,
	kernel-team, rostedt

> On Mon, Oct 31, 2022 at 9:21 AM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Fri, Oct 28, 2022 at 09:23:47PM +0000, Joel Fernandes wrote:
> > > On Mon, Oct 24, 2022 at 09:48:19AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Oct 24, 2022 at 06:25:30PM +0200, Uladzislau Rezki wrote:
> > > > > >
> > > > > > You guys might need to agree on the definition of "good" here.  Or maybe
> > > > > > understand the differences in your respective platforms' definitions of
> > > > > > "good".  ;-)
> > > > > >
> > > > > Indeed. Bad is when it fires once per millisecond, indefinitely :) At least in
> > > > > such a workload I can detect a power delta and power gain. Anyway, below is a
> > > > > new trace where I do not use the "flush" variant for kvfree_rcu():
> > > > >
> > > > > <snip>
> > > > > 1. Home screen swipe:
> [...]
> > > > > 2. App launches:
> [...]
> > > > > <snip>
> > > > >
> > > > > it is much better. But, as I wrote earlier, there is a patch that I submitted
> > > > > some time ago improving kvfree_rcu() batching:
> > > > >
> > > > > <snip>
> > > > > commit 51824b780b719c53113dc39e027fbf670dc66028
> > > > > Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > > > Date:   Thu Jun 30 18:33:35 2022 +0200
> > > > >
> > > > >     rcu/kvfree: Update KFREE_DRAIN_JIFFIES interval
> > > > >
> > > > >     Currently the monitor work is scheduled with a fixed interval of HZ/20,
> > > > >     which is roughly 50 milliseconds. The drawback of this approach is
> > > > >     low utilization of the 512 page slots in scenarios with infrequent
> > > > >     kvfree_rcu() calls.  For example on an Android system:
> > > > > <snip>
> > > > >
> > > > > The trace that I posted was taken without it.
> > > >
> > > > And if I am not getting too confused, that patch is now in mainline.
> > > > So it does make sense to rely on it, then.  ;-)
> > >
> > > Vlad's patch to change the KFREE_DRAIN_JIFFIES to 5 seconds seems reasonable
> > > to me. However, can we unify KFREE_DRAIN_JIFFIES and LAZY_FLUSH_JIFFIES?
> > >
> > This is very good.
> >
> > Below is a plot that I took during one use case: three apps used in
> > parallel. It was produced by running a "monkey" test:
> >
> > wget ftp://vps418301.ovh.net/incoming/monkey_3_apps_slab_usage_5_minutes.png
> >
> > I set up three apps as the usage scenario: Google Chrome, YouTube, and Camera.
> > I logged the Slab metric from /proc/meminfo; the sampling rate is 0.1 seconds.
> >
> > Please have a look at the results. They reflect what I am saying: the
> > non-flush kvfree_rcu() variant makes memory usage higher, which is not
> > acceptable for our mobile devices and workloads.
> 
> That does look higher, though honestly only by about 5%. But that's just
> the effect of more "laziness". The graph itself does not show a higher
> number of shrinker invocations; in fact, I think shrinker invocations
> are not happening much, which is why the slab holds more memory. Perhaps
> the system is not under memory pressure?
> 
The idea is to minimize the possibility of entering a low-memory
condition, which is bad from a sluggishness point of view for users.
I am saying this in the context of Android devices.

> Anyway, I agree with your point of view and I think my concern does
> not even occur with the latest patch on avoiding RCU that I posted
> [1], so I come in peace.
> 
> [1] https://lore.kernel.org/rcu/20221029132856.3752018-1-joel@joelfernandes.org/
> 
I will have a look at it.

>
> I am going to start merging all the lazy patches into ChromeOS 5.10 now,
> including your kfree updates, except for [1] while we discuss it.
>
Good for ChromeOS users :)

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2022-11-01  4:49 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-19 22:51 [PATCH rcu 0/14] Lazy call_rcu() updates for v6.2 Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 01/14] rcu: Simplify rcu_init_nohz() cpumask handling Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 02/14] rcu: Fix late wakeup when flush of bypass cblist happens Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 03/14] rcu: Fix missing nocb gp wake on rcu_barrier() Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 04/14] rcu: Make call_rcu() lazy to save power Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 05/14] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass() Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 06/14] rcu: Shrinker for lazy rcu Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 07/14] rcuscale: Add laziness and kfree tests Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 08/14] percpu-refcount: Use call_rcu_flush() for atomic switch Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 09/14] rcu/sync: Use call_rcu_flush() instead of call_rcu Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 10/14] rcu/rcuscale: Use call_rcu_flush() for async reader test Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 11/14] rcu/rcutorture: Use call_rcu_flush() where needed Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 12/14] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu() Paul E. McKenney
2022-10-19 22:51 ` [PATCH rcu 13/14] workqueue: Make queue_rcu_work() use call_rcu_flush() Paul E. McKenney
2022-10-24  0:36   ` Joel Fernandes
2022-10-24  3:15     ` Paul E. McKenney
2022-10-24 10:49       ` Uladzislau Rezki
2022-10-24 12:23         ` Uladzislau Rezki
2022-10-24 14:31           ` Joel Fernandes
2022-10-24 15:39             ` Paul E. McKenney
2022-10-24 16:25               ` Uladzislau Rezki
2022-10-24 16:48                 ` Paul E. McKenney
2022-10-24 16:55                   ` Uladzislau Rezki
2022-10-24 17:08                     ` Uladzislau Rezki
2022-10-24 17:20                       ` Joel Fernandes
2022-10-24 17:35                         ` Paul E. McKenney
2022-10-24 20:12                           ` Joel Fernandes
2022-10-24 20:16                             ` Joel Fernandes
2022-10-25 10:48                               ` Uladzislau Rezki
2022-10-25 15:05                                 ` Joel Fernandes
2022-10-26 20:35                                   ` Uladzislau Rezki
2022-10-24 20:19                             ` Paul E. McKenney
2022-10-24 20:26                               ` Joel Fernandes
2022-10-24 17:40                         ` Uladzislau Rezki
2022-10-24 20:08                           ` Joel Fernandes
2022-10-25 10:47                             ` Uladzislau Rezki
2022-10-28 21:23                   ` Joel Fernandes
2022-10-28 21:42                     ` Joel Fernandes
2022-10-31 13:21                     ` Uladzislau Rezki
2022-10-31 13:37                       ` Joel Fernandes
2022-10-31 18:15                       ` Joel Fernandes
2022-11-01  4:49                         ` Uladzislau Rezki
2022-10-24 16:54                 ` Joel Fernandes
2022-10-19 22:51 ` [PATCH rcu 14/14] rxrpc: Use call_rcu_flush() instead of call_rcu() Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).