* [PATCH 00/48] RCU changes for v6.10
@ 2024-05-07  9:34 Uladzislau Rezki (Sony)
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

This series is for the v6.10 kernel:

fixes: Fix a lockdep complaint for lazy-preemptible kernels,
remove a redundant BH disable for TINY_RCU, remove a redundant READ_ONCE()
in tree.c, fix false-positive KCSAN splats, and fix a buffer overflow in
print_cpu_stall_info().

misc: Miscellaneous updates related to bpf and tracing, plus an
update to the MAINTAINERS file.

rcu-sync-normal-improve: An improvement to the latency of
normal synchronize_rcu() calls. It maintains a separate track for
synchronous users only. This approach bypasses the per-CPU nocb lists,
so synchronous users do not depend on nocb-list length or on how
quickly regular callbacks are processed.
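
In rough outline (structure names are from the "rcu: Add data
structures for synchronize_rcu()" patch below; error handling and the
fast path are omitted, so treat this as a sketch rather than the exact
code), a synchronous user queues itself on the dedicated track and
sleeps on its own completion:

	/* Sketch only; see "rcu: Reduce synchronize_rcu() latency". */
	struct rcu_synchronize rs;

	init_rcu_head_on_stack(&rs.head);
	init_completion(&rs.completion);

	/* Queue on the dedicated sync track, not on a nocb callback list. */
	llist_add((struct llist_node *) &rs.head, &rcu_state.srs_next);

	/* The GP kthread, or a direct wake-up, completes us once a
	 * full grace period has elapsed. */
	wait_for_completion(&rs.completion);
	destroy_rcu_head_on_stack(&rs.head);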

rcu-tasks: Switch Tasks RCU grace-period waits to sleep in
TASK_IDLE state, fix some comments, add a diagnostic warning to
exit_tasks_rcu_start(), and fix a buffer overflow in
show_rcu_tasks_trace_gp_kthread().

rcutorture: Increase memory provided to the guest OS, fix
Tasks Rude RCU testing, some updates for TREE09, dump more information
to debug GP kthread state, remove a redundant READ_ONCE(), fix some
comments about RCU_TORTURE_PIPE_LEN and pipe_count, remove some
redundant pointer initialization, make stalled tasks exit directly
when the rcutorture tests end, fix an invalid-context warning when
enabling srcu barrier testing, scale the --do-kvfree test time, and
use rcu_gp_slow_register/unregister() only for the rcutype test.

Johannes Berg (1):
  rcu: Mollify sparse with RCU guard

Neeraj Upadhyay (1):
  MAINTAINERS: Update Neeraj's email address

Nikita Kiryushin (2):
  rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow
  rcu: Fix buffer overflow in print_cpu_stall_info()

Paul E. McKenney (29):
  scftorture: Increase memory provided to guest OS
  rcutorture: Disable tracing to permit Tasks Rude RCU testing
  rcu: Add lockdep checks and kernel-doc header to rcu_softirq_qs()
  rcutorture: Enable RCU priority boosting for TREE09
  rcutorture: Dump # online CPUs on insufficient cb-flood laundering
  rcutorture: Dump GP kthread state on insufficient cb-flood laundering
  rcutorture: ASSERT_EXCLUSIVE_WRITER() for ->rtort_pipe_count updates
  rcu-tasks: Make Tasks RCU wait idly for grace-period delays
  bpf: Select new NEED_TASKS_RCU Kconfig option
  arch: Select new NEED_TASKS_RCU Kconfig option
  tracing: Select new NEED_TASKS_RCU Kconfig option
  bpf: Choose RCU Tasks based on TASKS_RCU rather than PREEMPTION
  ftrace: Choose RCU Tasks based on TASKS_RCU rather than PREEMPTION
  rcu: Make TINY_RCU depend on !PREEMPT_RCU rather than !PREEMPTION
  srcu: Make Tiny SRCU explicitly disable preemption
  rcu: Create NEED_TASKS_RCU to factor out enablement logic
  rcu: Remove redundant BH disabling in TINY_RCU
  rcu: Make Tiny RCU explicitly disable preemption
  rcu: Remove redundant READ_ONCE() of rcu_state.gp_flags in tree.c
  rcu: Bring diagnostic read of rcu_state.gp_flags into alignment
  rcu: Mark writes to rcu_sync ->gp_count field
  rcu: Mark loads from rcu_state.n_online_cpus
  rcu: Make hotplug operations track GP state, not flags
  rcu: Inform KCSAN of one-byte cmpxchg() in rcu_trc_cmpxchg_need_qs()
  rcu: Remove redundant CONFIG_PROVE_RCU #if condition
  rcu-tasks: Replace exit_tasks_rcu_start() initialization with
    WARN_ON_ONCE()
  rcutorture: Remove extraneous rcu_torture_pipe_update_one()
    READ_ONCE()
  rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment
  torture: Scale --do-kvfree test time

Uladzislau Rezki (Sony) (6):
  rcu: Add data structures for synchronize_rcu()
  rcu: Update lockdep while in RCU read-side critical section
  rcu: Reduce synchronize_rcu() latency
  rcu: Add a trace event for synchronize_rcu_normal()
  rcu: Support direct wake-up of synchronize_rcu() users
  rcu: Allocate WQ with WQ_MEM_RECLAIM bit set

Zenghui Yu (1):
  doc: Remove references to arrayRCU.rst

Zqiang (7):
  rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer
  rcutorture: Use the gp_kthread_dbg operation specified by cur_ops
  rcutorture: Make rcutorture support print rcu-tasks gp state
  rcutorture: Removing redundant function pointer initialization
  rcutorture: Make stall-tasks directly exit when rcutorture tests end
  rcutorture: Fix invalid context warning when enable srcu barrier
    testing
  rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype
    test

linke li (1):
  rcutorture: Re-use value stored to ->rtort_pipe_count instead of
    re-reading

 .mailmap                                              |    3 -
 Documentation/RCU/whatisRCU.rst                       |    6 +-
 Documentation/admin-guide/kernel-parameters.txt       |   14 ++++
 MAINTAINERS                                           |    2
 arch/Kconfig                                          |    4 -
 include/linux/rcupdate.h                              |    2
 include/linux/rcupdate_wait.h                         |   18 +++---
 include/linux/srcutiny.h                              |    2
 include/trace/events/rcu.h                            |   27 +++++++++
 kernel/bpf/Kconfig                                    |    2
 kernel/bpf/trampoline.c                               |    2
 kernel/rcu/Kconfig                                    |    2
 kernel/rcu/rcu.h                                      |   20 +++---
 kernel/rcu/rcutorture.c                               |    4 -
 kernel/rcu/srcutiny.c                                 |   31 ++++++++--
 kernel/rcu/srcutree.c                                 |    5 -
 kernel/rcu/sync.c                                     |    8 ++
 kernel/rcu/tasks.h                                    |    6 +-
 kernel/rcu/tiny.c                                     |    2
 kernel/rcu/tree.c                                     |   28 +++++++++
 kernel/rcu/tree.h                                     |   14 ++++
 kernel/rcu/tree_exp.h                                 |    2
 kernel/rcu/tree_plugin.h                              |    4 -
 kernel/rcu/tree_stall.h                               |    2
 kernel/rcu/update.c                                   |    4 -
 kernel/trace/Kconfig                                  |    4 -
 kernel/trace/ftrace.c                                 |    3 -
 tools/testing/selftests/rcutorture/bin/torture.sh     |    2
 tools/testing/selftests/rcutorture/configs/rcu/TREE09 |    5 +
 include/linux/rcupdate.h                                |   20 +++++-
 kernel/rcu/Kconfig                                      |    6 +-
 kernel/rcu/rcutorture.c                                 |   81 +++++++++++++++------------
 kernel/rcu/tasks.h                                      |   38 +++++++++++-
 kernel/rcu/tiny.c                                       |    2
 kernel/rcu/tree.c                                       |  408 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 kernel/rcu/tree.h                                       |   10 ++-
 kernel/rcu/tree_stall.h                                 |    9 ++-
 tools/testing/selftests/rcutorture/bin/torture.sh       |    4 -
 38 files changed, 666 insertions(+), 140 deletions(-)

-- 
2.39.2


* [PATCH 01/48] scftorture: Increase memory provided to guest OS
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

The tradition, extending back almost a full year, has been 2GB plus an
additional number of GBs equal to the number of CPUs divided by sixteen.
This tradition has served scftorture well, even the CONFIG_PREEMPT=y
version running KASAN within guest OSes having 40 CPUs.  However, this
test recently started OOMing on larger systems, and this commit therefore
gives this test an additional GB of memory.

It is quite possible that further testing on larger systems will show
a need to decrease the divisor from 16 to (say) 8, but that is a change
to make once it has been demonstrated to be required.
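
For example, on a 64-CPU host where torture.sh gives kvm.sh half of
the CPUs, HALF_ALLOTED_CPUS is 32, so the allocation grows from
2+32/16 = 4GB to 3+32/16 = 5GB.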

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 tools/testing/selftests/rcutorture/bin/torture.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rcutorture/bin/torture.sh b/tools/testing/selftests/rcutorture/bin/torture.sh
index bbac5f4b03d0..42f0aee09e51 100755
--- a/tools/testing/selftests/rcutorture/bin/torture.sh
+++ b/tools/testing/selftests/rcutorture/bin/torture.sh
@@ -425,7 +425,7 @@ fi
 if test "$do_scftorture" = "yes"
 then
 	# Scale memory based on the number of CPUs.
-	scfmem=$((2+HALF_ALLOTED_CPUS/16))
+	scfmem=$((3+HALF_ALLOTED_CPUS/16))
 	torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
 	torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
 fi
-- 
2.39.2


* [PATCH 02/48] rcutorture: Disable tracing to permit Tasks Rude RCU testing
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

Now that the KPROBES, TRACING, BLK_DEV_IO_TRACE, and UPROBE_EVENTS
Kconfig options select the TASKS_TRACE_RCU option, the torture.sh tests
of enabling exactly one of the RCU Tasks flavors fail.  This commit
therefore disables these options to allow this testing to succeed.
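
(The failure mode, for the record: the rcutasksflavors loop below
force-enables exactly one flavor and deselects the other two, but a
leftover CONFIG_TRACING=y or similar quietly selects TASKS_TRACE_RCU
behind the loop's back, leaving the kernel with two flavors enabled
and defeating the purpose of the test.)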

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 tools/testing/selftests/rcutorture/bin/torture.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rcutorture/bin/torture.sh b/tools/testing/selftests/rcutorture/bin/torture.sh
index 42f0aee09e51..13875ee7b050 100755
--- a/tools/testing/selftests/rcutorture/bin/torture.sh
+++ b/tools/testing/selftests/rcutorture/bin/torture.sh
@@ -391,7 +391,7 @@ __EOF__
 		forceflavor="`echo $flavor | sed -e 's/^CONFIG/CONFIG_FORCE/'`"
 		deselectedflavors="`grep -v $flavor $T/rcutasksflavors | tr '\012' ' ' | tr -s ' ' | sed -e 's/ *$//'`"
 		echo " --- Running RCU Tasks Trace flavor $flavor `date`" >> $rtfdir/log
-		tools/testing/selftests/rcutorture/bin/kvm.sh --datestamp "$ds/results-rcutasksflavors/$flavor" --buildonly --configs "TINY01 TREE04" --kconfig "CONFIG_RCU_EXPERT=y CONFIG_RCU_SCALE_TEST=y $forceflavor=y $deselectedflavors" --trust-make > $T/$flavor.out 2>&1
+		tools/testing/selftests/rcutorture/bin/kvm.sh --datestamp "$ds/results-rcutasksflavors/$flavor" --buildonly --configs "TINY01 TREE04" --kconfig "CONFIG_RCU_EXPERT=y CONFIG_RCU_SCALE_TEST=y CONFIG_KPROBES=n CONFIG_RCU_TRACE=n CONFIG_TRACING=n CONFIG_BLK_DEV_IO_TRACE=n CONFIG_UPROBE_EVENTS=n $forceflavor=y $deselectedflavors" --trust-make > $T/$flavor.out 2>&1
 		retcode=$?
 		if test "$retcode" -ne 0
 		then
-- 
2.39.2


* [PATCH 03/48] rcu: Add lockdep checks and kernel-doc header to rcu_softirq_qs()
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Eric Dumazet, Jakub Kicinski, David S. Miller, Yan Zhai, netdev

From: "Paul E. McKenney" <paulmck@kernel.org>

There are some indications that rcu_softirq_qs() might be used more
widely than anticipated.  This commit therefore adds some lockdep
assertions and some cautionary tales in a new kernel-doc header.

Link: https://lore.kernel.org/all/Zd4DXTyCf17lcTfq@debian.debian/

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Yan Zhai <yan@cloudflare.com>
Cc: <netdev@vger.kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d9642dd06c25..2795a1457acf 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -240,8 +240,36 @@ static long rcu_get_n_cbs_cpu(int cpu)
 	return 0;
 }
 
+/**
+ * rcu_softirq_qs - Provide a set of RCU quiescent states in softirq processing
+ *
+ * Mark a quiescent state for RCU, Tasks RCU, and Tasks Trace RCU.
+ * This is a special-purpose function to be used in the softirq
+ * infrastructure and perhaps the occasional long-running softirq
+ * handler.
+ *
+ * Note that from RCU's viewpoint, a call to rcu_softirq_qs() is
+ * equivalent to momentarily completely enabling preemption.  For
+ * example, given this code::
+ *
+ *	local_bh_disable();
+ *	do_something();
+ *	rcu_softirq_qs();  // A
+ *	do_something_else();
+ *	local_bh_enable();  // B
+ *
+ * A call to synchronize_rcu() that began concurrently with the
+ * call to do_something() would be guaranteed to wait only until
+ * execution reached statement A.  Without that rcu_softirq_qs(),
+ * that same synchronize_rcu() would instead be guaranteed to wait
+ * until execution reached statement B.
+ */
 void rcu_softirq_qs(void)
 {
+	RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) ||
+			 lock_is_held(&rcu_lock_map) ||
+			 lock_is_held(&rcu_sched_lock_map),
+			 "Illegal rcu_softirq_qs() in RCU read-side critical section");
 	rcu_qs();
 	rcu_preempt_deferred_qs(current);
 	rcu_tasks_qs(current, false);
-- 
2.39.2


* [PATCH 04/48] rcutorture: Enable RCU priority boosting for TREE09
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

The TREE09 rcutorture scenario exhausts memory from time to time, and
this is due to a reader being preempted and blocking grace periods,
thus preventing recycling of the memory used in callback-flooding tests.
This commit therefore enables RCU priority boosting and sets the boosting
delay to 100 milliseconds after grace-period start.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 tools/testing/selftests/rcutorture/configs/rcu/TREE09 | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE09 b/tools/testing/selftests/rcutorture/configs/rcu/TREE09
index fc45645bb5f4..9ecd1b4e653d 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE09
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE09
@@ -10,8 +10,9 @@ CONFIG_NO_HZ_FULL=n
 CONFIG_RCU_TRACE=n
 CONFIG_RCU_NOCB_CPU=n
 CONFIG_DEBUG_LOCK_ALLOC=n
-CONFIG_RCU_BOOST=n
+CONFIG_RCU_BOOST=y
+CONFIG_RCU_BOOST_DELAY=100
 CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
-#CHECK#CONFIG_RCU_EXPERT=n
+CONFIG_RCU_EXPERT=y
 CONFIG_KPROBES=n
 CONFIG_FTRACE=n
-- 
2.39.2


* [PATCH 05/48] rcutorture: Dump # online CPUs on insufficient cb-flood laundering
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit adds the number of online CPUs to the state dump following
an unsuccessful callback-flood test.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 45d6b4c3d199..6611ef3e71c3 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2833,12 +2833,12 @@ static void rcu_torture_fwd_prog_cr(struct rcu_fwd *rfp)
 	if (!torture_must_stop() && !READ_ONCE(rcu_fwd_emergency_stop) &&
 	    !shutdown_time_arrived()) {
 		WARN_ON(n_max_gps < MIN_FWD_CBS_LAUNDERED);
-		pr_alert("%s Duration %lu barrier: %lu pending %ld n_launders: %ld n_launders_sa: %ld n_max_gps: %ld n_max_cbs: %ld cver %ld gps %ld\n",
+		pr_alert("%s Duration %lu barrier: %lu pending %ld n_launders: %ld n_launders_sa: %ld n_max_gps: %ld n_max_cbs: %ld cver %ld gps %ld #online %u\n",
 			 __func__,
 			 stoppedat - rfp->rcu_fwd_startat, jiffies - stoppedat,
 			 n_launders + n_max_cbs - n_launders_cb_snap,
 			 n_launders, n_launders_sa,
-			 n_max_gps, n_max_cbs, cver, gps);
+			 n_max_gps, n_max_cbs, cver, gps, num_online_cpus());
 		atomic_long_add(n_max_cbs, &rcu_fwd_max_cbs);
 		mutex_lock(&rcu_fwd_mutex); // Serialize histograms.
 		rcu_torture_fwd_cb_hist(rfp);
-- 
2.39.2


* [PATCH 06/48] rcutorture: Dump GP kthread state on insufficient cb-flood laundering
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

If a callback flood prevents grace periods from completing, rcutorture
does a WARN_ON().  Avoiding this WARN_ON() currently requires that at
least three grace periods elapse during an eight-second callback-flood
interval.  Unfortunately, the current debug information does not include
anything about the grace-period state.  This commit therefore adds a
call to cur_ops->gp_kthread_dbg(), if this function pointer is non-NULL.
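
(Not every rcutorture flavor provides this hook; the Tiny RCU
variants, for example, have no grace-period kthread to dump, hence
the NULL check.)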

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 6611ef3e71c3..eff51a26216f 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2832,7 +2832,8 @@ static void rcu_torture_fwd_prog_cr(struct rcu_fwd *rfp)
 
 	if (!torture_must_stop() && !READ_ONCE(rcu_fwd_emergency_stop) &&
 	    !shutdown_time_arrived()) {
-		WARN_ON(n_max_gps < MIN_FWD_CBS_LAUNDERED);
+		if (WARN_ON(n_max_gps < MIN_FWD_CBS_LAUNDERED) && cur_ops->gp_kthread_dbg)
+			cur_ops->gp_kthread_dbg();
 		pr_alert("%s Duration %lu barrier: %lu pending %ld n_launders: %ld n_launders_sa: %ld n_max_gps: %ld n_max_cbs: %ld cver %ld gps %ld #online %u\n",
 			 __func__,
 			 stoppedat - rfp->rcu_fwd_startat, jiffies - stoppedat,
-- 
2.39.2


* [PATCH 07/48] rcutorture: ASSERT_EXCLUSIVE_WRITER() for ->rtort_pipe_count updates
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Linus Torvalds

From: "Paul E. McKenney" <paulmck@kernel.org>

It turns out that only one CPU at a time will ever invoke
rcu_torture_pipe_update_one() on a given rcu_torture structure.
This commit therefore adds three ASSERT_EXCLUSIVE_WRITER() calls
to enlist KCSAN's aid in checking this.
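
(For readers unfamiliar with this KCSAN facility:
ASSERT_EXCLUSIVE_WRITER(var) declares that no other thread may
concurrently write to var, so KCSAN reports any such concurrent write
it observes while watching the access, even though the accesses
themselves are marked.)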

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index eff51a26216f..d8c12eba35b7 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -466,6 +466,7 @@ rcu_torture_pipe_update_one(struct rcu_torture *rp)
 		i = RCU_TORTURE_PIPE_LEN;
 	atomic_inc(&rcu_torture_wcount[i]);
 	WRITE_ONCE(rp->rtort_pipe_count, i + 1);
+	ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count);
 	if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
 		rp->rtort_mbtest = 0;
 		return true;
@@ -1399,6 +1400,7 @@ rcu_torture_writer(void *arg)
 		if (rp == NULL)
 			continue;
 		rp->rtort_pipe_count = 0;
+		ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count);
 		rcu_torture_writer_state = RTWS_DELAY;
 		udelay(torture_random(&rand) & 0x3ff);
 		rcu_torture_writer_state = RTWS_REPLACE;
@@ -1414,6 +1416,7 @@ rcu_torture_writer(void *arg)
 			atomic_inc(&rcu_torture_wcount[i]);
 			WRITE_ONCE(old_rp->rtort_pipe_count,
 				   old_rp->rtort_pipe_count + 1);
+			ASSERT_EXCLUSIVE_WRITER(old_rp->rtort_pipe_count);
 
 			// Make sure readers block polled grace periods.
 			if (cur_ops->get_gp_state && cur_ops->poll_gp_state) {
-- 
2.39.2


* [PATCH 08/48] rcu-tasks: Make Tasks RCU wait idly for grace-period delays
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Breno Leitao

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, all waits for grace periods sleep in TASK_UNINTERRUPTIBLE state,
regardless of RCU flavor.  This has worked well, but there have been
cases where a longer-than-average Tasks RCU grace period has triggered
softlockup splats, many of them, before the Tasks RCU CPU stall warning
appears.  These softlockup splats unnecessarily consume console bandwidth
and complicate diagnosis of the underlying problem.  Plus a long but not
pathologically long Tasks RCU grace period might trigger a few softlockup
splats before completing normally, which generates noise for no good
reason.

This commit therefore causes Tasks RCU grace-period waits to sleep in
TASK_IDLE state.  If there really is a persistent problem, the eventual Tasks
RCU CPU stall warning will flag it, and without the extra noise.
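
For reference, TASK_IDLE is defined in include/linux/sched.h as an
uninterruptible sleep that is exempt from load-average accounting and
is skipped by the hung-task detector:

	#define TASK_IDLE	(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)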

Reported-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/rcupdate_wait.h | 18 +++++++++---------
 kernel/rcu/tasks.h            |  6 +++++-
 kernel/rcu/update.c           |  4 ++--
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/include/linux/rcupdate_wait.h b/include/linux/rcupdate_wait.h
index d07f0848802e..303ab9bee155 100644
--- a/include/linux/rcupdate_wait.h
+++ b/include/linux/rcupdate_wait.h
@@ -19,18 +19,18 @@ struct rcu_synchronize {
 };
 void wakeme_after_rcu(struct rcu_head *head);
 
-void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
+void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array,
 		   struct rcu_synchronize *rs_array);
 
-#define _wait_rcu_gp(checktiny, ...) \
-do {									\
-	call_rcu_func_t __crcu_array[] = { __VA_ARGS__ };		\
-	struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)];	\
-	__wait_rcu_gp(checktiny, ARRAY_SIZE(__crcu_array),		\
-			__crcu_array, __rs_array);			\
+#define _wait_rcu_gp(checktiny, state, ...) \
+do {												\
+	call_rcu_func_t __crcu_array[] = { __VA_ARGS__ };					\
+	struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)];				\
+	__wait_rcu_gp(checktiny, state, ARRAY_SIZE(__crcu_array), __crcu_array, __rs_array);	\
 } while (0)
 
-#define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__)
+#define wait_rcu_gp(...) _wait_rcu_gp(false, TASK_UNINTERRUPTIBLE, __VA_ARGS__)
+#define wait_rcu_gp_state(state, ...) _wait_rcu_gp(false, state, __VA_ARGS__)
 
 /**
  * synchronize_rcu_mult - Wait concurrently for multiple grace periods
@@ -54,7 +54,7 @@ do {									\
  * grace period.
  */
 #define synchronize_rcu_mult(...) \
-	_wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)
+	_wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), TASK_UNINTERRUPTIBLE, __VA_ARGS__)
 
 static inline void cond_resched_rcu(void)
 {
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 147b5945d67a..82e458ea0728 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -74,6 +74,7 @@ struct rcu_tasks_percpu {
  * @holdouts_func: This flavor's holdout-list scan function (optional).
  * @postgp_func: This flavor's post-grace-period function (optional).
  * @call_func: This flavor's call_rcu()-equivalent function.
+ * @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE).
  * @rtpcpu: This flavor's rcu_tasks_percpu structure.
  * @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks.
  * @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing.
@@ -107,6 +108,7 @@ struct rcu_tasks {
 	holdouts_func_t holdouts_func;
 	postgp_func_t postgp_func;
 	call_rcu_func_t call_func;
+	unsigned int wait_state;
 	struct rcu_tasks_percpu __percpu *rtpcpu;
 	int percpu_enqueue_shift;
 	int percpu_enqueue_lim;
@@ -134,6 +136,7 @@ static struct rcu_tasks rt_name =							\
 	.tasks_gp_mutex = __MUTEX_INITIALIZER(rt_name.tasks_gp_mutex),			\
 	.gp_func = gp,									\
 	.call_func = call,								\
+	.wait_state = TASK_UNINTERRUPTIBLE,						\
 	.rtpcpu = &rt_name ## __percpu,							\
 	.lazy_jiffies = DIV_ROUND_UP(HZ, 4),						\
 	.name = n,									\
@@ -638,7 +641,7 @@ static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
 
 	// If the grace-period kthread is running, use it.
 	if (READ_ONCE(rtp->kthread_ptr)) {
-		wait_rcu_gp(rtp->call_func);
+		wait_rcu_gp_state(rtp->wait_state, rtp->call_func);
 		return;
 	}
 	rcu_tasks_one_gp(rtp, true);
@@ -1160,6 +1163,7 @@ static int __init rcu_spawn_tasks_kthread(void)
 	rcu_tasks.postscan_func = rcu_tasks_postscan;
 	rcu_tasks.holdouts_func = check_all_holdout_tasks;
 	rcu_tasks.postgp_func = rcu_tasks_postgp;
+	rcu_tasks.wait_state = TASK_IDLE;
 	rcu_spawn_tasks_kthread_generic(&rcu_tasks);
 	return 0;
 }
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 46aaaa9fe339..f8436969e0c8 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -408,7 +408,7 @@ void wakeme_after_rcu(struct rcu_head *head)
 }
 EXPORT_SYMBOL_GPL(wakeme_after_rcu);
 
-void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
+void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array,
 		   struct rcu_synchronize *rs_array)
 {
 	int i;
@@ -440,7 +440,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
 			if (crcu_array[j] == crcu_array[i])
 				break;
 		if (j == i) {
-			wait_for_completion(&rs_array[i].completion);
+			wait_for_completion_state(&rs_array[i].completion, state);
 			destroy_rcu_head_on_stack(&rs_array[i].head);
 		}
 	}
-- 
2.39.2


* [PATCH 09/48] doc: Remove references to arrayRCU.rst
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zenghui Yu

From: Zenghui Yu <zenghui.yu@linux.dev>

arrayRCU.rst was removed by commit ef2555cf68c3 ("doc: Remove
arrayRCU.rst") but is still referenced by whatisRCU.rst. Update the
references to reflect the current state of the documentation.

Signed-off-by: Zenghui Yu <zenghui.yu@linux.dev>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 Documentation/RCU/whatisRCU.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index 872ac665223f..94838c65c7d9 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -427,7 +427,7 @@ their assorted primitives.
 
 This section shows a simple use of the core RCU API to protect a
 global pointer to a dynamically allocated structure.  More-typical
-uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst.
+uses of RCU may be found in listRCU.rst and NMI-RCU.rst.
 ::
 
 	struct foo {
@@ -510,8 +510,8 @@ So, to sum up:
 	data item.
 
 See checklist.rst for additional rules to follow when using RCU.
-And again, more-typical uses of RCU may be found in listRCU.rst,
-arrayRCU.rst, and NMI-RCU.rst.
+And again, more-typical uses of RCU may be found in listRCU.rst
+and NMI-RCU.rst.
 
 .. _4_whatisRCU:
 
-- 
2.39.2


* [PATCH 10/48] MAINTAINERS: Update Neeraj's email address
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Neeraj Upadhyay

From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>

Update my email-address in MAINTAINERS and .mailmap entries to my
kernel.org account.

Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Reviewed-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 .mailmap    | 3 ++-
 MAINTAINERS | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/.mailmap b/.mailmap
index 59c9a841bf71..32e12c26bdda 100644
--- a/.mailmap
+++ b/.mailmap
@@ -445,7 +445,8 @@ Nadav Amit <nadav.amit@gmail.com> <namit@cs.technion.ac.il>
 Nadia Yvette Chambers <nyc@holomorphy.com> William Lee Irwin III <wli@holomorphy.com>
 Naoya Horiguchi <naoya.horiguchi@nec.com> <n-horiguchi@ah.jp.nec.com>
 Nathan Chancellor <nathan@kernel.org> <natechancellor@gmail.com>
-Neeraj Upadhyay <quic_neeraju@quicinc.com> <neeraju@codeaurora.org>
+Neeraj Upadhyay <neeraj.upadhyay@kernel.org> <quic_neeraju@quicinc.com>
+Neeraj Upadhyay <neeraj.upadhyay@kernel.org> <neeraju@codeaurora.org>
 Neil Armstrong <neil.armstrong@linaro.org> <narmstrong@baylibre.com>
 Nguyen Anh Quynh <aquynh@gmail.com>
 Nicholas Piggin <npiggin@gmail.com> <npiggen@suse.de>
diff --git a/MAINTAINERS b/MAINTAINERS
index 7c121493f43d..0370e571f312 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18591,7 +18591,7 @@ F:	tools/testing/selftests/resctrl/
 READ-COPY UPDATE (RCU)
 M:	"Paul E. McKenney" <paulmck@kernel.org>
 M:	Frederic Weisbecker <frederic@kernel.org> (kernel/rcu/tree_nocb.h)
-M:	Neeraj Upadhyay <quic_neeraju@quicinc.com> (kernel/rcu/tasks.h)
+M:	Neeraj Upadhyay <neeraj.upadhyay@kernel.org> (kernel/rcu/tasks.h)
 M:	Joel Fernandes <joel@joelfernandes.org>
 M:	Josh Triplett <josh@joshtriplett.org>
 M:	Boqun Feng <boqun.feng@gmail.com>
-- 
2.39.2


* [PATCH 11/48] bpf: Select new NEED_TASKS_RCU Kconfig option
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Ankur Arora, Thomas Gleixner, Steven Rostedt, Mark Rutland

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, if a Kconfig option depends on TASKS_RCU, it conditionally does
"select TASKS_RCU if PREEMPTION".  This works, but requires any change in
this enablement logic to be replicated across all such "select" clauses.
A new NEED_TASKS_RCU Kconfig option has been created to allow this
enablement logic to be in one place in kernel/rcu/Kconfig.

Therefore, make BPF select the new NEED_TASKS_RCU Kconfig option.
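
The factored-out logic in kernel/rcu/Kconfig then has roughly the
following shape (a sketch; see the "rcu: Create NEED_TASKS_RCU to
factor out enablement logic" patch later in this series for the
authoritative version):

	config NEED_TASKS_RCU
		bool
		default n

	config TASKS_RCU
		bool
		default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)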

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: <bpf@vger.kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/bpf/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index bc25f5098a25..4100df44c665 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -28,7 +28,7 @@ config BPF_SYSCALL
 	bool "Enable bpf() system call"
 	select BPF
 	select IRQ_WORK
-	select TASKS_RCU if PREEMPTION
+	select NEED_TASKS_RCU
 	select TASKS_TRACE_RCU
 	select BINARY_PRINTF
 	select NET_SOCK_MSG if NET
-- 
2.39.2


* [PATCH 12/48] arch: Select new NEED_TASKS_RCU Kconfig option
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Andrew Morton, Thomas Gleixner, Heiko Carstens, Arnd Bergmann,
	Douglas Anderson, Ankur Arora, Mark Rutland, Steven Rostedt

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, if a Kconfig option depends on TASKS_RCU, it conditionally does
"select TASKS_RCU if PREEMPTION".  This works, but requires any change in
this enablement logic to be replicated across all such "select" clauses.
A new NEED_TASKS_RCU Kconfig option has been created to allow this
enablement logic to be in one place in kernel/rcu/Kconfig.

Therefore, select the new NEED_TASKS_RCU Kconfig option instead of the
old TASKS_RCU option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 arch/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 9f066785bb71..ae4a4f37bbf0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -55,7 +55,7 @@ config KPROBES
 	depends on MODULES
 	depends on HAVE_KPROBES
 	select KALLSYMS
-	select TASKS_RCU if PREEMPTION
+	select NEED_TASKS_RCU
 	help
 	  Kprobes allows you to trap at almost any kernel address and
 	  execute a callback function.  register_kprobe() establishes
@@ -104,7 +104,7 @@ config STATIC_CALL_SELFTEST
 config OPTPROBES
 	def_bool y
 	depends on KPROBES && HAVE_OPTPROBES
-	select TASKS_RCU if PREEMPTION
+	select NEED_TASKS_RCU
 
 config KPROBES_ON_FTRACE
 	def_bool y
-- 
2.39.2


* [PATCH 13/48] rcu: Add data structures for synchronize_rcu()
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

The synchronize_rcu() call is going to be reworked, so
this patch adds dedicated fields to the rcu_state structure.

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index df48160b3136..b942b9437438 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -315,6 +315,13 @@ do {									\
 	__set_current_state(TASK_RUNNING);				\
 } while (0)
 
+#define SR_NORMAL_GP_WAIT_HEAD_MAX 5
+
+struct sr_wait_node {
+	atomic_t inuse;
+	struct llist_node node;
+};
+
 /*
  * RCU global state, including node hierarchy.  This hierarchy is
  * represented in "heap" form in a dense array.  The root (first level)
@@ -400,6 +407,13 @@ struct rcu_state {
 						/* Synchronize offline with */
 						/*  GP pre-initialization. */
 	int nocb_is_setup;			/* nocb is setup from boot */
+
+	/* synchronize_rcu() part. */
+	struct llist_head srs_next;	/* request a GP users. */
+	struct llist_node *srs_wait_tail; /* wait for GP users. */
+	struct llist_node *srs_done_tail; /* ready for GP users. */
+	struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX];
+	struct work_struct srs_cleanup_work;
 };
 
 /* Values for rcu_state structure's gp_flags field. */
-- 
2.39.2


* [PATCH 14/48] tracing: Select new NEED_TASKS_RCU Kconfig option
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	linux-trace-kernel, Ankur Arora, Thomas Gleixner, Mark Rutland

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, if a Kconfig option depends on TASKS_RCU, it conditionally does
"select TASKS_RCU if PREEMPTION".  This works, but requires any change in
this enablement logic to be replicated across all such "select" clauses.
A new NEED_TASKS_RCU Kconfig option has been created to allow this
enablement logic to be in one place in kernel/rcu/Kconfig.

Therefore, select the new NEED_TASKS_RCU Kconfig option instead of the
old TASKS_RCU option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <linux-trace-kernel@vger.kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/trace/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 61c541c36596..6cdc5ff919b0 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -163,7 +163,7 @@ config TRACING
 	select BINARY_PRINTF
 	select EVENT_TRACING
 	select TRACE_CLOCK
-	select TASKS_RCU if PREEMPTION
+	select NEED_TASKS_RCU
 
 config GENERIC_TRACER
 	bool
@@ -204,7 +204,7 @@ config FUNCTION_TRACER
 	select GENERIC_TRACER
 	select CONTEXT_SWITCH_TRACER
 	select GLOB
-	select TASKS_RCU if PREEMPTION
+	select NEED_TASKS_RCU
 	select TASKS_RUDE_RCU
 	help
 	  Enable the kernel to trace every kernel function. This is done
-- 
2.39.2


* [PATCH 15/48] bpf: Choose RCU Tasks based on TASKS_RCU rather than PREEMPTION
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Ankur Arora,
	Thomas Gleixner, bpf

From: "Paul E. McKenney" <paulmck@kernel.org>

The advent of CONFIG_PREEMPT_AUTO, AKA lazy preemption, will mean that
even kernels built with CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY
might see the occasional preemption, and that this preemption just might
happen within a trampoline.

Therefore, update bpf_tramp_image_put() to choose call_rcu_tasks()
based on CONFIG_TASKS_RCU instead of CONFIG_PREEMPTION.

This change might enable further simplifications, but the goal of this
effort is to make the code safe, not necessarily optimal.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/bpf/trampoline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index db7599c59c78..88673a4267eb 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -333,7 +333,7 @@ static void bpf_tramp_image_put(struct bpf_tramp_image *im)
 		int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_JUMP,
 					     NULL, im->ip_epilogue);
 		WARN_ON(err);
-		if (IS_ENABLED(CONFIG_PREEMPTION))
+		if (IS_ENABLED(CONFIG_TASKS_RCU))
 			call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu_tasks);
 		else
 			percpu_ref_kill(&im->pcref);
-- 
2.39.2


* [PATCH 16/48] ftrace: Choose RCU Tasks based on TASKS_RCU rather than PREEMPTION
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Steven Rostedt, Masami Hiramatsu, Mark Rutland,
	Mathieu Desnoyers, Ankur Arora, Thomas Gleixner,
	linux-trace-kernel

From: "Paul E. McKenney" <paulmck@kernel.org>

The advent of CONFIG_PREEMPT_AUTO, AKA lazy preemption, will mean that
even kernels built with CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY
might see the occasional preemption, and that this preemption just might
happen within a trampoline.

Therefore, update ftrace_shutdown() to invoke synchronize_rcu_tasks()
based on CONFIG_TASKS_RCU instead of CONFIG_PREEMPTION.
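
(In kernels built without CONFIG_TASKS_RCU, synchronize_rcu_tasks()
maps to a plain RCU grace-period wait, so the now-unconditional call
remains well-defined and cheap.)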

[ paulmck: Apply Steven Rostedt feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <linux-trace-kernel@vger.kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/trace/ftrace.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index da1710499698..6c96b30f3d63 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3157,8 +3157,7 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command)
 		 * synchronize_rcu_tasks() will wait for those tasks to
 		 * execute and either schedule voluntarily or enter user space.
 		 */
-		if (IS_ENABLED(CONFIG_PREEMPTION))
-			synchronize_rcu_tasks();
+		synchronize_rcu_tasks();
 
 		ftrace_trampoline_free(ops);
 	}
-- 
2.39.2


* [PATCH 17/48] rcu: Update lockdep while in RCU read-side critical section
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:34 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Ankur Arora, Thomas Gleixner

With Ankur's lazy-/auto-preemption patches applied and with a
lazy-preemptible kernel in combination with a non-preemptible RCU,
lockdep sometimes complains about context switches within RCU read-side
critical sections.  This is a false positive due to rcu_read_unlock()
updating lockdep state too late:

	__release(RCU);
	__rcu_read_unlock();
	// Context switch here results in lockdep false positive!!!
	rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */

Although this complaint could also happen with preemptible RCU
in a preemptible kernel, the odds of that happening are quite low.
In contrast, with non-preemptible RCU, a long critical section has a
high probability of performing a context switch from the preempt_enable()
in __rcu_read_unlock().

The fix is straightforward, just move the rcu_lock_release()
within rcu_read_unlock() to obtain the reverse order from that of
rcu_read_lock():

	rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
	__release(RCU);
	__rcu_read_unlock();

This commit makes this change.

Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Co-developed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/rcupdate.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 17d7ed5f3ae6..2c54750e36a0 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -809,9 +809,9 @@ static inline void rcu_read_unlock(void)
 {
 	RCU_LOCKDEP_WARN(!rcu_is_watching(),
 			 "rcu_read_unlock() used illegally while idle");
+	rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
 	__release(RCU);
 	__rcu_read_unlock();
-	rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
 }
 
 /**
-- 
2.39.2


* [PATCH 18/48] rcu: Make TINY_RCU depend on !PREEMPT_RCU rather than !PREEMPTION
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Ankur Arora, Thomas Gleixner

From: "Paul E. McKenney" <paulmck@kernel.org>

Right now, TINY_RCU depends on (!PREEMPTION && !SMP), which has served the
kernel well for many years due to the fact that PREEMPT_RCU is normally
a synonym for PREEMPTION.  But with the advent of lazy preemption,
it will be possible to have non-preemptible RCU in a preemptible kernel,
so that kernels could be built with PREEMPT_RCU=n and PREEMPTION=y.

This commit therefore makes TINY_RCU depend on (!PREEMPT_RCU && !SMP),
thus allowing for a non-preemptible RCU in preemptible kernels.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index e7d2dd267593..7dca0138260c 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -31,7 +31,7 @@ config PREEMPT_RCU
 
 config TINY_RCU
 	bool
-	default y if !PREEMPTION && !SMP
+	default y if !PREEMPT_RCU && !SMP
 	help
 	  This option selects the RCU implementation that is
 	  designed for UP systems from which real-time response
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 19/48] srcu: Make Tiny SRCU explicitly disable preemption
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (17 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 18/48] rcu: Make TINY_RCU depend on !PREEMPT_RCU rather than !PREEMPTION Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 20/48] rcu: Create NEED_TASKS_RCU to factor out enablement logic Uladzislau Rezki (Sony)
                   ` (28 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Ankur Arora, Thomas Gleixner

From: "Paul E. McKenney" <paulmck@kernel.org>

Because Tiny SRCU is used only in kernels built with either
CONFIG_PREEMPT_NONE=y or CONFIG_PREEMPT_VOLUNTARY=y, there has not
been any need for Tiny SRCU to explicitly disable preemption.  However,
the prospect of lazy preemption changes that, and the lazy-preemption
patches do result in rcutorture runs finding both too-short grace periods
and grace-period hangs for Tiny SRCU.

This commit therefore adds the needed preempt_disable() and
preempt_enable() calls to Tiny SRCU.
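
To illustrate the failure mode, a minimal sketch (not kernel code; the
variable stands in for ssp->srcu_lock_nesting[idx]):

	int nesting;		/* Stand-in for srcu_lock_nesting[idx]. */

	void reader(void)
	{
		int v = nesting;	/* Load the counter.               */
					/* <-- lazy preemption can occur   */
					/*     here, running another       */
					/*     reader on this CPU.         */
		nesting = v + 1;	/* Store: the other reader's       */
					/* increment is overwritten, so    */
					/* one increment is lost.          */
	}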

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/srcutiny.h |  2 ++
 kernel/rcu/srcutiny.c    | 31 ++++++++++++++++++++++++++-----
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 447133171d95..4d96bbdb45f0 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -64,8 +64,10 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
 {
 	int idx;
 
+	preempt_disable();  // Needed for PREEMPT_AUTO
 	idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
 	WRITE_ONCE(ssp->srcu_lock_nesting[idx], READ_ONCE(ssp->srcu_lock_nesting[idx]) + 1);
+	preempt_enable();
 	return idx;
 }
 
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index c38e5933a5d6..5afd5cf494db 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -96,9 +96,12 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
  */
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
 {
-	int newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
+	int newval;
 
+	preempt_disable();  // Needed for PREEMPT_AUTO
+	newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
 	WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
+	preempt_enable();
 	if (!newval && READ_ONCE(ssp->srcu_gp_waiting) && in_task())
 		swake_up_one(&ssp->srcu_wq);
 }
@@ -117,8 +120,11 @@ void srcu_drive_gp(struct work_struct *wp)
 	struct srcu_struct *ssp;
 
 	ssp = container_of(wp, struct srcu_struct, srcu_work);
-	if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
+	preempt_disable();  // Needed for PREEMPT_AUTO
+	if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max))) {
+		preempt_enable();
 		return; /* Already running or nothing to do. */
+	}
 
 	/* Remove recently arrived callbacks and wait for readers. */
 	WRITE_ONCE(ssp->srcu_gp_running, true);
@@ -130,9 +136,12 @@ void srcu_drive_gp(struct work_struct *wp)
 	idx = (ssp->srcu_idx & 0x2) / 2;
 	WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
 	WRITE_ONCE(ssp->srcu_gp_waiting, true);  /* srcu_read_unlock() wakes! */
+	preempt_enable();
 	swait_event_exclusive(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx]));
+	preempt_disable();  // Needed for PREEMPT_AUTO
 	WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
 	WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
+	preempt_enable();
 
 	/* Invoke the callbacks we removed above. */
 	while (lh) {
@@ -150,8 +159,11 @@ void srcu_drive_gp(struct work_struct *wp)
 	 * at interrupt level, but the ->srcu_gp_running checks will
 	 * straighten that out.
 	 */
+	preempt_disable();  // Needed for PREEMPT_AUTO
 	WRITE_ONCE(ssp->srcu_gp_running, false);
-	if (ULONG_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
+	idx = ULONG_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max));
+	preempt_enable();
+	if (idx)
 		schedule_work(&ssp->srcu_work);
 }
 EXPORT_SYMBOL_GPL(srcu_drive_gp);
@@ -160,9 +172,12 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
 {
 	unsigned long cookie;
 
+	preempt_disable();  // Needed for PREEMPT_AUTO
 	cookie = get_state_synchronize_srcu(ssp);
-	if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie))
+	if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie)) {
+		preempt_enable();
 		return;
+	}
 	WRITE_ONCE(ssp->srcu_idx_max, cookie);
 	if (!READ_ONCE(ssp->srcu_gp_running)) {
 		if (likely(srcu_init_done))
@@ -170,6 +185,7 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
 		else if (list_empty(&ssp->srcu_work.entry))
 			list_add(&ssp->srcu_work.entry, &srcu_boot_list);
 	}
+	preempt_enable();
 }
 
 /*
@@ -183,11 +199,13 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
 
 	rhp->func = func;
 	rhp->next = NULL;
+	preempt_disable();  // Needed for PREEMPT_AUTO
 	local_irq_save(flags);
 	*ssp->srcu_cb_tail = rhp;
 	ssp->srcu_cb_tail = &rhp->next;
 	local_irq_restore(flags);
 	srcu_gp_start_if_needed(ssp);
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(call_srcu);
 
@@ -241,9 +259,12 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
  */
 unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
 {
-	unsigned long ret = get_state_synchronize_srcu(ssp);
+	unsigned long ret;
 
+	preempt_disable();  // Needed for PREEMPT_AUTO
+	ret = get_state_synchronize_srcu(ssp);
 	srcu_gp_start_if_needed(ssp);
+	preempt_enable();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 20/48] rcu: Create NEED_TASKS_RCU to factor out enablement logic
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (18 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 19/48] srcu: Make Tiny SRCU explicitly disable preemption Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 21/48] rcu: Remove redundant BH disabling in TINY_RCU Uladzislau Rezki (Sony)
                   ` (27 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Ankur Arora, Thomas Gleixner, Steven Rostedt, Mark Rutland

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, if a Kconfig option depends on TASKS_RCU, it conditionally does
"select TASKS_RCU if PREEMPTION".  This works, but requires any change in
this enablement logic to be replicated across all such "select" clauses.
This commit therefore creates a new NEED_TASKS_RCU Kconfig option so
that the default value of TASKS_RCU can depend on a combination of this
new option and any needed enablement logic, keeping that logic in
one place.

While in the area, also anticipate a likely future change by adding
PREEMPT_AUTO to that logic.
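
For example, a client Kconfig entry would then look as follows
(EXAMPLE_FEATURE is a hypothetical option, shown only for illustration):

	config EXAMPLE_FEATURE
		bool "Feature that needs Tasks RCU"
		# Replaces "select TASKS_RCU if PREEMPTION":
		select NEED_TASKS_RCU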

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/Kconfig | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 7dca0138260c..3e079de0f5b4 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -85,9 +85,13 @@ config FORCE_TASKS_RCU
 	  idle, and user-mode execution as quiescent states.  Not for
 	  manual selection in most cases.
 
-config TASKS_RCU
+config NEED_TASKS_RCU
 	bool
 	default n
+
+config TASKS_RCU
+	bool
+	default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
 	select IRQ_WORK
 
 config FORCE_TASKS_RUDE_RCU
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 21/48] rcu: Remove redundant BH disabling in TINY_RCU
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (19 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 20/48] rcu: Create NEED_TASKS_RCU to factor out enablement logic Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 22/48] rcu: Make Tiny RCU explicitly disable preemption Uladzislau Rezki (Sony)
                   ` (26 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

The TINY_RCU rcu_process_callbacks() function is only ever invoked from
a softirq handler, which means that BH is already disabled.  This commit
therefore removes the redundant local_bh_disable() and local_bh_enable()
from this function.
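
For reference, the invariant being relied upon could be asserted as
follows (an illustrative sketch, not part of the patch):

	/* Softirq handlers always run with BH disabled, so this
	 * holds throughout rcu_process_callbacks():
	 */
	WARN_ON_ONCE(!in_serving_softirq());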

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tiny.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 705c0d16850a..4470af926a34 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -130,9 +130,7 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
 		next = list->next;
 		prefetch(next);
 		debug_rcu_head_unqueue(list);
-		local_bh_disable();
 		rcu_reclaim_tiny(list);
-		local_bh_enable();
 		list = next;
 	}
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 22/48] rcu: Make Tiny RCU explicitly disable preemption
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (20 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 21/48] rcu: Remove redundant BH disabling in TINY_RCU Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 23/48] rcu: Remove redundant READ_ONCE() of rcu_state.gp_flags in tree.c Uladzislau Rezki (Sony)
                   ` (25 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Ankur Arora, Thomas Gleixner

From: "Paul E. McKenney" <paulmck@kernel.org>

Because Tiny RCU is used only in kernels built with either
CONFIG_PREEMPT_NONE=y or CONFIG_PREEMPT_VOLUNTARY=y, there has not been
any need for Tiny RCU to explicitly disable preemption.  However, the
prospect of lazy preemption changes that, and preemption means that
the non-atomic increment in synchronize_rcu() can be preempted, with
the possibility that one of the increments is lost.  This could cause
failures for users of the APIs that poll RCU grace periods.

This commit therefore adds the needed preempt_disable() and
preempt_enable() calls to Tiny RCU.
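
For reference, a minimal sketch of the polled grace-period usage that a
lost gp_seq increment would break (illustrative only):

	unsigned long cookie;

	cookie = get_state_synchronize_rcu();	/* Snapshot the counter. */
	synchronize_rcu();			/* Counter must advance. */
	/* A lost increment could leave this warning firing forever: */
	WARN_ON_ONCE(!poll_state_synchronize_rcu(cookie));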

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tiny.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 4470af926a34..4402d6f5f857 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -153,7 +153,9 @@ void synchronize_rcu(void)
 			 lock_is_held(&rcu_lock_map) ||
 			 lock_is_held(&rcu_sched_lock_map),
 			 "Illegal synchronize_rcu() in RCU read-side critical section");
+	preempt_disable();
 	WRITE_ONCE(rcu_ctrlblk.gp_seq, rcu_ctrlblk.gp_seq + 2);
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 23/48] rcu: Remove redundant READ_ONCE() of rcu_state.gp_flags in tree.c
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (21 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 22/48] rcu: Make Tiny RCU explicitly disable preemption Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 24/48] rcu: Bring diagnostic read of rcu_state.gp_flags into alignment Uladzislau Rezki (Sony)
                   ` (24 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Linus Torvalds, Julia Lawall

From: "Paul E. McKenney" <paulmck@kernel.org>

Although it is functionally OK to do READ_ONCE() of a variable that
cannot change, it is confusing and at best an accident waiting to happen.
This commit therefore removes a number of READ_ONCE(rcu_state.gp_flags)
instances from kernel/rcu/tree.c that are not needed due to updates
to this field being excluded by virtue of holding the root rcu_node
structure's ->lock.
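
The underlying rule, sketched for illustration (not a specific function
from this patch):

	raw_spin_lock_irq_rcu_node(rnp);
	gpf = rcu_state.gp_flags;	/* Plain read: updates excluded    */
					/* by the root rcu_node ->lock.    */
	raw_spin_unlock_irq_rcu_node(rnp);

	gpf = READ_ONCE(rcu_state.gp_flags);	/* Lockless reads elsewhere */
						/* must remain marked.      */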

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/lkml/4857c5ef-bd8f-4670-87ac-0600a1699d05@paulmck-laptop/T/#mccb23c2a4902da4d3c750165329f8de056903c58
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/lkml/4857c5ef-bd8f-4670-87ac-0600a1699d05@paulmck-laptop/T/#md1b5c026584f9c3c7b0fbc9240dd7de584597b73
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2795a1457acf..559f2d0d271f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1463,7 +1463,7 @@ static noinline_for_stack bool rcu_gp_init(void)
 
 	WRITE_ONCE(rcu_state.gp_activity, jiffies);
 	raw_spin_lock_irq_rcu_node(rnp);
-	if (!READ_ONCE(rcu_state.gp_flags)) {
+	if (!rcu_state.gp_flags) {
 		/* Spurious wakeup, tell caller to go back to sleep.  */
 		raw_spin_unlock_irq_rcu_node(rnp);
 		return false;
@@ -1648,8 +1648,7 @@ static void rcu_gp_fqs(bool first_time)
 	/* Clear flag to prevent immediate re-entry. */
 	if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) {
 		raw_spin_lock_irq_rcu_node(rnp);
-		WRITE_ONCE(rcu_state.gp_flags,
-			   READ_ONCE(rcu_state.gp_flags) & ~RCU_GP_FLAG_FQS);
+		WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags & ~RCU_GP_FLAG_FQS);
 		raw_spin_unlock_irq_rcu_node(rnp);
 	}
 }
@@ -1910,8 +1909,7 @@ static void rcu_report_qs_rsp(unsigned long flags)
 {
 	raw_lockdep_assert_held_rcu_node(rcu_get_root());
 	WARN_ON_ONCE(!rcu_gp_in_progress());
-	WRITE_ONCE(rcu_state.gp_flags,
-		   READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS);
+	WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags | RCU_GP_FLAG_FQS);
 	raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(), flags);
 	rcu_gp_kthread_wake();
 }
@@ -2426,8 +2424,7 @@ void rcu_force_quiescent_state(void)
 		raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags);
 		return;  /* Someone beat us to it. */
 	}
-	WRITE_ONCE(rcu_state.gp_flags,
-		   READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS);
+	WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags | RCU_GP_FLAG_FQS);
 	raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags);
 	rcu_gp_kthread_wake();
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 24/48] rcu: Bring diagnostic read of rcu_state.gp_flags into alignment
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (22 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 23/48] rcu: Remove redundant READ_ONCE() of rcu_state.gp_flags in tree.c Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field Uladzislau Rezki (Sony)
                   ` (23 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

This commit adds READ_ONCE() to a lockless diagnostic read from
rcu_state.gp_flags to avoid giving the compiler any chance whatsoever
of confusing the diagnostic state printed.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree_stall.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 5d666428546b..62b2c4858028 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -579,7 +579,7 @@ static void rcu_check_gp_kthread_expired_fqs_timer(void)
 		pr_err("%s kthread timer wakeup didn't happen for %ld jiffies! g%ld f%#x %s(%d) ->state=%#x\n",
 		       rcu_state.name, (jiffies - jiffies_fqs),
 		       (long)rcu_seq_current(&rcu_state.gp_seq),
-		       data_race(rcu_state.gp_flags),
+		       data_race(READ_ONCE(rcu_state.gp_flags)), // Diagnostic read
 		       gp_state_getname(RCU_GP_WAIT_FQS), RCU_GP_WAIT_FQS,
 		       data_race(READ_ONCE(gpk->__state)));
 		pr_err("\tPossible timer handling issue on cpu=%d timer-softirq=%u\n",
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (23 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 24/48] rcu: Bring diagnostic read of rcu_state.gp_flags into alignment Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07 14:54   ` Oleg Nesterov
  2024-05-12 11:19   ` [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless Oleg Nesterov
  2024-05-07  9:35 ` [PATCH 26/48] rcu: Mark loads from rcu_state.n_online_cpus Uladzislau Rezki (Sony)
                   ` (22 subsequent siblings)
  47 siblings, 2 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Oleg Nesterov, Peter Zijlstra

From: "Paul E. McKenney" <paulmck@kernel.org>

The rcu_sync structure's ->gp_count field is updated under the protection
of ->rss_lock, but read locklessly, and KCSAN noted the data race.
This commit therefore uses WRITE_ONCE() to do this update to clearly
document its racy nature.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/sync.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 86df878a2fee..6c2bd9001adc 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 		 * we are called at early boot time but this shouldn't happen.
 		 */
 	}
-	rsp->gp_count++;
+	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
 	spin_unlock_irq(&rsp->rss_lock);
 
 	if (gp_state == GP_IDLE) {
@@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
  */
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
+	int gpc;
+
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
 
 	spin_lock_irq(&rsp->rss_lock);
-	if (!--rsp->gp_count) {
+	gpc = rsp->gp_count - 1;
+	WRITE_ONCE(rsp->gp_count, gpc);
+	if (!gpc) {
 		if (rsp->gp_state == GP_PASSED) {
 			WRITE_ONCE(rsp->gp_state, GP_EXIT);
 			rcu_sync_call(rsp);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 26/48] rcu: Mark loads from rcu_state.n_online_cpus
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (24 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 27/48] rcu: Make hotplug operations track GP state, not flags Uladzislau Rezki (Sony)
                   ` (21 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

The rcu_state.n_online_cpus value is only ever updated by CPU-hotplug
operations, which are serialized.  However, this value is read locklessly.
This commit therefore marks those reads.  While in the area, it also
adds ASSERT_EXCLUSIVE_WRITER() calls just in case parallel CPU hotplug
becomes a thing.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c       | 4 +++-
 kernel/rcu/tree_stall.h | 6 ++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 559f2d0d271f..7149b2d5cdd6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4328,7 +4328,7 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
 // whether spinlocks may be acquired safely.
 static bool rcu_init_invoked(void)
 {
-	return !!rcu_state.n_online_cpus;
+	return !!READ_ONCE(rcu_state.n_online_cpus);
 }
 
 /*
@@ -4538,6 +4538,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	rcu_spawn_rnp_kthreads(rnp);
 	rcu_spawn_cpu_nocb_kthread(cpu);
+	ASSERT_EXCLUSIVE_WRITER(rcu_state.n_online_cpus);
 	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus + 1);
 
 	return 0;
@@ -4806,6 +4807,7 @@ void rcutree_migrate_callbacks(int cpu)
  */
 int rcutree_dead_cpu(unsigned int cpu)
 {
+	ASSERT_EXCLUSIVE_WRITER(rcu_state.n_online_cpus);
 	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
 	// Stop-machine done, so allow nohz_full to disable tick.
 	tick_dep_clear(TICK_DEP_BIT_RCU);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 62b2c4858028..8a2edf6a1ef5 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -628,7 +628,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 		totqlen += rcu_get_n_cbs_cpu(cpu);
 	pr_err("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu ncpus=%d)\n",
 	       smp_processor_id(), (long)(jiffies - gps),
-	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
+	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen,
+	       data_race(rcu_state.n_online_cpus)); // Diagnostic read
 	if (ndetected) {
 		rcu_dump_cpu_stacks();
 
@@ -689,7 +690,8 @@ static void print_cpu_stall(unsigned long gps)
 		totqlen += rcu_get_n_cbs_cpu(cpu);
 	pr_err("\t(t=%lu jiffies g=%ld q=%lu ncpus=%d)\n",
 		jiffies - gps,
-		(long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
+		(long)rcu_seq_current(&rcu_state.gp_seq), totqlen,
+		data_race(rcu_state.n_online_cpus)); // Diagnostic read
 
 	rcu_check_gp_kthread_expired_fqs_timer();
 	rcu_check_gp_kthread_starvation();
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 27/48] rcu: Make hotplug operations track GP state, not flags
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (25 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 26/48] rcu: Mark loads from rcu_state.n_online_cpus Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 28/48] rcu: Inform KCSAN of one-byte cmpxchg() in rcu_trc_cmpxchg_need_qs() Uladzislau Rezki (Sony)
                   ` (20 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, there are rcu_data structure fields named ->rcu_onl_gp_flags
and ->rcu_ofl_gp_flags that track the rcu_state.gp_flags field at the
time of the corresponding CPU's last online or offline operation,
respectively.  However, this information is not particularly useful.
It would be better to instead track the grace period state kept
in rcu_state.gp_state.  This would also be consistent with the
initialization in rcu_boot_init_percpu_data(), which is to RCU_GP_CLEANED
(an rcu_state.gp_state value), and also with the diagnostics in
rcu_implicit_dynticks_qs(), whose format is consistent with an integer,
not a bitmask.

This commit therefore makes this change and renames the fields to
->rcu_onl_gp_state and ->rcu_ofl_gp_state, respectively.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c        | 12 ++++++------
 kernel/rcu/tree.h        |  4 ++--
 kernel/rcu/tree_plugin.h |  4 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7149b2d5cdd6..306f55b81d10 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -841,8 +841,8 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 				__func__, rnp1->grplo, rnp1->grphi, rnp1->qsmask, rnp1->qsmaskinit, rnp1->qsmaskinitnext, rnp1->rcu_gp_init_mask);
 		pr_info("%s %d: %c online: %ld(%d) offline: %ld(%d)\n",
 			__func__, rdp->cpu, ".o"[rcu_rdp_cpu_online(rdp)],
-			(long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_flags,
-			(long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_flags);
+			(long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_state,
+			(long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_state);
 		return 1; /* Break things loose after complaining. */
 	}
 
@@ -4420,9 +4420,9 @@ rcu_boot_init_percpu_data(int cpu)
 	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu)));
 	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
 	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
-	rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED;
+	rdp->rcu_ofl_gp_state = RCU_GP_CLEANED;
 	rdp->rcu_onl_gp_seq = rcu_state.gp_seq;
-	rdp->rcu_onl_gp_flags = RCU_GP_CLEANED;
+	rdp->rcu_onl_gp_state = RCU_GP_CLEANED;
 	rdp->last_sched_clock = jiffies;
 	rdp->cpu = cpu;
 	rcu_boot_init_nocb_percpu_data(rdp);
@@ -4682,7 +4682,7 @@ void rcutree_report_cpu_starting(unsigned int cpu)
 	ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
 	rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
 	rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
-	rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
+	rdp->rcu_onl_gp_state = READ_ONCE(rcu_state.gp_state);
 
 	/* An incoming CPU should never be blocking a grace period. */
 	if (WARN_ON_ONCE(rnp->qsmask & mask)) { /* RCU waiting on incoming CPU? */
@@ -4733,7 +4733,7 @@ void rcutree_report_cpu_dead(void)
 	arch_spin_lock(&rcu_state.ofl_lock);
 	raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */
 	rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq);
-	rdp->rcu_ofl_gp_flags = READ_ONCE(rcu_state.gp_flags);
+	rdp->rcu_ofl_gp_state = READ_ONCE(rcu_state.gp_state);
 	if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */
 		/* Report quiescent state -before- changing ->qsmaskinitnext! */
 		rcu_disable_urgency_upon_qs(rdp);
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index df48160b3136..ff4d8b60554b 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -273,9 +273,9 @@ struct rcu_data {
 	bool rcu_iw_pending;		/* Is ->rcu_iw pending? */
 	unsigned long rcu_iw_gp_seq;	/* ->gp_seq associated with ->rcu_iw. */
 	unsigned long rcu_ofl_gp_seq;	/* ->gp_seq at last offline. */
-	short rcu_ofl_gp_flags;		/* ->gp_flags at last offline. */
+	short rcu_ofl_gp_state;		/* ->gp_state at last offline. */
 	unsigned long rcu_onl_gp_seq;	/* ->gp_seq at last online. */
-	short rcu_onl_gp_flags;		/* ->gp_flags at last online. */
+	short rcu_onl_gp_state;		/* ->gp_state at last online. */
 	unsigned long last_fqs_resched;	/* Time of last rcu_resched(). */
 	unsigned long last_sched_clock;	/* Jiffies of last rcu_sched_clock_irq(). */
 	struct rcu_snap_record snap_record; /* Snapshot of core stats at half of */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 36a8b5dbf5b5..340bbefe5f65 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -805,8 +805,8 @@ dump_blkd_tasks(struct rcu_node *rnp, int ncheck)
 		rdp = per_cpu_ptr(&rcu_data, cpu);
 		pr_info("\t%d: %c online: %ld(%d) offline: %ld(%d)\n",
 			cpu, ".o"[rcu_rdp_cpu_online(rdp)],
-			(long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_flags,
-			(long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_flags);
+			(long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_state,
+			(long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_state);
 	}
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 28/48] rcu: Inform KCSAN of one-byte cmpxchg() in rcu_trc_cmpxchg_need_qs()
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (26 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 27/48] rcu: Make hotplug operations track GP state, not flags Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 29/48] rcu: Remove redundant CONFIG_PROVE_RCU #if condition Uladzislau Rezki (Sony)
                   ` (19 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Marco Elver

From: "Paul E. McKenney" <paulmck@kernel.org>

Tasks Trace RCU needs a single-byte cmpxchg(), but no such thing exists.
Therefore, rcu_trc_cmpxchg_need_qs() emulates one using field substitution
and a four-byte cmpxchg(), such that the other three bytes are always
atomically updated to their old values.  This works, but results in
false-positive KCSAN failures because as far as KCSAN knows, this
cmpxchg() operation is updating all four bytes.

This commit therefore encloses the cmpxchg() in a data_race() and adds
a single-byte instrument_atomic_read_write(), thus telling KCSAN exactly
what is going on so as to avoid the false positives.
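
For context, a minimal sketch of the field-substitution idiom described
above (the union, "p", and "newval" are simplified stand-ins, not the
kernel's actual trc_reader_special layout):

	union demo {
		u32 s;			/* Four-byte view for cmpxchg().  */
		struct {
			u8 need_qs;	/* The byte actually updated.     */
			u8 other[3];	/* Unchanged by this operation.   */
		} b;
	} old, new;

	old.s = READ_ONCE(p->s);	/* Snapshot all four bytes.       */
	new.s = old.s;			/* Keep the other three bytes.    */
	new.b.need_qs = newval;		/* Substitute the one byte.       */
	cmpxchg(&p->s, old.s, new.s);	/* Fails if any byte changed.     */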

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tasks.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 147b5945d67a..327fbfc999c8 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1457,6 +1457,7 @@ static void rcu_st_need_qs(struct task_struct *t, u8 v)
 /*
  * Do a cmpxchg() on ->trc_reader_special.b.need_qs, allowing for
  * the four-byte operand-size restriction of some platforms.
+ *
  * Returns the old value, which is often ignored.
  */
 u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
@@ -1468,7 +1469,14 @@ u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
 	if (trs_old.b.need_qs != old)
 		return trs_old.b.need_qs;
 	trs_new.b.need_qs = new;
-	ret.s = cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s);
+
+	// Although cmpxchg() appears to KCSAN to update all four bytes,
+	// only the .b.need_qs byte actually changes.
+	instrument_atomic_read_write(&t->trc_reader_special.b.need_qs,
+				     sizeof(t->trc_reader_special.b.need_qs));
+	// Avoid false-positive KCSAN failures.
+	ret.s = data_race(cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s));
+
 	return ret.b.need_qs;
 }
 EXPORT_SYMBOL_GPL(rcu_trc_cmpxchg_need_qs);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 29/48] rcu: Remove redundant CONFIG_PROVE_RCU #if condition
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (27 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 28/48] rcu: Inform KCSAN of one-byte cmpxchg() in rcu_trc_cmpxchg_need_qs() Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 30/48] rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE() Uladzislau Rezki (Sony)
                   ` (18 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

The #if condition controlling the rcu_preempt_sleep_check() definition
has a redundant check for CONFIG_PREEMPT_RCU, which is already checked
for by an enclosing #ifndef.  This commit therefore removes this redundant
condition from the inner #if.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/rcupdate.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 2c54750e36a0..382780bb60f4 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -401,15 +401,15 @@ static inline int debug_lockdep_rcu_enabled(void)
 		}							\
 	} while (0)
 
-#if defined(CONFIG_PROVE_RCU) && !defined(CONFIG_PREEMPT_RCU)
+#ifndef CONFIG_PREEMPT_RCU
 static inline void rcu_preempt_sleep_check(void)
 {
 	RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map),
 			 "Illegal context switch in RCU read-side critical section");
 }
-#else /* #ifdef CONFIG_PROVE_RCU */
+#else // #ifndef CONFIG_PREEMPT_RCU
 static inline void rcu_preempt_sleep_check(void) { }
-#endif /* #else #ifdef CONFIG_PROVE_RCU */
+#endif // #else // #ifndef CONFIG_PREEMPT_RCU
 
 #define rcu_sleep_check()						\
 	do {								\
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 30/48] rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE()
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (28 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 29/48] rcu: Remove redundant CONFIG_PROVE_RCU #if condition Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 31/48] rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer Uladzislau Rezki (Sony)
                   ` (17 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

Because the Tasks RCU ->rtp_exit_list is initialized at rcu_init()
time while there is only one CPU running with interrupts disabled, it
is not possible for an exiting task to encounter an uninitialized list.
This commit therefore replaces the conditional initialization with
a WARN_ON_ONCE().

Reported-by: Frederic Weisbecker <frederic@kernel.org>
Closes: https://lore.kernel.org/all/ZdiNXmO3wRvmzPsr@lothringen/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tasks.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 82e458ea0728..78d74c81cc24 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1203,8 +1203,7 @@ void exit_tasks_rcu_start(void)
 	rtpcp = this_cpu_ptr(rcu_tasks.rtpcpu);
 	t->rcu_tasks_exit_cpu = smp_processor_id();
 	raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
-	if (!rtpcp->rtp_exit_list.next)
-		INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
+	WARN_ON_ONCE(!rtpcp->rtp_exit_list.next);
 	list_add(&t->rcu_tasks_exit_list, &rtpcp->rtp_exit_list);
 	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
 	preempt_enable();
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 31/48] rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (29 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 30/48] rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE() Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 32/48] rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow Uladzislau Rezki (Sony)
                   ` (16 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

The synchronize_srcu() call in rcu_tasks_postscan() was removed by the
commit "rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks".
This commit therefore fixes the tasks_rcu_exit_srcu_stall_timer comment.

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tasks.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 78d74c81cc24..d5319bbe8c98 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -150,7 +150,7 @@ static struct rcu_tasks rt_name =							\
 
 #ifdef CONFIG_TASKS_RCU
 
-/* Report delay in synchronize_srcu() completion in rcu_tasks_postscan(). */
+/* Report delay of scan exiting tasklist in rcu_tasks_postscan(). */
 static void tasks_rcu_exit_srcu_stall(struct timer_list *unused);
 static DEFINE_TIMER(tasks_rcu_exit_srcu_stall_timer, tasks_rcu_exit_srcu_stall);
 #endif
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 32/48] rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (30 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 31/48] rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 33/48] rcu: Mollify sparse with RCU guard Uladzislau Rezki (Sony)
                   ` (15 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Nikita Kiryushin, Steven Rostedt

From: Nikita Kiryushin <kiryushin@ancud.ru>

There is a possibility of buffer overflow in
show_rcu_tasks_trace_gp_kthread() if the counters passed to sprintf()
are huge.  The counter values needed for this are unrealistically high
(each unsigned long can print as up to 20 digits, so four of them plus
the format literals can exceed the 64-byte buffer), but buffer overflow
is still possible.

Use snprintf() with buffer size instead of sprintf().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: edf3775f0ad6 ("rcu-tasks: Add count for idle tasks on offline CPUs")
Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tasks.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index d5319bbe8c98..08a92bffeb84 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1997,7 +1997,7 @@ void show_rcu_tasks_trace_gp_kthread(void)
 {
 	char buf[64];
 
-	sprintf(buf, "N%lu h:%lu/%lu/%lu",
+	snprintf(buf, sizeof(buf), "N%lu h:%lu/%lu/%lu",
 		data_race(n_trc_holdouts),
 		data_race(n_heavy_reader_ofl_updates),
 		data_race(n_heavy_reader_updates),
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 33/48] rcu: Mollify sparse with RCU guard
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (31 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 32/48] rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 34/48] rcu: Fix buffer overflow in print_cpu_stall_info() Uladzislau Rezki (Sony)
                   ` (14 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

When using "guard(rcu)();" sparse will complain, because even
though it now understands the cleanup attribute, it doesn't
evaluate the calls from it at function exit, and thus doesn't
count the context correctly.

Given that there's a conditional in the resulting code:

  static inline void class_rcu_destructor(class_rcu_t *_T)
  {
      if (_T->lock) {
          rcu_read_unlock();
      }
  }

it seems that even if sparse were taught to evaluate the cleanup
attribute function, it would still be difficult to really make it
understand the full context here.

Suppress the sparse warning by just releasing the context in
the acquisition part of the function, after all we know it's
safe with the guard, that's the whole point of it.
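
For reference, the usage pattern that this guard supports (a minimal
sketch; example_lookup() and struct foo are hypothetical):

	static int example_lookup(struct foo *f)
	{
		guard(rcu)();		  /* rcu_read_lock() here ...       */
		return READ_ONCE(f->val); /* ... unlock at function exit.   */
	}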

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/linux/rcupdate.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 382780bb60f4..dfd2399f2cde 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1090,6 +1090,18 @@ rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f)
 extern int rcu_expedited;
 extern int rcu_normal;
 
-DEFINE_LOCK_GUARD_0(rcu, rcu_read_lock(), rcu_read_unlock())
+DEFINE_LOCK_GUARD_0(rcu,
+	do {
+		rcu_read_lock();
+		/*
+		 * sparse doesn't call the cleanup function,
+		 * so just release immediately and don't track
+		 * the context. We don't need to anyway, since
+		 * the whole point of the guard is to not need
+		 * the explicit unlock.
+		 */
+		__release(RCU);
+	} while (0),
+	rcu_read_unlock())
 
 #endif /* __LINUX_RCUPDATE_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 34/48] rcu: Fix buffer overflow in print_cpu_stall_info()
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (32 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 33/48] rcu: Mollify sparse with RCU guard Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 35/48] rcu: Reduce synchronize_rcu() latency Uladzislau Rezki (Sony)
                   ` (13 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Nikita Kiryushin, Steven Rostedt

From: Nikita Kiryushin <kiryushin@ancud.ru>

The rcuc-starvation output from print_cpu_stall_info() might overflow the
buffer if the jiffies difference is huge.  The situation
might seem improbable, but computers sometimes get very confused about
time, which can result in full-sized integers, and, in this case,
buffer overflow.

Also, the unsigned jiffies difference is printed using %ld, which is
normally for signed integers.  This is intentional for debugging purposes,
but it is not obvious from the code.

This commit therefore changes sprintf() to snprintf() and adds a
clarifying comment about intention of %ld format.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 245a62982502 ("rcu: Dump rcuc kthread status for CPUs not reporting quiescent state")
Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree_stall.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 8a2edf6a1ef5..460efecd077b 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -504,7 +504,8 @@ static void print_cpu_stall_info(int cpu)
 			rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
 	rcuc_starved = rcu_is_rcuc_kthread_starving(rdp, &j);
 	if (rcuc_starved)
-		sprintf(buf, " rcuc=%ld jiffies(starved)", j);
+		// Print signed value, as negative values indicate a probable bug.
+		snprintf(buf, sizeof(buf), " rcuc=%ld jiffies(starved)", j);
 	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%04x/%ld/%#lx softirq=%u/%u fqs=%ld%s%s\n",
 	       cpu,
 	       "O."[!!cpu_online(cpu)],
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 35/48] rcu: Reduce synchronize_rcu() latency
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (33 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 34/48] rcu: Fix buffer overflow in print_cpu_stall_info() Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 36/48] rcu: Add a trace event for synchronize_rcu_normal() Uladzislau Rezki (Sony)
                   ` (12 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Neeraj Upadhyay

A call to synchronize_rcu() can be optimized from a latency
point of view.  Workloads which depend on this can benefit from it.

The delay of the wakeme_after_rcu() callback, which unblocks a waiter,
depends on several factors:

- how fast the offloading process is started, which is a combination of:
    - !CONFIG_RCU_NOCB_CPU/CONFIG_RCU_NOCB_CPU;
    - !CONFIG_RCU_LAZY/CONFIG_RCU_LAZY;
    - other factors.
- once started, whether the invoking path is interrupted due to:
    - a time limit;
    - need_resched();
    - the callback limit being reached.
- where in the nocb list the callback is located;
- how fast previous callbacks completed.

Example:

1. On our embedded devices I can easily trigger a scenario in which
it is the last in a list of ~3600 callbacks:

<snip>
  <...>-29      [001] d..1. 21950.145313: rcu_batch_start: rcu_preempt CBs=3613 bl=28
...
  <...>-29      [001] ..... 21950.152578: rcu_invoke_callback: rcu_preempt rhp=00000000b2d6dee8 func=__free_vm_area_struct.cfi_jt
  <...>-29      [001] ..... 21950.152579: rcu_invoke_callback: rcu_preempt rhp=00000000a446f607 func=__free_vm_area_struct.cfi_jt
  <...>-29      [001] ..... 21950.152580: rcu_invoke_callback: rcu_preempt rhp=00000000a5cab03b func=__free_vm_area_struct.cfi_jt
  <...>-29      [001] ..... 21950.152581: rcu_invoke_callback: rcu_preempt rhp=0000000013b7e5ee func=__free_vm_area_struct.cfi_jt
  <...>-29      [001] ..... 21950.152582: rcu_invoke_callback: rcu_preempt rhp=000000000a8ca6f9 func=__free_vm_area_struct.cfi_jt
  <...>-29      [001] ..... 21950.152583: rcu_invoke_callback: rcu_preempt rhp=000000008f162ca8 func=wakeme_after_rcu.cfi_jt
  <...>-29      [001] d..1. 21950.152625: rcu_batch_end: rcu_preempt CBs-invoked=3612 idle=....
<snip>

2. We use cpuset/cgroup to classify tasks and assign them into
different cgroups.  For example, the "background" group binds tasks
only to little CPUs, whereas the "foreground" group makes use of all
CPUs.  Tasks can be migrated between groups on request if an
acceleration is needed.

See below an example of how the "surfaceflinger" task gets migrated.
Initially it is located in the "system-background" cgroup, which
allows it to run only on little cores.  In order to speed it up, it
can be temporarily moved into the "foreground" cgroup, which allows
it to use big/all CPUs:

cgroup_attach_task():
 -> cgroup_migrate_execute()
   -> cpuset_can_attach()
     -> percpu_down_write()
       -> rcu_sync_enter()
         -> synchronize_rcu()
   -> now move tasks to the new cgroup.
 -> cgroup_migrate_finish()

<snip>
         rcuop/1-29      [000] .....  7030.528570: rcu_invoke_callback: rcu_preempt rhp=00000000461605e0 func=wakeme_after_rcu.cfi_jt
    PERFD-SERVER-1855    [000] d..1.  7030.530293: cgroup_attach_task: dst_root=3 dst_id=22 dst_level=1 dst_path=/foreground pid=1900 comm=surfaceflinger
   TimerDispatch-2768    [002] d..5.  7030.537542: sched_migrate_task: comm=surfaceflinger pid=1900 prio=98 orig_cpu=0 dest_cpu=4
<snip>

"Boosting a task" depends on synchronize_rcu() latency:

- first trace shows a completion of synchronize_rcu();
- second shows attaching a task to a new group;
- last shows a final step when migration occurs.

3. To address this drawback, maintain a separate track that consists
of synchronize_rcu() callers only.  After completion of a grace period,
these users are deferred to a dedicated worker that processes the
requests.

4. This patch reduces the latency of synchronize_rcu() by approximately
30-40% on synthetic tests.  The real test case, camera launch time,
shows the following (times are in milliseconds):

1-run 542 vs 489 improvement 9%
2-run 540 vs 466 improvement 13%
3-run 518 vs 468 improvement 9%
4-run 531 vs 457 improvement 13%
5-run 548 vs 475 improvement 13%
6-run 509 vs 484 improvement 4%

Synthetic test (no "noise" from other callbacks):
Hardware: x86_64 64 CPUs, 64GB of memory
Linux-6.6

- 10K tasks (simultaneous);
- each task does (1000 loops):
     synchronize_rcu();
     kfree(p);

default: CONFIG_RCU_NOCB_CPU: takes 54 seconds to complete all users;
patch: CONFIG_RCU_NOCB_CPU: takes 35 seconds to complete all users.

Running 60K tasks gives approximately the same results on my setup.
Please note this is without any interaction with other types of
callbacks; such interaction would heavily impact the default case.

5. By default it is disabled.  To enable it, perform one of the
following:

echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp
or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1"

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Co-developed-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 .../admin-guide/kernel-parameters.txt         |  14 +
 kernel/rcu/tree.c                             | 331 +++++++++++++++++-
 kernel/rcu/tree_exp.h                         |   2 +-
 3 files changed, 345 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bb884c14b2f6..0a3b0fd1910e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5091,6 +5091,20 @@
 			delay, memory pressure or callback list growing too
 			big.
 
+	rcutree.rcu_normal_wake_from_gp= [KNL]
+			Reduces the latency of a synchronize_rcu() call. This
+			approach maintains its own track of synchronize_rcu()
+			callers, so it does not interact with regular callbacks
+			because it does not use a call_rcu[_hurry]() path. Please
+			note, this is for a normal grace period.
+
+			How to enable it:
+
+			echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp
+			or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1"
+
+			Default is 0.
+
 	rcuscale.gp_async= [KNL]
 			Measure performance of asynchronous
 			grace-period primitives such as call_rcu().
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d9642dd06c25..f65255205e44 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -75,6 +75,7 @@
 #define MODULE_PARAM_PREFIX "rcutree."
 
 /* Data structures. */
+static void rcu_sr_normal_gp_cleanup_work(struct work_struct *);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
 	.gpwrap = true,
@@ -93,6 +94,8 @@ static struct rcu_state rcu_state = {
 	.exp_mutex = __MUTEX_INITIALIZER(rcu_state.exp_mutex),
 	.exp_wake_mutex = __MUTEX_INITIALIZER(rcu_state.exp_wake_mutex),
 	.ofl_lock = __ARCH_SPIN_LOCK_UNLOCKED,
+	.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
+		rcu_sr_normal_gp_cleanup_work),
 };
 
 /* Dump rcu_node combining tree at boot to verify correct setup. */
@@ -1422,6 +1425,282 @@ static void rcu_poll_gp_seq_end_unlocked(unsigned long *snap)
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }
 
+/*
+ * There is a single llist, which is used for handling
+ * synchronize_rcu() users' enqueued rcu_synchronize nodes.
+ * Within this llist, there are two tail pointers:
+ *
+ * wait tail: Tracks the set of nodes, which need to
+ *            wait for the current GP to complete.
+ * done tail: Tracks the set of nodes, for which grace
+ *            period has elapsed. These nodes processing
+ *            will be done as part of the cleanup work
+ *            execution by a kworker.
+ *
+ * At every grace period init, a new wait node is added
+ * to the llist. This wait node is used as wait tail
+ * for this new grace period. Given that there is a fixed
+ * number of wait nodes, if all wait nodes are in use
+ * (which can happen when kworker callback processing
+ * is delayed) and an additional grace period is requested,
+ * the system is slow in processing callbacks.
+ *
+ * TODO: If a slow processing is detected, a first node
+ * in the llist should be used as a wait-tail for this
+ * grace period, therefore users which should wait due
+ * to a slow process are handled by _this_ grace period
+ * and not next.
+ *
+ * Below is an illustration of how the done and wait
+ * tail pointers move from one set of rcu_synchronize nodes
+ * to the other, as grace periods start and finish and
+ * nodes are processed by kworker.
+ *
+ *
+ * a. Initial llist callbacks list:
+ *
+ * +----------+           +--------+          +-------+
+ * |          |           |        |          |       |
+ * |   head   |---------> |   cb2  |--------->| cb1   |
+ * |          |           |        |          |       |
+ * +----------+           +--------+          +-------+
+ *
+ *
+ *
+ * b. New GP1 Start:
+ *
+ *                    WAIT TAIL
+ *                      |
+ *                      |
+ *                      v
+ * +----------+     +--------+      +--------+        +-------+
+ * |          |     |        |      |        |        |       |
+ * |   head   ------> wait   |------>   cb2  |------> |  cb1  |
+ * |          |     | head1  |      |        |        |       |
+ * +----------+     +--------+      +--------+        +-------+
+ *
+ *
+ *
+ * c. GP completion:
+ *
+ * WAIT_TAIL == DONE_TAIL
+ *
+ *                   DONE TAIL
+ *                     |
+ *                     |
+ *                     v
+ * +----------+     +--------+      +--------+        +-------+
+ * |          |     |        |      |        |        |       |
+ * |   head   ------> wait   |------>   cb2  |------> |  cb1  |
+ * |          |     | head1  |      |        |        |       |
+ * +----------+     +--------+      +--------+        +-------+
+ *
+ *
+ *
+ * d. New callbacks and GP2 start:
+ *
+ *                    WAIT TAIL                          DONE TAIL
+ *                      |                                 |
+ *                      |                                 |
+ *                      v                                 v
+ * +----------+     +------+    +------+    +------+    +-----+    +-----+    +-----+
+ * |          |     |      |    |      |    |      |    |     |    |     |    |     |
+ * |   head   ------> wait |--->|  cb4 |--->| cb3  |--->|wait |--->| cb2 |--->| cb1 |
+ * |          |     | head2|    |      |    |      |    |head1|    |     |    |     |
+ * +----------+     +------+    +------+    +------+    +-----+    +-----+    +-----+
+ *
+ *
+ *
+ * e. GP2 completion:
+ *
+ * WAIT_TAIL == DONE_TAIL
+ *                   DONE TAIL
+ *                      |
+ *                      |
+ *                      v
+ * +----------+     +------+    +------+    +------+    +-----+    +-----+    +-----+
+ * |          |     |      |    |      |    |      |    |     |    |     |    |     |
+ * |   head   ------> wait |--->|  cb4 |--->| cb3  |--->|wait |--->| cb2 |--->| cb1 |
+ * |          |     | head2|    |      |    |      |    |head1|    |     |    |     |
+ * +----------+     +------+    +------+    +------+    +-----+    +-----+    +-----+
+ *
+ *
+ * While the llist state transitions from d to e, a kworker
+ * can start executing rcu_sr_normal_gp_cleanup_work() and
+ * can observe either the old done tail (@c) or the new
+ * done tail (@e). So, done tail updates and reads need
+ * to use rel-acq semantics. If the concurrent kworker
+ * observes the old done tail, the newly queued work
+ * execution will process the updated done tail. If the
+ * concurrent kworker observes the new done tail, then
+ * the newly queued work will skip processing the done
+ * tail, as workqueue semantics guarantee that new
+ * work is executed only after the previous one completes.
+ *
+ * f. kworker callbacks processing complete:
+ *
+ *
+ *                   DONE TAIL
+ *                     |
+ *                     |
+ *                     v
+ * +----------+     +--------+
+ * |          |     |        |
+ * |   head   ------> wait   |
+ * |          |     | head2  |
+ * +----------+     +--------+
+ *
+ */
+static bool rcu_sr_is_wait_head(struct llist_node *node)
+{
+	return &(rcu_state.srs_wait_nodes)[0].node <= node &&
+		node <= &(rcu_state.srs_wait_nodes)[SR_NORMAL_GP_WAIT_HEAD_MAX - 1].node;
+}
+
+static struct llist_node *rcu_sr_get_wait_head(void)
+{
+	struct sr_wait_node *sr_wn;
+	int i;
+
+	for (i = 0; i < SR_NORMAL_GP_WAIT_HEAD_MAX; i++) {
+		sr_wn = &(rcu_state.srs_wait_nodes)[i];
+
+		if (!atomic_cmpxchg_acquire(&sr_wn->inuse, 0, 1))
+			return &sr_wn->node;
+	}
+
+	return NULL;
+}
+
+static void rcu_sr_put_wait_head(struct llist_node *node)
+{
+	struct sr_wait_node *sr_wn = container_of(node, struct sr_wait_node, node);
+
+	atomic_set_release(&sr_wn->inuse, 0);
+}
+
+/* Disabled by default. */
+static int rcu_normal_wake_from_gp;
+module_param(rcu_normal_wake_from_gp, int, 0644);
+
+static void rcu_sr_normal_complete(struct llist_node *node)
+{
+	struct rcu_synchronize *rs = container_of(
+		(struct rcu_head *) node, struct rcu_synchronize, head);
+	unsigned long oldstate = (unsigned long) rs->head.func;
+
+	WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) &&
+		!poll_state_synchronize_rcu(oldstate),
+		"A full grace period has not elapsed yet: %lu",
+		rcu_seq_diff(get_state_synchronize_rcu(), oldstate));
+
+	/* Finally. */
+	complete(&rs->completion);
+}
+
+static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
+{
+	struct llist_node *done, *rcu, *next, *head;
+
+	/*
+	 * This work can execute concurrently with updates of
+	 * the done tail performed by the grace-period kthread
+	 * in rcu_sr_normal_gp_cleanup(). So, reads and updates
+	 * of the done tail need to follow acq-rel semantics.
+	 *
+	 * Given that wq semantics guarantee that a single work
+	 * item cannot be executed concurrently by multiple
+	 * kworkers, the done-tail list manipulations are
+	 * protected here.
+	 */
+	done = smp_load_acquire(&rcu_state.srs_done_tail);
+	if (!done)
+		return;
+
+	WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
+	head = done->next;
+	done->next = NULL;
+
+	/*
+	 * The dummy node, which is pointed to by the done
+	 * tail acq-read above, is not removed here. This
+	 * allows lockless additions of new rcu_synchronize
+	 * nodes in rcu_sr_normal_add_req() while the cleanup
+	 * work executes. The dummy node is removed in the
+	 * next round of cleanup work execution.
+	 */
+	llist_for_each_safe(rcu, next, head) {
+		if (!rcu_sr_is_wait_head(rcu)) {
+			rcu_sr_normal_complete(rcu);
+			continue;
+		}
+
+		rcu_sr_put_wait_head(rcu);
+	}
+}
+
+/*
+ * Helper function for rcu_gp_cleanup().
+ */
+static void rcu_sr_normal_gp_cleanup(void)
+{
+	struct llist_node *wait_tail;
+
+	wait_tail = rcu_state.srs_wait_tail;
+	if (wait_tail == NULL)
+		return;
+
+	rcu_state.srs_wait_tail = NULL;
+	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_wait_tail);
+
+	// concurrent sr_normal_gp_cleanup work might observe this update.
+	smp_store_release(&rcu_state.srs_done_tail, wait_tail);
+	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_done_tail);
+
+	schedule_work(&rcu_state.srs_cleanup_work);
+}
+
+/*
+ * Helper function for rcu_gp_init().
+ */
+static bool rcu_sr_normal_gp_init(void)
+{
+	struct llist_node *first;
+	struct llist_node *wait_head;
+	bool start_new_poll = false;
+
+	first = READ_ONCE(rcu_state.srs_next.first);
+	if (!first || rcu_sr_is_wait_head(first))
+		return start_new_poll;
+
+	wait_head = rcu_sr_get_wait_head();
+	if (!wait_head) {
+		// Kick another GP to retry.
+		start_new_poll = true;
+		return start_new_poll;
+	}
+
+	/* Inject a wait-dummy-node. */
+	llist_add(wait_head, &rcu_state.srs_next);
+
+	/*
+	 * The wait list of rcu_synchronize nodes should be empty at
+	 * this point, since the GP kthread, rcu_gp_init() ->
+	 * rcu_gp_cleanup(), rolls it over. If not, it is a bug, so
+	 * warn the user.
+	 */
+	WARN_ON_ONCE(rcu_state.srs_wait_tail != NULL);
+	rcu_state.srs_wait_tail = wait_head;
+	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_wait_tail);
+
+	return start_new_poll;
+}
+
+static void rcu_sr_normal_add_req(struct rcu_synchronize *rs)
+{
+	llist_add((struct llist_node *) &rs->head, &rcu_state.srs_next);
+}
+
 /*
  * Initialize a new grace period.  Return false if no grace period required.
  */
@@ -1432,6 +1711,7 @@ static noinline_for_stack bool rcu_gp_init(void)
 	unsigned long mask;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root();
+	bool start_new_poll;
 
 	WRITE_ONCE(rcu_state.gp_activity, jiffies);
 	raw_spin_lock_irq_rcu_node(rnp);
@@ -1456,10 +1736,24 @@ static noinline_for_stack bool rcu_gp_init(void)
 	/* Record GP times before starting GP, hence rcu_seq_start(). */
 	rcu_seq_start(&rcu_state.gp_seq);
 	ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
+	start_new_poll = rcu_sr_normal_gp_init();
 	trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start"));
 	rcu_poll_gp_seq_start(&rcu_state.gp_seq_polled_snap);
 	raw_spin_unlock_irq_rcu_node(rnp);
 
+	/*
+	 * "start_new_poll" is set to true only when this GP is not able
+	 * to handle anything and there are outstanding users. This happens
+	 * when rcu_sr_normal_gp_init() was not able to insert a dummy
+	 * separator into the llist, because no dummy nodes were left.
+	 *
+	 * The number of dummy nodes is fixed, so it is possible to run
+	 * out of them; if so, start a new poll request to retry. This is
+	 * rare and means that the system is slow in processing callbacks.
+	 */
+	if (start_new_poll)
+		(void) start_poll_synchronize_rcu();
+
 	/*
 	 * Apply per-leaf buffered online and offline operations to
 	 * the rcu_node tree. Note that this new grace period need not
@@ -1825,6 +2119,9 @@ static noinline void rcu_gp_cleanup(void)
 	}
 	raw_spin_unlock_irq_rcu_node(rnp);
 
+	// Make synchronize_rcu() users aware of the end of the old grace period.
+	rcu_sr_normal_gp_cleanup();
+
 	// If strict, make all CPUs aware of the end of the old grace period.
 	if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD))
 		on_each_cpu(rcu_strict_gp_boundary, NULL, 0);
@@ -3559,6 +3856,38 @@ static int rcu_blocking_is_gp(void)
 	return true;
 }
 
+/*
+ * Helper function for the synchronize_rcu() API.
+ */
+static void synchronize_rcu_normal(void)
+{
+	struct rcu_synchronize rs;
+
+	if (!READ_ONCE(rcu_normal_wake_from_gp)) {
+		wait_rcu_gp(call_rcu_hurry);
+		return;
+	}
+
+	init_rcu_head_on_stack(&rs.head);
+	init_completion(&rs.completion);
+
+	/*
+	 * This code might be preempted, therefore take a GP
+	 * snapshot before adding a request.
+	 */
+	if (IS_ENABLED(CONFIG_PROVE_RCU))
+		rs.head.func = (void *) get_state_synchronize_rcu();
+
+	rcu_sr_normal_add_req(&rs);
+
+	/* Kick a GP and start waiting. */
+	(void) start_poll_synchronize_rcu();
+
+	/* Now we can wait. */
+	wait_for_completion(&rs.completion);
+	destroy_rcu_head_on_stack(&rs.head);
+}
+
 /**
  * synchronize_rcu - wait until a grace period has elapsed.
  *
@@ -3610,7 +3939,7 @@ void synchronize_rcu(void)
 		if (rcu_gp_is_expedited())
 			synchronize_rcu_expedited();
 		else
-			wait_rcu_gp(call_rcu_hurry);
+			synchronize_rcu_normal();
 		return;
 	}
 
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 6b83537480b1..8a1d9c8bd9f7 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -930,7 +930,7 @@ void synchronize_rcu_expedited(void)
 
 	/* If expedited grace periods are prohibited, fall back to normal. */
 	if (rcu_gp_is_normal()) {
-		wait_rcu_gp(call_rcu_hurry);
+		synchronize_rcu_normal();
 		return;
 	}
 
-- 
2.39.2



* [PATCH 36/48] rcu: Add a trace event for synchronize_rcu_normal()
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (34 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 35/48] rcu: Reduce synchronize_rcu() latency Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 37/48] rcu: Support direct wake-up of synchronize_rcu() users Uladzislau Rezki (Sony)
                   ` (11 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

Add an rcu_sr_normal() trace event. It takes three arguments:
the first one is the name of the RCU flavour, the second one is
a pointer to the rcu_head identifying the user which triggered
synchronize_rcu_normal(), and the last one is an event string.

synchronize_rcu_normal() contains two trace points: one on entry,
when a new request is registered, and one on exit, when the
request has completed.

Please note, CONFIG_RCU_TRACE=y is required to activate these traces.
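
With the event compiled in, it can be enabled at run time via
tracefs (a minimal usage sketch, assuming tracefs is mounted at
its default location):

	echo 1 > /sys/kernel/tracing/events/rcu/rcu_sr_normal/enable

Each synchronize_rcu_normal() call then emits a pair of lines in
the TP_printk() format shown in the diff below, for example
(illustrative values, standard trace-line prefixes omitted):

	rcu_sched rhp=0xffffc90000c7bd58 event=request
	rcu_sched rhp=0xffffc90000c7bd58 event=complete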

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 include/trace/events/rcu.h | 27 +++++++++++++++++++++++++++
 kernel/rcu/tree.c          |  7 ++++++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 2ef9c719772a..31b3e0d3e65f 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -707,6 +707,33 @@ TRACE_EVENT_RCU(rcu_invoke_kfree_bulk_callback,
 		__entry->rcuname, __entry->p, __entry->nr_records)
 );
 
+/*
+ * Tracepoint for normal synchronize_rcu() states. The first argument
+ * is the RCU flavor, the second argument is a pointer to the rcu_head,
+ * and the last one is an event.
+ */
+TRACE_EVENT_RCU(rcu_sr_normal,
+
+	TP_PROTO(const char *rcuname, struct rcu_head *rhp, const char *srevent),
+
+	TP_ARGS(rcuname, rhp, srevent),
+
+	TP_STRUCT__entry(
+		__field(const char *, rcuname)
+		__field(void *, rhp)
+		__field(const char *, srevent)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->rhp = rhp;
+		__entry->srevent = srevent;
+	),
+
+	TP_printk("%s rhp=0x%p event=%s",
+		__entry->rcuname, __entry->rhp, __entry->srevent)
+);
+
 /*
  * Tracepoint for exiting rcu_do_batch after RCU callbacks have been
  * invoked.  The first argument is the name of the RCU flavor,
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f65255205e44..2e1c5be6d64b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3863,9 +3863,11 @@ static void synchronize_rcu_normal(void)
 {
 	struct rcu_synchronize rs;
 
+	trace_rcu_sr_normal(rcu_state.name, &rs.head, TPS("request"));
+
 	if (!READ_ONCE(rcu_normal_wake_from_gp)) {
 		wait_rcu_gp(call_rcu_hurry);
-		return;
+		goto trace_complete_out;
 	}
 
 	init_rcu_head_on_stack(&rs.head);
@@ -3886,6 +3888,9 @@ static void synchronize_rcu_normal(void)
 	/* Now we can wait. */
 	wait_for_completion(&rs.completion);
 	destroy_rcu_head_on_stack(&rs.head);
+
+trace_complete_out:
+	trace_rcu_sr_normal(rcu_state.name, &rs.head, TPS("complete"));
 }
 
 /**
-- 
2.39.2



* [PATCH 37/48] rcu: Support direct wake-up of synchronize_rcu() users
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (35 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 36/48] rcu: Add a trace event for synchronize_rcu_normal() Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 38/48] rcu: Allocate WQ with WQ_MEM_RECLAIM bit set Uladzislau Rezki (Sony)
                   ` (10 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

This patch introduces a small enhancement which allows the
gp-kthread to directly wake up synchronize_rcu() callers upon
grace-period completion.

The number of directly awakened clients is limited by a hard-coded
threshold. The remainder, if any is left, is deferred to the main
worker, as sketched below.
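
A simplified sketch of the added gp-kthread path (an illustrative
fragment only: wait_tail is the current GP's wait head from the
diff below, wake_one() is a placeholder for
rcu_sr_normal_complete(), and the llist bookkeeping is elided):

	struct llist_node *rcu, *next;
	int done = 0;

	llist_for_each_safe(rcu, next, wait_tail->next) {
		if (rcu_sr_is_wait_head(rcu))
			break;	/* Stop at the previous GP's wait head. */

		wake_one(rcu);	/* Direct wake-up by the gp-kthread. */
		if (++done == SR_MAX_USERS_WAKE_FROM_GP)
			break;	/* Defer the remainder to the kworker. */
	}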

Link: https://lore.kernel.org/lkml/Zd0ZtNu+Rt0qXkfS@lothringen/

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 24 +++++++++++++++++++++++-
 kernel/rcu/tree.h |  6 ++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2e1c5be6d64b..2a270abade4d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1645,7 +1645,8 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
  */
 static void rcu_sr_normal_gp_cleanup(void)
 {
-	struct llist_node *wait_tail;
+	struct llist_node *wait_tail, *next, *rcu;
+	int done = 0;
 
 	wait_tail = rcu_state.srs_wait_tail;
 	if (wait_tail == NULL)
@@ -1653,11 +1654,32 @@ static void rcu_sr_normal_gp_cleanup(void)
 
 	rcu_state.srs_wait_tail = NULL;
 	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_wait_tail);
+	WARN_ON_ONCE(!rcu_sr_is_wait_head(wait_tail));
+
+	/*
+	 * Process the (a) and (d) cases.  See the illustration above.
+	 */
+	llist_for_each_safe(rcu, next, wait_tail->next) {
+		if (rcu_sr_is_wait_head(rcu))
+			break;
+
+		rcu_sr_normal_complete(rcu);
+		// It can be the last node, so update the next pointer at this step.
+		wait_tail->next = next;
+
+		if (++done == SR_MAX_USERS_WAKE_FROM_GP)
+			break;
+	}
 
 	// concurrent sr_normal_gp_cleanup work might observe this update.
 	smp_store_release(&rcu_state.srs_done_tail, wait_tail);
 	ASSERT_EXCLUSIVE_WRITER(rcu_state.srs_done_tail);
 
+	/*
+	 * Schedule a work in order to perform final processing of
+	 * outstanding users (if any are left) and to release the
+	 * wait-heads added by the rcu_sr_normal_gp_init() call.
+	 */
 	schedule_work(&rcu_state.srs_cleanup_work);
 }
 
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index b942b9437438..2832787cee1d 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -315,6 +315,12 @@ do {									\
 	__set_current_state(TASK_RUNNING);				\
 } while (0)
 
+/*
+ * The maximum number of synchronize_rcu() users which are
+ * awakened directly by the rcu_gp_kthread(). The remainder
+ * is deferred to the main worker.
+ */
+#define SR_MAX_USERS_WAKE_FROM_GP 5
 #define SR_NORMAL_GP_WAIT_HEAD_MAX 5
 
 struct sr_wait_node {
-- 
2.39.2



* [PATCH 38/48] rcu: Allocate WQ with WQ_MEM_RECLAIM bit set
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (36 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 37/48] rcu: Support direct wake-up of synchronize_rcu() users Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 39/48] rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE() Uladzislau Rezki (Sony)
                   ` (9 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

synchronize_rcu() users have to be processed regardless of
memory pressure, so our private workqueue needs to have at least
one execution context, which the WQ_MEM_RECLAIM flag guarantees
by providing a dedicated rescuer thread.
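
A minimal usage sketch of that guarantee (illustrative; it mirrors
the allocation and queueing added in the diff below):

	static struct workqueue_struct *sync_wq;

	sync_wq = alloc_workqueue("sync_wq", WQ_MEM_RECLAIM, 0);
	WARN_ON(!sync_wq);

	/*
	 * Work queued here can always make forward progress via the
	 * workqueue's rescuer thread, even when new kworkers cannot
	 * be created under memory pressure.
	 */
	queue_work(sync_wq, &rcu_state.srs_cleanup_work);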

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/tree.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2a270abade4d..1d5c000e5c7a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1582,6 +1582,7 @@ static void rcu_sr_put_wait_head(struct llist_node *node)
 /* Disabled by default. */
 static int rcu_normal_wake_from_gp;
 module_param(rcu_normal_wake_from_gp, int, 0644);
+static struct workqueue_struct *sync_wq;
 
 static void rcu_sr_normal_complete(struct llist_node *node)
 {
@@ -1680,7 +1681,7 @@ static void rcu_sr_normal_gp_cleanup(void)
 	 * of outstanding users(if still left) and releasing wait-heads
 	 * added by rcu_sr_normal_gp_init() call.
 	 */
-	schedule_work(&rcu_state.srs_cleanup_work);
+	queue_work(sync_wq, &rcu_state.srs_cleanup_work);
 }
 
 /*
@@ -5585,6 +5586,9 @@ void __init rcu_init(void)
 	rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);
 	WARN_ON(!rcu_gp_wq);
 
+	sync_wq = alloc_workqueue("sync_wq", WQ_MEM_RECLAIM, 0);
+	WARN_ON(!sync_wq);
+
 	/* Fill in default value for rcutree.qovld boot parameter. */
 	/* -After- the rcu_node ->lock fields are initialized! */
 	if (qovld < 0)
-- 
2.39.2



* [PATCH 39/48] rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE()
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (37 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 38/48] rcu: Allocate WQ with WQ_MEM_RECLAIM bit set Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 40/48] rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment Uladzislau Rezki (Sony)
                   ` (8 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Linus Torvalds

From: "Paul E. McKenney" <paulmck@kernel.org>

The rcu_torture_pipe_update_one() cannot run concurrently with any updates
of ->rtort_pipe_count, so this commit removes the extraneous READ_ONCE()
from the read from this field.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/lkml/CAHk-=wiX_zF5Mpt8kUm_LFQpYY-mshrXJPOe+wKNwiVhEUcU9g@mail.gmail.com/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index d8c12eba35b7..6b821a7037b0 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -461,7 +461,7 @@ rcu_torture_pipe_update_one(struct rcu_torture *rp)
 		WRITE_ONCE(rp->rtort_chkp, NULL);
 		smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire().
 	}
-	i = READ_ONCE(rp->rtort_pipe_count);
+	i = rp->rtort_pipe_count;
 	if (i > RCU_TORTURE_PIPE_LEN)
 		i = RCU_TORTURE_PIPE_LEN;
 	atomic_inc(&rcu_torture_wcount[i]);
-- 
2.39.2



* [PATCH 40/48] rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (38 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 39/48] rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE() Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 41/48] rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading Uladzislau Rezki (Sony)
                   ` (7 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Linus Torvalds

From: "Paul E. McKenney" <paulmck@kernel.org>

The "pipe_count > RCU_TORTURE_PIPE_LEN" check has a comment saying "Should
not happen, but...".  This is only true when testing an RCU whose grace
periods are always long enough.  This commit therefore fixes this comment.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/lkml/CAHk-=wi7rJ-eGq+xaxVfzFEgbL9tdf6Kc8Z89rCpfcQOKm74Tw@mail.gmail.com/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 6b821a7037b0..0cb5452ecd94 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2000,7 +2000,8 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp, long myid)
 	preempt_disable();
 	pipe_count = READ_ONCE(p->rtort_pipe_count);
 	if (pipe_count > RCU_TORTURE_PIPE_LEN) {
-		/* Should not happen, but... */
+		// Should not happen in a correct RCU implementation,
+		// happens quite often for torture_type=busted.
 		pipe_count = RCU_TORTURE_PIPE_LEN;
 	}
 	completed = cur_ops->get_gp_seq();
-- 
2.39.2



* [PATCH 41/48] rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (39 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 40/48] rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 42/48] rcutorture: Use the gp_kthread_dbg operation specified by cur_ops Uladzislau Rezki (Sony)
                   ` (6 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	linke li, Linus Torvalds

From: linke li <lilinke99@qq.com>

Currently, the rcu_torture_pipe_update_one() writes the value (i + 1)
to rp->rtort_pipe_count, then immediately re-reads it in order to compare
it to RCU_TORTURE_PIPE_LEN.  This re-read is pointless because no other
update to rp->rtort_pipe_count can occur at this point.  This commit
therefore re-uses the stored (i + 1) value in the comparison
instead of re-reading rp->rtort_pipe_count.

Signed-off-by: linke li <lilinke99@qq.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 0cb5452ecd94..dd7d5ba45740 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -467,7 +467,7 @@ rcu_torture_pipe_update_one(struct rcu_torture *rp)
 	atomic_inc(&rcu_torture_wcount[i]);
 	WRITE_ONCE(rp->rtort_pipe_count, i + 1);
 	ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count);
-	if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
+	if (i + 1 >= RCU_TORTURE_PIPE_LEN) {
 		rp->rtort_mbtest = 0;
 		return true;
 	}
-- 
2.39.2



* [PATCH 42/48] rcutorture: Use the gp_kthread_dbg operation specified by cur_ops
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (40 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 41/48] rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 43/48] rcutorture: Make rcutorture support print rcu-tasks gp state Uladzislau Rezki (Sony)
                   ` (5 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

Despite there being a cur_ops->gp_kthread_dbg(), rcu_torture_writer()
unconditionally invokes vanilla RCU's show_rcu_gp_kthreads().  This is not
at all helpful when some other flavor of RCU is being tested.  This commit
therefore makes rcu_torture_writer() invoke cur_ops->gp_kthread_dbg()
for RCU implementations providing this function.

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index dd7d5ba45740..2f43d31fb7a5 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1589,7 +1589,8 @@ rcu_torture_writer(void *arg)
 				if (list_empty(&rcu_tortures[i].rtort_free) &&
 				    rcu_access_pointer(rcu_torture_current) != &rcu_tortures[i]) {
 					tracing_off();
-					show_rcu_gp_kthreads();
+					if (cur_ops->gp_kthread_dbg)
+						cur_ops->gp_kthread_dbg();
 					WARN(1, "%s: rtort_pipe_count: %d\n", __func__, rcu_tortures[i].rtort_pipe_count);
 					rcu_ftrace_dump(DUMP_ALL);
 				}
-- 
2.39.2



* [PATCH 43/48] rcutorture: Make rcutorture support print rcu-tasks gp state
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (41 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 42/48] rcutorture: Use the gp_kthread_dbg operation specified by cur_ops Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 44/48] rcutorture: Removing redundant function pointer initialization Uladzislau Rezki (Sony)
                   ` (4 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

This commit makes the rcu-tasks-related rcutorture tests support
printing of rcu-tasks grace-period state when a writer stall occurs
or at the end of the rcutorture test, and adds an
rcu_ops->get_gp_data() operation to simplify the acquisition of
grace-period state for the different types of rcutorture tests, as
sketched below.
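
With the new hook in place, each call site reduces to a single
flavor-agnostic pattern (a sketch mirroring the call sites in the
diff below):

	int flags = 0;
	unsigned long gp_seq = 0;

	/* NULL for flavors that have no grace-period state to report. */
	if (cur_ops->get_gp_data)
		cur_ops->get_gp_data(&flags, &gp_seq);

	pr_alert("%s: grace-period state: g%ld f%#x\n",
		 cur_ops->name, (long)gp_seq, flags);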

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcu.h        | 20 ++++++++++----------
 kernel/rcu/rcutorture.c | 26 ++++++++++++++++++--------
 kernel/rcu/srcutree.c   |  5 +----
 kernel/rcu/tasks.h      | 21 +++++++++++++++++++++
 kernel/rcu/tree.c       | 13 +++----------
 5 files changed, 53 insertions(+), 32 deletions(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 86fce206560e..38238e595a61 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -522,12 +522,18 @@ static inline void show_rcu_tasks_gp_kthreads(void) {}
 
 #ifdef CONFIG_TASKS_RCU
 struct task_struct *get_rcu_tasks_gp_kthread(void);
+void rcu_tasks_get_gp_data(int *flags, unsigned long *gp_seq);
 #endif // # ifdef CONFIG_TASKS_RCU
 
 #ifdef CONFIG_TASKS_RUDE_RCU
 struct task_struct *get_rcu_tasks_rude_gp_kthread(void);
+void rcu_tasks_rude_get_gp_data(int *flags, unsigned long *gp_seq);
 #endif // # ifdef CONFIG_TASKS_RUDE_RCU
 
+#ifdef CONFIG_TASKS_TRACE_RCU
+void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq);
+#endif
+
 #ifdef CONFIG_TASKS_RCU_GENERIC
 void tasks_cblist_init_generic(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
@@ -557,8 +563,7 @@ static inline void rcu_set_jiffies_lazy_flush(unsigned long j) { }
 #endif
 
 #if defined(CONFIG_TREE_RCU)
-void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
-			    unsigned long *gp_seq);
+void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq);
 void do_trace_rcu_torture_read(const char *rcutorturename,
 			       struct rcu_head *rhp,
 			       unsigned long secs,
@@ -566,8 +571,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
 			       unsigned long c);
 void rcu_gp_set_torture_wait(int duration);
 #else
-static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
-					  int *flags, unsigned long *gp_seq)
+static inline void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq)
 {
 	*flags = 0;
 	*gp_seq = 0;
@@ -587,20 +591,16 @@ static inline void rcu_gp_set_torture_wait(int duration) { }
 
 #ifdef CONFIG_TINY_SRCU
 
-static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
-					   struct srcu_struct *sp, int *flags,
+static inline void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags,
 					   unsigned long *gp_seq)
 {
-	if (test_type != SRCU_FLAVOR)
-		return;
 	*flags = 0;
 	*gp_seq = sp->srcu_idx;
 }
 
 #elif defined(CONFIG_TREE_SRCU)
 
-void srcutorture_get_gp_data(enum rcutorture_type test_type,
-			     struct srcu_struct *sp, int *flags,
+void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags,
 			     unsigned long *gp_seq);
 
 #endif
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 2f43d31fb7a5..85ff8a32f75a 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -381,6 +381,7 @@ struct rcu_torture_ops {
 	void (*gp_kthread_dbg)(void);
 	bool (*check_boost_failed)(unsigned long gp_state, int *cpup);
 	int (*stall_dur)(void);
+	void (*get_gp_data)(int *flags, unsigned long *gp_seq);
 	long cbflood_max;
 	int irq_capable;
 	int can_boost;
@@ -569,6 +570,7 @@ static struct rcu_torture_ops rcu_ops = {
 	.gp_kthread_dbg		= show_rcu_gp_kthreads,
 	.check_boost_failed	= rcu_check_boost_fail,
 	.stall_dur		= rcu_jiffies_till_stall_check,
+	.get_gp_data		= rcutorture_get_gp_data,
 	.irq_capable		= 1,
 	.can_boost		= IS_ENABLED(CONFIG_RCU_BOOST),
 	.extendables		= RCUTORTURE_MAX_EXTEND,
@@ -628,6 +630,11 @@ static struct srcu_struct srcu_ctld;
 static struct srcu_struct *srcu_ctlp = &srcu_ctl;
 static struct rcu_torture_ops srcud_ops;
 
+static void srcu_get_gp_data(int *flags, unsigned long *gp_seq)
+{
+	srcutorture_get_gp_data(srcu_ctlp, flags, gp_seq);
+}
+
 static int srcu_torture_read_lock(void)
 {
 	if (cur_ops == &srcud_ops)
@@ -736,6 +743,7 @@ static struct rcu_torture_ops srcu_ops = {
 	.call		= srcu_torture_call,
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
+	.get_gp_data	= srcu_get_gp_data,
 	.cbflood_max	= 50000,
 	.irq_capable	= 1,
 	.no_pi_lock	= IS_ENABLED(CONFIG_TINY_SRCU),
@@ -774,6 +782,7 @@ static struct rcu_torture_ops srcud_ops = {
 	.call		= srcu_torture_call,
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
+	.get_gp_data	= srcu_get_gp_data,
 	.cbflood_max	= 50000,
 	.irq_capable	= 1,
 	.no_pi_lock	= IS_ENABLED(CONFIG_TINY_SRCU),
@@ -882,6 +891,7 @@ static struct rcu_torture_ops tasks_ops = {
 	.call		= call_rcu_tasks,
 	.cb_barrier	= rcu_barrier_tasks,
 	.gp_kthread_dbg	= show_rcu_tasks_classic_gp_kthread,
+	.get_gp_data	= rcu_tasks_get_gp_data,
 	.fqs		= NULL,
 	.stats		= NULL,
 	.irq_capable	= 1,
@@ -922,6 +932,7 @@ static struct rcu_torture_ops tasks_rude_ops = {
 	.call		= call_rcu_tasks_rude,
 	.cb_barrier	= rcu_barrier_tasks_rude,
 	.gp_kthread_dbg	= show_rcu_tasks_rude_gp_kthread,
+	.get_gp_data	= rcu_tasks_rude_get_gp_data,
 	.cbflood_max	= 50000,
 	.fqs		= NULL,
 	.stats		= NULL,
@@ -974,6 +985,7 @@ static struct rcu_torture_ops tasks_tracing_ops = {
 	.call		= call_rcu_tasks_trace,
 	.cb_barrier	= rcu_barrier_tasks_trace,
 	.gp_kthread_dbg	= show_rcu_tasks_trace_gp_kthread,
+	.get_gp_data    = rcu_tasks_trace_get_gp_data,
 	.cbflood_max	= 50000,
 	.fqs		= NULL,
 	.stats		= NULL,
@@ -2264,10 +2276,8 @@ rcu_torture_stats_print(void)
 		int __maybe_unused flags = 0;
 		unsigned long __maybe_unused gp_seq = 0;
 
-		rcutorture_get_gp_data(cur_ops->ttype,
-				       &flags, &gp_seq);
-		srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp,
-					&flags, &gp_seq);
+		if (cur_ops->get_gp_data)
+			cur_ops->get_gp_data(&flags, &gp_seq);
 		wtp = READ_ONCE(writer_task);
 		pr_alert("??? Writer stall state %s(%d) g%lu f%#x ->state %#x cpu %d\n",
 			 rcu_torture_writer_state_getname(),
@@ -3390,8 +3400,8 @@ rcu_torture_cleanup(void)
 		fakewriter_tasks = NULL;
 	}
 
-	rcutorture_get_gp_data(cur_ops->ttype, &flags, &gp_seq);
-	srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp, &flags, &gp_seq);
+	if (cur_ops->get_gp_data)
+		cur_ops->get_gp_data(&flags, &gp_seq);
 	pr_alert("%s:  End-test grace-period state: g%ld f%#x total-gps=%ld\n",
 		 cur_ops->name, (long)gp_seq, flags,
 		 rcutorture_seq_diff(gp_seq, start_gp_seq));
@@ -3762,8 +3772,8 @@ rcu_torture_init(void)
 			nrealreaders = 1;
 	}
 	rcu_torture_print_module_parms(cur_ops, "Start of test");
-	rcutorture_get_gp_data(cur_ops->ttype, &flags, &gp_seq);
-	srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp, &flags, &gp_seq);
+	if (cur_ops->get_gp_data)
+		cur_ops->get_gp_data(&flags, &gp_seq);
 	start_gp_seq = gp_seq;
 	pr_alert("%s:  Start-test grace-period state: g%ld f%#x\n",
 		 cur_ops->name, (long)gp_seq, flags);
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index e4d673fc30f4..bc4b58b0204e 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1826,12 +1826,9 @@ static void process_srcu(struct work_struct *work)
 	srcu_reschedule(ssp, curdelay);
 }
 
-void srcutorture_get_gp_data(enum rcutorture_type test_type,
-			     struct srcu_struct *ssp, int *flags,
+void srcutorture_get_gp_data(struct srcu_struct *ssp, int *flags,
 			     unsigned long *gp_seq)
 {
-	if (test_type != SRCU_FLAVOR)
-		return;
 	*flags = 0;
 	*gp_seq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
 }
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 147b5945d67a..a1af7dadc0f7 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1178,6 +1178,13 @@ struct task_struct *get_rcu_tasks_gp_kthread(void)
 }
 EXPORT_SYMBOL_GPL(get_rcu_tasks_gp_kthread);
 
+void rcu_tasks_get_gp_data(int *flags, unsigned long *gp_seq)
+{
+	*flags = 0;
+	*gp_seq = rcu_seq_current(&rcu_tasks.tasks_gp_seq);
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_get_gp_data);
+
 /*
  * Protect against tasklist scan blind spot while the task is exiting and
  * may be removed from the tasklist.  Do this by adding the task to yet
@@ -1358,6 +1365,13 @@ struct task_struct *get_rcu_tasks_rude_gp_kthread(void)
 }
 EXPORT_SYMBOL_GPL(get_rcu_tasks_rude_gp_kthread);
 
+void rcu_tasks_rude_get_gp_data(int *flags, unsigned long *gp_seq)
+{
+	*flags = 0;
+	*gp_seq = rcu_seq_current(&rcu_tasks_rude.tasks_gp_seq);
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_rude_get_gp_data);
+
 #endif /* #ifdef CONFIG_TASKS_RUDE_RCU */
 
 ////////////////////////////////////////////////////////////////////////
@@ -2010,6 +2024,13 @@ struct task_struct *get_rcu_tasks_trace_gp_kthread(void)
 }
 EXPORT_SYMBOL_GPL(get_rcu_tasks_trace_gp_kthread);
 
+void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq)
+{
+	*flags = 0;
+	*gp_seq = rcu_seq_current(&rcu_tasks_trace.tasks_gp_seq);
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_trace_get_gp_data);
+
 #else /* #ifdef CONFIG_TASKS_TRACE_RCU */
 static void exit_tasks_rcu_finish_trace(struct task_struct *t) { }
 #endif /* #else #ifdef CONFIG_TASKS_TRACE_RCU */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d9642dd06c25..60e79ed73700 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -508,17 +508,10 @@ static struct rcu_node *rcu_get_root(void)
 /*
  * Send along grace-period-related data for rcutorture diagnostics.
  */
-void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
-			    unsigned long *gp_seq)
+void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq)
 {
-	switch (test_type) {
-	case RCU_FLAVOR:
-		*flags = READ_ONCE(rcu_state.gp_flags);
-		*gp_seq = rcu_seq_current(&rcu_state.gp_seq);
-		break;
-	default:
-		break;
-	}
+	*flags = READ_ONCE(rcu_state.gp_flags);
+	*gp_seq = rcu_seq_current(&rcu_state.gp_seq);
 }
 EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
 
-- 
2.39.2



* [PATCH 44/48] rcutorture: Removing redundant function pointer initialization
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (42 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 43/48] rcutorture: Make rcutorture support print rcu-tasks gp state Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 45/48] rcutorture: Make stall-tasks directly exit when rcutorture tests end Uladzislau Rezki (Sony)
                   ` (3 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

The rcu_torture_ops structures are defined with static storage
duration, so any function-pointer member that is not explicitly
set defaults to NULL. This commit therefore removes the
pre-existing explicit initializations of function pointers to NULL.

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 85ff8a32f75a..3f9c3766f52b 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -566,7 +566,6 @@ static struct rcu_torture_ops rcu_ops = {
 	.call			= call_rcu_hurry,
 	.cb_barrier		= rcu_barrier,
 	.fqs			= rcu_force_quiescent_state,
-	.stats			= NULL,
 	.gp_kthread_dbg		= show_rcu_gp_kthreads,
 	.check_boost_failed	= rcu_check_boost_fail,
 	.stall_dur		= rcu_jiffies_till_stall_check,
@@ -614,9 +613,6 @@ static struct rcu_torture_ops rcu_busted_ops = {
 	.sync		= synchronize_rcu_busted,
 	.exp_sync	= synchronize_rcu_busted,
 	.call		= call_rcu_busted,
-	.cb_barrier	= NULL,
-	.fqs		= NULL,
-	.stats		= NULL,
 	.irq_capable	= 1,
 	.name		= "busted"
 };
@@ -847,8 +843,6 @@ static struct rcu_torture_ops trivial_ops = {
 	.get_gp_seq	= rcu_no_completed,
 	.sync		= synchronize_rcu_trivial,
 	.exp_sync	= synchronize_rcu_trivial,
-	.fqs		= NULL,
-	.stats		= NULL,
 	.irq_capable	= 1,
 	.name		= "trivial"
 };
@@ -892,8 +886,6 @@ static struct rcu_torture_ops tasks_ops = {
 	.cb_barrier	= rcu_barrier_tasks,
 	.gp_kthread_dbg	= show_rcu_tasks_classic_gp_kthread,
 	.get_gp_data	= rcu_tasks_get_gp_data,
-	.fqs		= NULL,
-	.stats		= NULL,
 	.irq_capable	= 1,
 	.slow_gps	= 1,
 	.name		= "tasks"
@@ -934,8 +926,6 @@ static struct rcu_torture_ops tasks_rude_ops = {
 	.gp_kthread_dbg	= show_rcu_tasks_rude_gp_kthread,
 	.get_gp_data	= rcu_tasks_rude_get_gp_data,
 	.cbflood_max	= 50000,
-	.fqs		= NULL,
-	.stats		= NULL,
 	.irq_capable	= 1,
 	.name		= "tasks-rude"
 };
@@ -987,8 +977,6 @@ static struct rcu_torture_ops tasks_tracing_ops = {
 	.gp_kthread_dbg	= show_rcu_tasks_trace_gp_kthread,
 	.get_gp_data    = rcu_tasks_trace_get_gp_data,
 	.cbflood_max	= 50000,
-	.fqs		= NULL,
-	.stats		= NULL,
 	.irq_capable	= 1,
 	.slow_gps	= 1,
 	.name		= "tasks-tracing"
-- 
2.39.2



* [PATCH 45/48] rcutorture: Make stall-tasks directly exit when rcutorture tests end
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (43 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 44/48] rcutorture: Removing redundant function pointer initialization Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 46/48] rcutorture: Fix invalid context warning when enable srcu barrier testing Uladzislau Rezki (Sony)
                   ` (2 subsequent siblings)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

When the rcutorture tests start to exit, rcu_torture_cleanup() is
invoked to stop kthreads and release resources. If stall-task
kthreads exist, the cpu-stall has already started, and
rcutorture.stall_cpu is set to a large value, rcu_torture_cleanup()
will be blocked for a long time and a hung-task splat may occur.
This commit therefore adds a kthread_should_stop() check to the
cpu-stall loop: when the rcutorture tests end, there is no need to
wait for the cpu-stall period to elapse, so the kthread exits
directly.

Use the following command to test:

insmod rcutorture.ko torture_type=srcu fwd_progress=0 stat_interval=4
stall_cpu_block=1 stall_cpu=200 stall_cpu_holdoff=10 read_exit_burst=0
object_debug=1
rmmod rcutorture

[15361.918610] INFO: task rmmod:878 blocked for more than 122 seconds.
[15361.918613]       Tainted: G        W
6.8.0-rc2-yoctodev-standard+ #25
[15361.918615] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[15361.918616] task:rmmod           state:D stack:0     pid:878
tgid:878   ppid:773    flags:0x00004002
[15361.918621] Call Trace:
[15361.918623]  <TASK>
[15361.918626]  __schedule+0xc0d/0x28f0
[15361.918631]  ? __pfx___schedule+0x10/0x10
[15361.918635]  ? rcu_is_watching+0x19/0xb0
[15361.918638]  ? schedule+0x1f6/0x290
[15361.918642]  ? __pfx_lock_release+0x10/0x10
[15361.918645]  ? schedule+0xc9/0x290
[15361.918648]  ? schedule+0xc9/0x290
[15361.918653]  ? trace_preempt_off+0x54/0x100
[15361.918657]  ? schedule+0xc9/0x290
[15361.918661]  schedule+0xd0/0x290
[15361.918665]  schedule_timeout+0x56d/0x7d0
[15361.918669]  ? debug_smp_processor_id+0x1b/0x30
[15361.918672]  ? rcu_is_watching+0x19/0xb0
[15361.918676]  ? __pfx_schedule_timeout+0x10/0x10
[15361.918679]  ? debug_smp_processor_id+0x1b/0x30
[15361.918683]  ? rcu_is_watching+0x19/0xb0
[15361.918686]  ? wait_for_completion+0x179/0x4c0
[15361.918690]  ? __pfx_lock_release+0x10/0x10
[15361.918693]  ? __kasan_check_write+0x18/0x20
[15361.918696]  ? wait_for_completion+0x9d/0x4c0
[15361.918700]  ? _raw_spin_unlock_irq+0x36/0x50
[15361.918703]  ? wait_for_completion+0x179/0x4c0
[15361.918707]  ? _raw_spin_unlock_irq+0x36/0x50
[15361.918710]  ? wait_for_completion+0x179/0x4c0
[15361.918714]  ? trace_preempt_on+0x54/0x100
[15361.918718]  ? wait_for_completion+0x179/0x4c0
[15361.918723]  wait_for_completion+0x181/0x4c0
[15361.918728]  ? __pfx_wait_for_completion+0x10/0x10
[15361.918738]  kthread_stop+0x152/0x470
[15361.918742]  _torture_stop_kthread+0x44/0xc0 [torture
7af7f9cbba28271a10503b653f9e05d518fbc8c3]
[15361.918752]  rcu_torture_cleanup+0x2ac/0xe90 [rcutorture
f2cb1f556ee7956270927183c4c2c7749a336529]
[15361.918766]  ? __pfx_rcu_torture_cleanup+0x10/0x10 [rcutorture
f2cb1f556ee7956270927183c4c2c7749a336529]
[15361.918777]  ? __kasan_check_write+0x18/0x20
[15361.918781]  ? __mutex_unlock_slowpath+0x17c/0x670
[15361.918789]  ? __might_fault+0xcd/0x180
[15361.918793]  ? find_module_all+0x104/0x1d0
[15361.918799]  __x64_sys_delete_module+0x2a4/0x3f0
[15361.918803]  ? __pfx___x64_sys_delete_module+0x10/0x10
[15361.918807]  ? syscall_exit_to_user_mode+0x149/0x280

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 3f9c3766f52b..456185d9e6c0 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2489,8 +2489,8 @@ static int rcu_torture_stall(void *args)
 			preempt_disable();
 		pr_alert("%s start on CPU %d.\n",
 			  __func__, raw_smp_processor_id());
-		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
-				    stop_at))
+		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) &&
+		       !kthread_should_stop())
 			if (stall_cpu_block) {
 #ifdef CONFIG_PREEMPTION
 				preempt_schedule();
-- 
2.39.2



* [PATCH 46/48] rcutorture: Fix invalid context warning when enable srcu barrier testing
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (44 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 45/48] rcutorture: Make stall-tasks directly exit when rcutorture tests end Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 47/48] torture: Scale --do-kvfree test time Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 48/48] rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test Uladzislau Rezki (Sony)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

When torture_type is set to srcu or srcud and cb_barrier is
non-zero, running the rcutorture test will trigger the
following warning:

[  163.910989][    C1] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[  163.910994][    C1] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
[  163.910999][    C1] preempt_count: 10001, expected: 0
[  163.911002][    C1] RCU nest depth: 0, expected: 0
[  163.911005][    C1] INFO: lockdep is turned off.
[  163.911007][    C1] irq event stamp: 30964
[  163.911010][    C1] hardirqs last  enabled at (30963): [<ffffffffabc7df52>] do_idle+0x362/0x500
[  163.911018][    C1] hardirqs last disabled at (30964): [<ffffffffae616eff>] sysvec_call_function_single+0xf/0xd0
[  163.911025][    C1] softirqs last  enabled at (0): [<ffffffffabb6475f>] copy_process+0x16ff/0x6580
[  163.911033][    C1] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  163.911038][    C1] Preemption disabled at:
[  163.911039][    C1] [<ffffffffacf1964b>] stack_depot_save_flags+0x24b/0x6c0
[  163.911063][    C1] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W          6.8.0-rc4-rt4-yocto-preempt-rt+ #3 1e39aa9a737dd024a3275c4f835a872f673a7d3a
[  163.911071][    C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[  163.911075][    C1] Call Trace:
[  163.911078][    C1]  <IRQ>
[  163.911080][    C1]  dump_stack_lvl+0x88/0xd0
[  163.911089][    C1]  dump_stack+0x10/0x20
[  163.911095][    C1]  __might_resched+0x36f/0x530
[  163.911105][    C1]  rt_spin_lock+0x82/0x1c0
[  163.911112][    C1]  spin_lock_irqsave_ssp_contention+0xb8/0x100
[  163.911121][    C1]  srcu_gp_start_if_needed+0x782/0xf00
[  163.911128][    C1]  ? _raw_spin_unlock_irqrestore+0x46/0x70
[  163.911136][    C1]  ? debug_object_active_state+0x336/0x470
[  163.911148][    C1]  ? __pfx_srcu_gp_start_if_needed+0x10/0x10
[  163.911156][    C1]  ? __pfx_lock_release+0x10/0x10
[  163.911165][    C1]  ? __pfx_rcu_torture_barrier_cbf+0x10/0x10
[  163.911188][    C1]  __call_srcu+0x9f/0xe0
[  163.911196][    C1]  call_srcu+0x13/0x20
[  163.911201][    C1]  srcu_torture_call+0x1b/0x30
[  163.911224][    C1]  rcu_torture_barrier1cb+0x4a/0x60
[  163.911247][    C1]  __flush_smp_call_function_queue+0x267/0xca0
[  163.911256][    C1]  ? __pfx_rcu_torture_barrier1cb+0x10/0x10
[  163.911281][    C1]  generic_smp_call_function_single_interrupt+0x13/0x20
[  163.911288][    C1]  __sysvec_call_function_single+0x7d/0x280
[  163.911295][    C1]  sysvec_call_function_single+0x93/0xd0
[  163.911302][    C1]  </IRQ>
[  163.911304][    C1]  <TASK>
[  163.911308][    C1]  asm_sysvec_call_function_single+0x1b/0x20
[  163.911313][    C1] RIP: 0010:default_idle+0x17/0x20
[  163.911326][    C1] RSP: 0018:ffff888001997dc8 EFLAGS: 00000246
[  163.911333][    C1] RAX: 0000000000000000 RBX: dffffc0000000000 RCX: ffffffffae618b51
[  163.911337][    C1] RDX: 0000000000000000 RSI: ffffffffaea80920 RDI: ffffffffaec2de80
[  163.911342][    C1] RBP: ffff888001997dc8 R08: 0000000000000001 R09: ffffed100d740cad
[  163.911346][    C1] R10: ffffed100d740cac R11: ffff88806ba06563 R12: 0000000000000001
[  163.911350][    C1] R13: ffffffffafe460c0 R14: ffffffffafe460c0 R15: 0000000000000000
[  163.911358][    C1]  ? ct_kernel_exit.constprop.3+0x121/0x160
[  163.911369][    C1]  ? lockdep_hardirqs_on+0xc4/0x150
[  163.911376][    C1]  arch_cpu_idle+0x9/0x10
[  163.911383][    C1]  default_idle_call+0x7a/0xb0
[  163.911390][    C1]  do_idle+0x362/0x500
[  163.911398][    C1]  ? __pfx_do_idle+0x10/0x10
[  163.911404][    C1]  ? complete_with_flags+0x8b/0xb0
[  163.911416][    C1]  cpu_startup_entry+0x58/0x70
[  163.911423][    C1]  start_secondary+0x221/0x280
[  163.911430][    C1]  ? __pfx_start_secondary+0x10/0x10
[  163.911440][    C1]  secondary_startup_64_no_verify+0x17f/0x18b
[  163.911455][    C1]  </TASK>

This commit therefore uses smp_call_on_cpu() instead of
smp_call_function_single(), so that rcu_torture_barrier1cb() is
invoked in task context, where taking sleeping locks such as the
PREEMPT_RT spinlock in the splat above is legal. See the sketch
below.
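
The difference between the two APIs, in brief (an illustrative
fragment; ipi_fn(), task_fn() and arg are placeholders):

	/*
	 * smp_call_function_single() runs the handler from the IPI,
	 * that is, in hardirq context, where sleeping is forbidden:
	 */
	smp_call_function_single(cpu, ipi_fn, arg, 1);

	/*
	 * smp_call_on_cpu() runs the handler from a kworker bound to
	 * the target CPU, that is, in schedulable task context where
	 * sleeping locks may be taken:
	 */
	smp_call_on_cpu(cpu, task_fn, arg, 1);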

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 456185d9e6c0..8654e99bd4a3 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -3044,11 +3044,12 @@ static void rcu_torture_barrier_cbf(struct rcu_head *rcu)
 }
 
 /* IPI handler to get callback posted on desired CPU, if online. */
-static void rcu_torture_barrier1cb(void *rcu_void)
+static int rcu_torture_barrier1cb(void *rcu_void)
 {
 	struct rcu_head *rhp = rcu_void;
 
 	cur_ops->call(rhp, rcu_torture_barrier_cbf);
+	return 0;
 }
 
 /* kthread function to register callbacks used to test RCU barriers. */
@@ -3074,11 +3075,9 @@ static int rcu_torture_barrier_cbs(void *arg)
 		 * The above smp_load_acquire() ensures barrier_phase load
 		 * is ordered before the following ->call().
 		 */
-		if (smp_call_function_single(myid, rcu_torture_barrier1cb,
-					     &rcu, 1)) {
-			// IPI failed, so use direct call from current CPU.
+		if (smp_call_on_cpu(myid, rcu_torture_barrier1cb, &rcu, 1))
 			cur_ops->call(&rcu, rcu_torture_barrier_cbf);
-		}
+
 		if (atomic_dec_and_test(&barrier_cbs_count))
 			wake_up(&barrier_wq);
 	} while (!torture_must_stop());
-- 
2.39.2



* [PATCH 47/48] torture: Scale --do-kvfree test time
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (45 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 46/48] rcutorture: Fix invalid context warning when enable srcu barrier testing Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  2024-05-07  9:35 ` [PATCH 48/48] rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test Uladzislau Rezki (Sony)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker

From: "Paul E. McKenney" <paulmck@kernel.org>

Currently, the torture.sh --do-kvfree testing is hard-coded to ten
minutes, ignoring the --duration argument.  This commit therefore scales
this test duration the same as for the rcutorture tests.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 tools/testing/selftests/rcutorture/bin/torture.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rcutorture/bin/torture.sh b/tools/testing/selftests/rcutorture/bin/torture.sh
index 13875ee7b050..990d24696fd3 100755
--- a/tools/testing/selftests/rcutorture/bin/torture.sh
+++ b/tools/testing/selftests/rcutorture/bin/torture.sh
@@ -559,7 +559,7 @@ do_kcsan="$do_kcsan_save"
 if test "$do_kvfree" = "yes"
 then
 	torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
-	torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
+	torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
 fi
 
 if test "$do_clocksourcewd" = "yes"
-- 
2.39.2



* [PATCH 48/48] rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test
  2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
                   ` (46 preceding siblings ...)
  2024-05-07  9:35 ` [PATCH 47/48] torture: Scale --do-kvfree test time Uladzislau Rezki (Sony)
@ 2024-05-07  9:35 ` Uladzislau Rezki (Sony)
  47 siblings, 0 replies; 70+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-05-07  9:35 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Uladzislau Rezki, Oleksiy Avramchenko, Frederic Weisbecker,
	Zqiang

From: Zqiang <qiang.zhang1211@gmail.com>

rcu_gp_slow_register/unregister() are only useful in tests where
torture_type=rcu, so this commit adds ->gp_slow_register() and
->gp_slow_unregister() function pointers to the rcu_torture_ops
structure, and slows grace periods only when these function
pointers exist.

Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 kernel/rcu/rcutorture.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 8654e99bd4a3..807fbf6123a7 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -382,6 +382,8 @@ struct rcu_torture_ops {
 	bool (*check_boost_failed)(unsigned long gp_state, int *cpup);
 	int (*stall_dur)(void);
 	void (*get_gp_data)(int *flags, unsigned long *gp_seq);
+	void (*gp_slow_register)(atomic_t *rgssp);
+	void (*gp_slow_unregister)(atomic_t *rgssp);
 	long cbflood_max;
 	int irq_capable;
 	int can_boost;
@@ -570,6 +572,8 @@ static struct rcu_torture_ops rcu_ops = {
 	.check_boost_failed	= rcu_check_boost_fail,
 	.stall_dur		= rcu_jiffies_till_stall_check,
 	.get_gp_data		= rcutorture_get_gp_data,
+	.gp_slow_register	= rcu_gp_slow_register,
+	.gp_slow_unregister	= rcu_gp_slow_unregister,
 	.irq_capable		= 1,
 	.can_boost		= IS_ENABLED(CONFIG_RCU_BOOST),
 	.extendables		= RCUTORTURE_MAX_EXTEND,
@@ -3343,12 +3347,12 @@ rcu_torture_cleanup(void)
 			pr_info("%s: Invoking %pS().\n", __func__, cur_ops->cb_barrier);
 			cur_ops->cb_barrier();
 		}
-		rcu_gp_slow_unregister(NULL);
+		if (cur_ops->gp_slow_unregister)
+			cur_ops->gp_slow_unregister(NULL);
 		return;
 	}
 	if (!cur_ops) {
 		torture_cleanup_end();
-		rcu_gp_slow_unregister(NULL);
 		return;
 	}
 
@@ -3447,7 +3451,8 @@ rcu_torture_cleanup(void)
 	else
 		rcu_torture_print_module_parms(cur_ops, "End of test: SUCCESS");
 	torture_cleanup_end();
-	rcu_gp_slow_unregister(&rcu_fwd_cb_nodelay);
+	if (cur_ops->gp_slow_unregister)
+		cur_ops->gp_slow_unregister(NULL);
 }
 
 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
@@ -3929,7 +3934,8 @@ rcu_torture_init(void)
 	if (object_debug)
 		rcu_test_debug_objects();
 	torture_init_end();
-	rcu_gp_slow_register(&rcu_fwd_cb_nodelay);
+	if (cur_ops->gp_slow_register && !WARN_ON_ONCE(!cur_ops->gp_slow_unregister))
+		cur_ops->gp_slow_register(&rcu_fwd_cb_nodelay);
 	return 0;
 
 unwind:
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-07  9:35 ` [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field Uladzislau Rezki (Sony)
@ 2024-05-07 14:54   ` Oleg Nesterov
  2024-05-07 17:44     ` Paul E. McKenney
  2024-05-12 11:19   ` [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless Oleg Nesterov
  1 sibling, 1 reply; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-07 14:54 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: Paul E . McKenney, RCU, Neeraj upadhyay, Boqun Feng,
	Hillf Danton, Joel Fernandes, LKML, Oleksiy Avramchenko,
	Frederic Weisbecker, Peter Zijlstra

Hello,

I feel I don't really like this patch but I am travelling without my working
laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
to this when I get back on May 10.

Oleg.

On 05/07, Uladzislau Rezki (Sony) wrote:
>
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> The rcu_sync structure's ->gp_count field is updated under the protection
> of ->rss_lock, but read locklessly, and KCSAN noted the data race.
> This commit therefore uses WRITE_ONCE() to do this update to clearly
> document its racy nature.
>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  kernel/rcu/sync.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
> index 86df878a2fee..6c2bd9001adc 100644
> --- a/kernel/rcu/sync.c
> +++ b/kernel/rcu/sync.c
> @@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
>  		 * we are called at early boot time but this shouldn't happen.
>  		 */
>  	}
> -	rsp->gp_count++;
> +	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
>  	spin_unlock_irq(&rsp->rss_lock);
>
>  	if (gp_state == GP_IDLE) {
> @@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
>   */
>  void rcu_sync_exit(struct rcu_sync *rsp)
>  {
> +	int gpc;
> +
>  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
>  	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
>
>  	spin_lock_irq(&rsp->rss_lock);
> -	if (!--rsp->gp_count) {
> +	gpc = rsp->gp_count - 1;
> +	WRITE_ONCE(rsp->gp_count, gpc);
> +	if (!gpc) {
>  		if (rsp->gp_state == GP_PASSED) {
>  			WRITE_ONCE(rsp->gp_state, GP_EXIT);
>  			rcu_sync_call(rsp);
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-07 14:54   ` Oleg Nesterov
@ 2024-05-07 17:44     ` Paul E. McKenney
  2024-05-09 15:13       ` Oleg Nesterov
  2024-05-10 13:18       ` Oleg Nesterov
  0 siblings, 2 replies; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-07 17:44 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Tue, May 07, 2024 at 10:54:41AM -0400, Oleg Nesterov wrote:
> Hello,
> 
> I feel I don't really like this patch but I am travelling without my working
> laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
> to this when I get back on May 10.

By the stricter data-race rules used in RCU code [1], this is a bug that
needs to be fixed.  This code is updating ->gp_count, which is read
locklessly, which in turn results in a data race.  The fix is to mark
the updates (as below) with WRITE_ONCE().
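
In other words, the pattern at issue is roughly the following (a simplified
sketch, not the exact sync.c code):

	spin_lock_irq(&rsp->rss_lock);
	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1); /* marked store, lock held */
	spin_unlock_irq(&rsp->rss_lock);

	/* ... meanwhile, with no lock held: */
	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);  /* marked lockless load */

The READ_ONCE() can run concurrently with the update, so the store must
also be marked for the pair to count as race-free.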

Or is there something in one or the other of these updates to ->gp_count
that excludes lockless readers?  (I am not seeing it, but you know this
code way better than I do!)

							Thanx, Paul

[1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing

> Oleg.
> 
> On 05/07, Uladzislau Rezki (Sony) wrote:
> >
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > The rcu_sync structure's ->gp_count field is updated under the protection
> > of ->rss_lock, but read locklessly, and KCSAN noted the data race.
> > This commit therefore uses WRITE_ONCE() to do this update to clearly
> > document its racy nature.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Oleg Nesterov <oleg@redhat.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> >  kernel/rcu/sync.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
> > index 86df878a2fee..6c2bd9001adc 100644
> > --- a/kernel/rcu/sync.c
> > +++ b/kernel/rcu/sync.c
> > @@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> >  		 * we are called at early boot time but this shouldn't happen.
> >  		 */
> >  	}
> > -	rsp->gp_count++;
> > +	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
> >  	spin_unlock_irq(&rsp->rss_lock);
> >
> >  	if (gp_state == GP_IDLE) {
> > @@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> >   */
> >  void rcu_sync_exit(struct rcu_sync *rsp)
> >  {
> > +	int gpc;
> > +
> >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
> >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
> >
> >  	spin_lock_irq(&rsp->rss_lock);
> > -	if (!--rsp->gp_count) {
> > +	gpc = rsp->gp_count - 1;
> > +	WRITE_ONCE(rsp->gp_count, gpc);
> > +	if (!gpc) {
> >  		if (rsp->gp_state == GP_PASSED) {
> >  			WRITE_ONCE(rsp->gp_state, GP_EXIT);
> >  			rcu_sync_call(rsp);
> > --
> > 2.39.2
> >
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-07 17:44     ` Paul E. McKenney
@ 2024-05-09 15:13       ` Oleg Nesterov
  2024-05-10  3:59         ` Paul E. McKenney
  2024-05-10 13:18       ` Oleg Nesterov
  1 sibling, 1 reply; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-09 15:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/07, Paul E. McKenney wrote:
>
> On Tue, May 07, 2024 at 10:54:41AM -0400, Oleg Nesterov wrote:
> > Hello,
> >
> > I feel I don't really like this patch but I am travelling without my working
> > laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
> > to this when I get back on May 10.
>
> By the stricter data-race rules used in RCU code [1], this is a bug that
> needs to be fixed.

Now that I can read the code... Sorry, still can't understand.

> which is read locklessly,

Where???

OK, OK, we have

	// rcu_sync_exit()
	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0)

and

	// rcu_sync_dtor()
	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));

other than that ->gp_count is always accessed under ->rss_lock.

And yes, at least the WARN_ON_ONCE() in rcu_sync_exit() can obviously race with
rcu_sync_enter/exit(), which update gp_count. I think this is fine correctness-wise.

But OK, we need to please KCSAN (or is there another problem I missed ???)

We can move these WARN_ON()'s into the ->rss_lock protected section.

Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
that READ_ONCE() should be enough...

Or we can simply kill these WARN_ON_ONCE()'s.
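
The first option would look roughly like this (just a sketch, though it is
also what the patch later in this thread ends up doing):

	spin_lock_irq(&rsp->rss_lock);
	WARN_ON_ONCE(rsp->gp_count == 0);	/* plain access, ->rss_lock held */
	if (!--rsp->gp_count) {
		/* ... */
	}
	spin_unlock_irq(&rsp->rss_lock);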

I don't understand why should we add more WRITE_ONCE()'s into the critical
section protected by ->rss_lock.

Help! ;)

Oleg.


> which in turn results in a data race.  The fix is to mark
> the updates (as below) with WRITE_ONCE().
>
> Or is there something in one or the other of these updates to ->gp_count
> that excludes lockless readers?  (I am not seeing it, but you know this
> code way better than I do!)
>
> 							Thanx, Paul
>
> [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
>
> > Oleg.
> >
> > On 05/07, Uladzislau Rezki (Sony) wrote:
> > >
> > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > >
> > > The rcu_sync structure's ->gp_count field is updated under the protection
> > > of ->rss_lock, but read locklessly, and KCSAN noted the data race.
> > > This commit therefore uses WRITE_ONCE() to do this update to clearly
> > > document its racy nature.
> > >
> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > Cc: Oleg Nesterov <oleg@redhat.com>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > ---
> > >  kernel/rcu/sync.c | 8 ++++++--
> > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
> > > index 86df878a2fee..6c2bd9001adc 100644
> > > --- a/kernel/rcu/sync.c
> > > +++ b/kernel/rcu/sync.c
> > > @@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> > >  		 * we are called at early boot time but this shouldn't happen.
> > >  		 */
> > >  	}
> > > -	rsp->gp_count++;
> > > +	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
> > >  	spin_unlock_irq(&rsp->rss_lock);
> > >
> > >  	if (gp_state == GP_IDLE) {
> > > @@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> > >   */
> > >  void rcu_sync_exit(struct rcu_sync *rsp)
> > >  {
> > > +	int gpc;
> > > +
> > >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
> > >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
> > >
> > >  	spin_lock_irq(&rsp->rss_lock);
> > > -	if (!--rsp->gp_count) {
> > > +	gpc = rsp->gp_count - 1;
> > > +	WRITE_ONCE(rsp->gp_count, gpc);
> > > +	if (!gpc) {
> > >  		if (rsp->gp_state == GP_PASSED) {
> > >  			WRITE_ONCE(rsp->gp_state, GP_EXIT);
> > >  			rcu_sync_call(rsp);
> > > --
> > > 2.39.2
> > >
> >
>


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-09 15:13       ` Oleg Nesterov
@ 2024-05-10  3:59         ` Paul E. McKenney
  2024-05-10 11:31           ` Oleg Nesterov
  0 siblings, 1 reply; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-10  3:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Thu, May 09, 2024 at 05:13:12PM +0200, Oleg Nesterov wrote:
> On 05/07, Paul E. McKenney wrote:
> >
> > On Tue, May 07, 2024 at 10:54:41AM -0400, Oleg Nesterov wrote:
> > > Hello,
> > >
> > > I feel I don't really like this patch but I am travelling without my working
> > > laptop, can't read the source code ;) Quite possibly I am wrong, I'll return
> > > to this when I get back on May 10.
> >
> > By the stricter data-race rules used in RCU code [1], this is a bug that
> > needs to be fixed.
> 
> Now that I can read the code... Sorry, still can't understand.
> 
> > which is read locklessly,
> 
> Where???
> 
> OK, OK, we have
> 
> 	// rcu_sync_exit()
> 	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0)
> 
> and
> 
> 	// rcu_sync_dtor()
> 	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
> 
> other than that ->gp_count is always accessed under ->rss_lock.
> 
> And yes, at least the WARN_ON_ONCE() in rcu_sync_exit() can obviously race with
> rcu_sync_enter/exit(), which update gp_count. I think this is fine correctness-wise.
> 
> But OK, we need to please KCSAN (or is there another problem I missed ???)
> 
> We can move these WARN_ON()'s into the ->rss_lock protected section.
> 
> Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
> that READ_ONCE() should be enough...
> 
> Or we can simply kill these WARN_ON_ONCE()'s.

Or we could move those WARN_ON_ONCE() under the lock.  If lock contention
were an issue, we could condition them with something like
IS_ENABLED(CONFIG_PROVE_RCU).  Then all accesses to those variables would
always be protected by the lock, and the WRITE_ONCE() and READ_ONCE()
calls could be dropped.  (Or am I missing another lockless access?)

Which would have the further advantage that if anyone accessed these
without holding the lock, KCSAN would complain.

> I don't understand why should we add more WRITE_ONCE()'s into the critical
> section protected by ->rss_lock.

There are indeed several ways to fix this.  Which would you prefer?

> Help! ;)

;-) ;-) ;-)

							Thanx, Paul

> Oleg.
> 
> 
> > which in turn results in a data race.  The fix is to mark
> > the updates (as below) with WRITE_ONCE().
> >
> > Or is there something in one or the other of these updates to ->gp_count
> > that excludes lockless readers?  (I am not seeing it, but you know this
> > code way better than I do!)
> >
> > 							Thanx, Paul
> >
> > [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
> >
> > > Oleg.
> > >
> > > On 05/07, Uladzislau Rezki (Sony) wrote:
> > > >
> > > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > > >
> > > > The rcu_sync structure's ->gp_count field is updated under the protection
> > > > of ->rss_lock, but read locklessly, and KCSAN noted the data race.
> > > > This commit therefore uses WRITE_ONCE() to do this update to clearly
> > > > document its racy nature.
> > > >
> > > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > > Cc: Oleg Nesterov <oleg@redhat.com>
> > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > > ---
> > > >  kernel/rcu/sync.c | 8 ++++++--
> > > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
> > > > index 86df878a2fee..6c2bd9001adc 100644
> > > > --- a/kernel/rcu/sync.c
> > > > +++ b/kernel/rcu/sync.c
> > > > @@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> > > >  		 * we are called at early boot time but this shouldn't happen.
> > > >  		 */
> > > >  	}
> > > > -	rsp->gp_count++;
> > > > +	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
> > > >  	spin_unlock_irq(&rsp->rss_lock);
> > > >
> > > >  	if (gp_state == GP_IDLE) {
> > > > @@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> > > >   */
> > > >  void rcu_sync_exit(struct rcu_sync *rsp)
> > > >  {
> > > > +	int gpc;
> > > > +
> > > >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
> > > >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
> > > >
> > > >  	spin_lock_irq(&rsp->rss_lock);
> > > > -	if (!--rsp->gp_count) {
> > > > +	gpc = rsp->gp_count - 1;
> > > > +	WRITE_ONCE(rsp->gp_count, gpc);
> > > > +	if (!gpc) {
> > > >  		if (rsp->gp_state == GP_PASSED) {
> > > >  			WRITE_ONCE(rsp->gp_state, GP_EXIT);
> > > >  			rcu_sync_call(rsp);
> > > > --
> > > > 2.39.2
> > > >
> > >
> >
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10  3:59         ` Paul E. McKenney
@ 2024-05-10 11:31           ` Oleg Nesterov
  2024-05-10 14:00             ` Alan Huang
  2024-05-10 14:04             ` Paul E. McKenney
  0 siblings, 2 replies; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-10 11:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/09, Paul E. McKenney wrote:
>
> On Thu, May 09, 2024 at 05:13:12PM +0200, Oleg Nesterov wrote:
> >
> > We can move these WARN_ON()'s into the ->rss_lock protected section.
> >
> > Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
> > that READ_ONCE() should be enough...
> >
> > Or we can simply kill these WARN_ON_ONCE()'s.
>
> Or we could move those WARN_ON_ONCE() under the lock.

Sure, see above.

But could you help me to understand this magic? I naively thought
that READ_ONCE() is always "safe"...

So, unless I am totally confused it turns out that, say,

	CPU 0			CPU 1
	-----			-----

	spin_lock(LOCK);
	++X;			READ_ONCE(X); // data race
	spin_unlock(LOCK);

is data-racy according to KCSAN, while

	CPU 0			CPU 1
	-----			-----

	spin_lock(LOCK);
	WRITE_ONCE(X, X+1);	READ_ONCE(X); // no data race
	spin_unlock(LOCK);

is not.

Why is that?

Trying to read Documentation/dev-tools/kcsan.rst... it says

	KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,

	...

	if all accesses to a variable that is accessed concurrently are properly
	marked, KCSAN will never trigger a watchpoint

but how can KCSAN detect that all accesses to X are properly marked? I see nothing
KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().

And what does the "all accesses" above actually mean? The 2nd version does

	WRITE_ONCE(X, X+1);

but "X + 1" is the plain/unmarked access?

Thanks,

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-07 17:44     ` Paul E. McKenney
  2024-05-09 15:13       ` Oleg Nesterov
@ 2024-05-10 13:18       ` Oleg Nesterov
  2024-05-10 13:50         ` Oleg Nesterov
  2024-05-10 13:58         ` Paul E. McKenney
  1 sibling, 2 replies; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-10 13:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/07, Paul E. McKenney wrote:
>
> By the stricter data-race rules used in RCU code [1],
...
> [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing

I am getting more and more confused...

Does this mean that KCSAN/etc treats the files in kernel/rcu/
differently than the "Rest of Kernel"? Or what?

And how is it enforced?

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 13:18       ` Oleg Nesterov
@ 2024-05-10 13:50         ` Oleg Nesterov
  2024-05-10 14:11           ` Paul E. McKenney
  2024-05-10 13:58         ` Paul E. McKenney
  1 sibling, 1 reply; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-10 13:50 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/10, Oleg Nesterov wrote:
>
> On 05/07, Paul E. McKenney wrote:
> >
> > By the stricter data-race rules used in RCU code [1],
> ...
> > [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
>
> I am getting more and more confused...
>
> Does this mean that KCSAN/etc treats the files in kernel/rcu/
> differently than the "Rest of Kernel"? Or what?
>
> And how is it enforced?

I can only find the strnstr(buf, "rcu") checks in skip_report(),
but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 13:18       ` Oleg Nesterov
  2024-05-10 13:50         ` Oleg Nesterov
@ 2024-05-10 13:58         ` Paul E. McKenney
  1 sibling, 0 replies; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-10 13:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Fri, May 10, 2024 at 03:18:50PM +0200, Oleg Nesterov wrote:
> On 05/07, Paul E. McKenney wrote:
> >
> > By the stricter data-race rules used in RCU code [1],
> ...
> > [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
> 
> I am getting more and more confused...
> 
> Does this mean that KCSAN/etc treats the files in kernel/rcu/
> differently than the "Rest of Kernel"? Or what?

Yes.

> And how is it enforced?

By me running rcutorture with KCSAN using the Kconfig options listed in
that document.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 11:31           ` Oleg Nesterov
@ 2024-05-10 14:00             ` Alan Huang
  2024-05-10 14:04             ` Paul E. McKenney
  1 sibling, 0 replies; 70+ messages in thread
From: Alan Huang @ 2024-05-10 14:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Paul E. McKenney, Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

> On May 10, 2024, at 19:31, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 05/09, Paul E. McKenney wrote:
>> 
>> On Thu, May 09, 2024 at 05:13:12PM +0200, Oleg Nesterov wrote:
>>> 
>>> We can move these WARN_ON()'s into the ->rss_lock protected section.
>>> 
>>> Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
>>> that READ_ONCE() should be enough...
>>> 
>>> Or we can simply kill these WARN_ON_ONCE()'s.
>> 
>> Or we could move those WARN_ON_ONCE() under the lock.
> 
> Sure, see above.
> 
> But could you help me to understand this magic? I naively thought
> that READ_ONCE() is always "safe"...
> 
> So, unless I am totally confused it turns out that, say,
> 
> 	CPU 0			CPU 1
> 	-----			-----
> 
> 	spin_lock(LOCK);
> 	++X;			READ_ONCE(X); // data race
> 	spin_unlock(LOCK);
> 
> is data-racy according to KCSAN, while
> 
> 	CPU 0			CPU 1
> 	-----			-----
> 
> 	spin_lock(LOCK);
> 	WRITE_ONCE(X, X+1);	READ_ONCE(X); // no data race
> 	spin_unlock(LOCK);
> 
> is not.
> 
> Why is that?
> 
> Trying to read Documentation/dev-tools/kcsan.rst... it says
> 
> 	KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
> 
> 	...
> 
> 	if all accesses to a variable that is accessed concurrently are properly
> 	marked, KCSAN will never trigger a watchpoint
> 
> but how can KCSAN detect that all accesses to X are properly marked? I see nothing
> KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().
> 
> And what does the "all accesses" above actually mean? The 2nd version does
> 
> 	WRITE_ONCE(X, X+1);
> 
> but "X + 1" is the plain/unmarked access?

X + 1 and READ_ONCE(X) are two reads, and two concurrent reads never
constitute a data race.

> 
> Thanks,
> 
> Oleg.
> 
> 


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 11:31           ` Oleg Nesterov
  2024-05-10 14:00             ` Alan Huang
@ 2024-05-10 14:04             ` Paul E. McKenney
  2024-05-12 10:53               ` Oleg Nesterov
  1 sibling, 1 reply; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-10 14:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Fri, May 10, 2024 at 01:31:49PM +0200, Oleg Nesterov wrote:
> On 05/09, Paul E. McKenney wrote:
> >
> > On Thu, May 09, 2024 at 05:13:12PM +0200, Oleg Nesterov wrote:
> > >
> > > We can move these WARN_ON()'s into the ->rss_lock protected section.
> > >
> > > Or perhaps we can use data_race(rsp->gp_count) ? To be honest I thought
> > > that READ_ONCE() should be enough...
> > >
> > > Or we can simply kill these WARN_ON_ONCE()'s.
> >
> > Or we could move those WARN_ON_ONCE() under the lock.
> 
> Sure, see above.
> 
> But could you help me to understand this magic? I naively thought
> that READ_ONCE() is always "safe"...
> 
> So, unless I am totally confused it turns out that, say,
> 
> 	CPU 0			CPU 1
> 	-----			-----
> 
> 	spin_lock(LOCK);
> 	++X;			READ_ONCE(X); // data race
> 	spin_unlock(LOCK);
> 
> is data-racy according to KCSAN, while
> 
> 	CPU 0			CPU 1
> 	-----			-----
> 
> 	spin_lock(LOCK);
> 	WRITE_ONCE(X, X+1);	READ_ONCE(X); // no data race
> 	spin_unlock(LOCK);
> 
> is not.

Agreed, in RCU code.

> Why is that?

Because I run KCSAN on RCU using Kconfig options that cause KCSAN
to be more strict.

> Trying to read Documentation/dev-tools/kcsan.rst... it says
> 
> 	KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
> 
> 	...
> 
> 	if all accesses to a variable that is accessed concurrently are properly
> 	marked, KCSAN will never trigger a watchpoint
> 
> but how can KCSAN detect that all accesses to X are properly marked? I see nothing
> KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().

The trick is that KCSAN sees the volatile cast that both READ_ONCE()
and WRITE_ONCE() use.
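
At their core, both macros are just volatile accesses (a simplified sketch;
the real definitions add further type-checking layers):

	#define READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))
	#define WRITE_ONCE(x, val) \
		do { *(volatile typeof(x) *)&(x) = (val); } while (0)

The compiler routes volatile accesses to KCSAN's __tsan_volatile_read*()
and __tsan_volatile_write*() hooks, which treat them as marked atomic
operations.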

> And what does the "all accesses" above actually mean? The 2nd version does
> 
> 	WRITE_ONCE(X, X+1);
> 
> but "X + 1" is the plain/unmarked access?

That would be the correct usage in RCU code if there were lockless
accesses to X, which would use READ_ONCE(), but a lock were held across
that WRITE_ONCE() such that there would be no concurrent updates of X.
In that case, the "X+1" cannot be involved in a data race, so KCSAN
won't complain.

But if all accesses to X were protected by an exclusive lock, then there
would be no data races involving X, and thus no marking of any accesses
to X.  Which would allow KCSAN to detect buggy lockless accesses to X.
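
A sketch of that end state: all accesses stay plain, and the exclusion can
even be checked explicitly (the assertion is optional extra checking, not
something this patch adds):

	spin_lock_irq(&rsp->rss_lock);
	ASSERT_EXCLUSIVE_WRITER(rsp->gp_count); /* KCSAN reports racing writers */
	rsp->gp_count++;	/* plain access; a lockless reader is now a bug */
	spin_unlock_irq(&rsp->rss_lock);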

							Thanx, Paul

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 13:50         ` Oleg Nesterov
@ 2024-05-10 14:11           ` Paul E. McKenney
  2024-05-12 10:41             ` Oleg Nesterov
  2024-05-13 14:13             ` Marco Elver
  0 siblings, 2 replies; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-10 14:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Fri, May 10, 2024 at 03:50:57PM +0200, Oleg Nesterov wrote:
> On 05/10, Oleg Nesterov wrote:
> >
> > On 05/07, Paul E. McKenney wrote:
> > >
> > > By the stricter data-race rules used in RCU code [1],
> > ...
> > > [1] https://docs.google.com/document/d/1FwZaXSg3A55ivVoWffA9iMuhJ3_Gmj_E494dLYjjyLQ/edit?usp=sharing
> >
> > I am getting more and more confused...
> >
> > Does this mean that KCSAN/etc treats the files in kernel/rcu/
> > differently than the "Rest of Kernel"? Or what?
> >
> > And how is it enforced?
> 
> I can only find the strnstr(buf, "rcu") checks in skip_report(),
> but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...

Huh, new one on me!  When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
prevent skip_report() from even being invoked.

Which suggests that in the rest of the kernel, including "rcu_"
in your function name gets you stricter KCSAN checking.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 14:11           ` Paul E. McKenney
@ 2024-05-12 10:41             ` Oleg Nesterov
  2024-05-13 14:13             ` Marco Elver
  1 sibling, 0 replies; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-12 10:41 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

Sorry for another delay...

On 05/10, Paul E. McKenney wrote:
>
> On Fri, May 10, 2024 at 03:50:57PM +0200, Oleg Nesterov wrote:
> >
> > I can only find the strnstr(buf, "rcu") checks in skip_report(),
> > but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...
>
> Huh, new one on me!  When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
> which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
> prevent skip_report() from even being invoked.
>
> Which suggests that in the rest of the kernel, including "rcu_"
> in your function name gets you stricter KCSAN checking.  ;-)

Yes.

And that is why I was very confused. I misinterpreted the "stricter
data-race rules used in RCU code" as implying that there must be more
"rcu-only" hacks in the kernel/kcsan/ code, which I couldn't find ;)

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 14:04             ` Paul E. McKenney
@ 2024-05-12 10:53               ` Oleg Nesterov
  2024-05-12 14:57                 ` Paul E. McKenney
  0 siblings, 1 reply; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-12 10:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/10, Paul E. McKenney wrote:
>
> On Fri, May 10, 2024 at 01:31:49PM +0200, Oleg Nesterov wrote:
>
> > Why is that?
>
> Because I run KCSAN on RCU using Kconfig options that cause KCSAN
> to be more strict.

Yes, I see now.

> > but how can KCSAN detect that all accesses to X are properly marked? I see nothing
> > KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().
>
> The trick is that KCSAN sees the volatile cast that both READ_ONCE()
> and WRITE_ONCE() use.

Hmm. grep-grep-grep... I seem to understand, DEFINE_TSAN_VOLATILE_READ_WRITE.
So __tsan_volatile_readX() will use KCSAN_ACCESS_ATOMIC.

Thanks!

> > And what does the "all accesses" above actually mean? The 2nd version does
> >
> > 	WRITE_ONCE(X, X+1);
> >
> > but "X + 1" is the plain/unmarked access?
>
> ...
>
> In that case, the "X+1" cannot be involved in a data race, so KCSAN
> won't complain.

Yes, yes, I understand now.

Paul, thanks for your explanations! And sorry for wasting your time by
provoking an unnecessarily long discussion.

I am going to send the trivial patch which moves these WARN_ON()'s under
spin_lock(), this looks more clean to me. But I won't argue if you prefer
your original patch.

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless
  2024-05-07  9:35 ` [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field Uladzislau Rezki (Sony)
  2024-05-07 14:54   ` Oleg Nesterov
@ 2024-05-12 11:19   ` Oleg Nesterov
  2024-05-12 15:10     ` Paul E. McKenney
  1 sibling, 1 reply; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-12 11:19 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony), Paul E . McKenney
  Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

rcu_sync->gp_count is updated under the protection of ->rss_lock but read
locklessly by the WARN_ON() checks, and KCSAN noted the data race.

Move these WARN_ON_ONCE()'s under the lock and remove the no longer needed
READ_ONCE().

Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/rcu/sync.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 86df878a2fee..b50fde082198 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -152,9 +152,9 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
 
 	spin_lock_irq(&rsp->rss_lock);
+	WARN_ON_ONCE(rsp->gp_count == 0);
 	if (!--rsp->gp_count) {
 		if (rsp->gp_state == GP_PASSED) {
 			WRITE_ONCE(rsp->gp_state, GP_EXIT);
@@ -174,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp)
 {
 	int gp_state;
 
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
 
 	spin_lock_irq(&rsp->rss_lock);
+	WARN_ON_ONCE(rsp->gp_count != 0);
 	if (rsp->gp_state == GP_REPLAY)
 		WRITE_ONCE(rsp->gp_state, GP_EXIT);
 	gp_state = rsp->gp_state;
-- 
2.25.1.362.g51ebf55



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-12 10:53               ` Oleg Nesterov
@ 2024-05-12 14:57                 ` Paul E. McKenney
  0 siblings, 0 replies; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-12 14:57 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Sun, May 12, 2024 at 12:53:06PM +0200, Oleg Nesterov wrote:
> On 05/10, Paul E. McKenney wrote:
> >
> > On Fri, May 10, 2024 at 01:31:49PM +0200, Oleg Nesterov wrote:
> >
> > > Why is that?
> >
> > Because I run KCSAN on RCU using Kconfig options that cause KCSAN
> > to be more strict.
> 
> Yes, I see now.
> 
> > > but how can KCSAN detect that all accesses to X are properly marked? I see nothing
> > > KCSAN-related in the definition of WRITE_ONCE() or READ_ONCE().
> >
> > The trick is that KCSAN sees the volatile cast that both READ_ONCE()
> > and WRITE_ONCE() use.
> 
> Hmm. grep-grep-grep... I seem to understand, DEFINE_TSAN_VOLATILE_READ_WRITE.
> So __tsan_volatile_readX() will use KCSAN_ACCESS_ATOMIC.
> 
> Thanks!

You got it!!!

> > > And what does the "all accesses" above actually mean? The 2nd version does
> > >
> > > 	WRITE_ONCE(X, X+1);
> > >
> > > but "X + 1" is the plain/unmarked access?
> >
> > ...
> >
> > In that case, the "X+1" cannot be involved in a data race, so KCSAN
> > won't complain.
> 
> Yes, yes, I understand now.
> 
> Paul, thanks for your explanations! And sorry for wasting your time by
> provoking an unnecessarily long discussion.

Not a problem and absolutely no need to apologize!  Of course, please do
pass this information on to anyone else needing it.

> I am going to send the trivial patch which moves these WARN_ON()'s under
> spin_lock(), this looks more clean to me. But I won't argue if you prefer
> your original patch.

Actually, I like your patch quite a bit better than I do my original.
In fact, I feel quite foolish that I did not think of this to begin with.
With your way, we have strict locking for that field and can therefore
just use plain C-language accesses for all accesses to it.  KCSAN will
then warn us of any buggy lockless access to that field, even if that
buggy access uses READ_ONCE().  Much much better your way!!!

So thank you for persisting on this!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless
  2024-05-12 11:19   ` [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless Oleg Nesterov
@ 2024-05-12 15:10     ` Paul E. McKenney
  2024-05-12 16:55       ` Oleg Nesterov
  0 siblings, 1 reply; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-12 15:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Sun, May 12, 2024 at 01:19:48PM +0200, Oleg Nesterov wrote:
> rcu_sync->gp_count is updated under the protection of ->rss_lock but read
> locklessly by the WARN_ON() checks, and KCSAN noted the data race.
> 
> Move these WARN_ON_ONCE()'s under the lock and remove the no longer needed
> READ_ONCE().
> 
> Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Very good, thank you!

Due to inattention on my part, the patches were sent late, so the patch
you are (rightly) complaining about is on its way in.  So what I did was
to port your patch on top of that one as shown below.  Left to myself,
I would be thinking in terms of the v6.11 merge window.  Please let me
know if this is more urgent than that.

And as always, please let me know if I messed anything up in the port.

							Thanx, Paul

------------------------------------------------------------------------

commit 8d75fb302aaa97693c2294ded48a472e4956d615
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Sun May 12 08:02:07 2024 -0700

    rcu: Eliminate lockless accesses to rcu_sync->gp_count
    
    The rcu_sync structure's ->gp_count field is always accessed under the
    protection of that same structure's ->rss_lock field, with the exception
    of a pair of WARN_ON_ONCE() calls just prior to acquiring that lock in
    functions rcu_sync_exit() and rcu_sync_dtor().  These lockless accesses
    are unnecessary and impair KCSAN's ability to catch bugs that might be
    inserted via other lockless accesses.
    
    This commit therefore moves those WARN_ON_ONCE() calls under the lock.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 6c2bd9001adcd..05bfe69fdb0bb 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -151,15 +151,11 @@ void rcu_sync_enter(struct rcu_sync *rsp)
  */
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
-	int gpc;
-
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
 
 	spin_lock_irq(&rsp->rss_lock);
-	gpc = rsp->gp_count - 1;
-	WRITE_ONCE(rsp->gp_count, gpc);
-	if (!gpc) {
+	WARN_ON_ONCE(rsp->gp_count == 0);
+	if (!--rsp->gp_count) {
 		if (rsp->gp_state == GP_PASSED) {
 			WRITE_ONCE(rsp->gp_state, GP_EXIT);
 			rcu_sync_call(rsp);
@@ -178,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp)
 {
 	int gp_state;
 
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
 
 	spin_lock_irq(&rsp->rss_lock);
+	WARN_ON_ONCE(rsp->gp_count);
 	if (rsp->gp_state == GP_REPLAY)
 		WRITE_ONCE(rsp->gp_state, GP_EXIT);
 	gp_state = rsp->gp_state;

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless
  2024-05-12 15:10     ` Paul E. McKenney
@ 2024-05-12 16:55       ` Oleg Nesterov
  2024-05-12 17:11         ` Paul E. McKenney
  0 siblings, 1 reply; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-12 16:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/12, Paul E. McKenney wrote:
>
> --- a/kernel/rcu/sync.c
> +++ b/kernel/rcu/sync.c
> @@ -151,15 +151,11 @@ void rcu_sync_enter(struct rcu_sync *rsp)
>   */
>  void rcu_sync_exit(struct rcu_sync *rsp)
>  {
> -	int gpc;
> -
>  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
> -	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
>
>  	spin_lock_irq(&rsp->rss_lock);
> -	gpc = rsp->gp_count - 1;
> -	WRITE_ONCE(rsp->gp_count, gpc);
> -	if (!gpc) {
> +	WARN_ON_ONCE(rsp->gp_count == 0);
> +	if (!--rsp->gp_count) {
>  		if (rsp->gp_state == GP_PASSED) {
>  			WRITE_ONCE(rsp->gp_state, GP_EXIT);
>  			rcu_sync_call(rsp);
> @@ -178,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp)
>  {
>  	int gp_state;
>
> -	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
>  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
>
>  	spin_lock_irq(&rsp->rss_lock);
> +	WARN_ON_ONCE(rsp->gp_count);
>  	if (rsp->gp_state == GP_REPLAY)
>  		WRITE_ONCE(rsp->gp_state, GP_EXIT);
>  	gp_state = rsp->gp_state;

Thanks Paul!

But then I think this change can also revert this chunk from the previous
patch:

	@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
			 * we are called at early boot time but this shouldn't happen.
			 */
		}
	-	rsp->gp_count++;
	+	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
		spin_unlock_irq(&rsp->rss_lock);
	 
		if (gp_state == GP_IDLE) {


Thanks again,

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless
  2024-05-12 16:55       ` Oleg Nesterov
@ 2024-05-12 17:11         ` Paul E. McKenney
  2024-05-12 18:06           ` Oleg Nesterov
  0 siblings, 1 reply; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-12 17:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Sun, May 12, 2024 at 06:55:29PM +0200, Oleg Nesterov wrote:
> On 05/12, Paul E. McKenney wrote:
> >
> > --- a/kernel/rcu/sync.c
> > +++ b/kernel/rcu/sync.c
> > @@ -151,15 +151,11 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> >   */
> >  void rcu_sync_exit(struct rcu_sync *rsp)
> >  {
> > -	int gpc;
> > -
> >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
> > -	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
> >
> >  	spin_lock_irq(&rsp->rss_lock);
> > -	gpc = rsp->gp_count - 1;
> > -	WRITE_ONCE(rsp->gp_count, gpc);
> > -	if (!gpc) {
> > +	WARN_ON_ONCE(rsp->gp_count == 0);
> > +	if (!--rsp->gp_count) {
> >  		if (rsp->gp_state == GP_PASSED) {
> >  			WRITE_ONCE(rsp->gp_state, GP_EXIT);
> >  			rcu_sync_call(rsp);
> > @@ -178,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp)
> >  {
> >  	int gp_state;
> >
> > -	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
> >  	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
> >
> >  	spin_lock_irq(&rsp->rss_lock);
> > +	WARN_ON_ONCE(rsp->gp_count);
> >  	if (rsp->gp_state == GP_REPLAY)
> >  		WRITE_ONCE(rsp->gp_state, GP_EXIT);
> >  	gp_state = rsp->gp_state;
> 
> Thanks Paul!
> 
> But then I think this change can also revert this chunk from the previous
> patch:
> 
> 	@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> 			 * we are called at early boot time but this shouldn't happen.
> 			 */
> 		}
> 	-	rsp->gp_count++;
> 	+	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
> 		spin_unlock_irq(&rsp->rss_lock);
> 	 
> 		if (gp_state == GP_IDLE) {

Good catch, thank you!  How about like this?

							Thanx, Paul

------------------------------------------------------------------------

commit 3e75ce9876396a770a0fcd8eecd83b9f6469f49c
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Sun May 12 08:02:07 2024 -0700

    rcu: Eliminate lockless accesses to rcu_sync->gp_count
    
    The rcu_sync structure's ->gp_count field is always accessed under the
    protection of that same structure's ->rss_lock field, with the exception
    of a pair of WARN_ON_ONCE() calls just prior to acquiring that lock in
    functions rcu_sync_exit() and rcu_sync_dtor().  These lockless accesses
    are unnecessary and impair KCSAN's ability to catch bugs that might be
    inserted via other lockless accesses.
    
    This commit therefore moves those WARN_ON_ONCE() calls under the lock.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 6c2bd9001adcd..da60a9947c005 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 		 * we are called at early boot time but this shouldn't happen.
 		 */
 	}
-	WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
+	rsp->gp_count++;
 	spin_unlock_irq(&rsp->rss_lock);
 
 	if (gp_state == GP_IDLE) {
@@ -151,15 +151,11 @@ void rcu_sync_enter(struct rcu_sync *rsp)
  */
 void rcu_sync_exit(struct rcu_sync *rsp)
 {
-	int gpc;
-
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
 
 	spin_lock_irq(&rsp->rss_lock);
-	gpc = rsp->gp_count - 1;
-	WRITE_ONCE(rsp->gp_count, gpc);
-	if (!gpc) {
+	WARN_ON_ONCE(rsp->gp_count == 0);
+	if (!--rsp->gp_count) {
 		if (rsp->gp_state == GP_PASSED) {
 			WRITE_ONCE(rsp->gp_state, GP_EXIT);
 			rcu_sync_call(rsp);
@@ -178,10 +174,10 @@ void rcu_sync_dtor(struct rcu_sync *rsp)
 {
 	int gp_state;
 
-	WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
 	WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
 
 	spin_lock_irq(&rsp->rss_lock);
+	WARN_ON_ONCE(rsp->gp_count);
 	if (rsp->gp_state == GP_REPLAY)
 		WRITE_ONCE(rsp->gp_state, GP_EXIT);
 	gp_state = rsp->gp_state;

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless
  2024-05-12 17:11         ` Paul E. McKenney
@ 2024-05-12 18:06           ` Oleg Nesterov
  0 siblings, 0 replies; 70+ messages in thread
From: Oleg Nesterov @ 2024-05-12 18:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On 05/12, Paul E. McKenney wrote:
>
> How about like this?

LGTM ;)

Oleg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-10 14:11           ` Paul E. McKenney
  2024-05-12 10:41             ` Oleg Nesterov
@ 2024-05-13 14:13             ` Marco Elver
  2024-05-14 16:54               ` Paul E. McKenney
  1 sibling, 1 reply; 70+ messages in thread
From: Marco Elver @ 2024-05-13 14:13 UTC (permalink / raw)
  To: paulmck
  Cc: Oleg Nesterov, Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Fri, 10 May 2024 at 16:11, Paul E. McKenney <paulmck@kernel.org> wrote:
[...]
> > > Does this mean that KCSAN/etc treats the files in kernel/rcu/
> > > differently than the "Rest of Kernel"? Or what?
> > >
> > > And how is it enforced?
> >
> > I can only find the strnstr(buf, "rcu") checks in skip_report(),
> > but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...
>
> Huh, new one on me!  When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
> which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
> prevent skip_report() from even being invoked.

The strnstr hack goes back to the first version of KCSAN released in
v5.8 [1]. It was added in response to Paul wanting the "stricter"
treatment for RCU even while KCSAN was still in development, and back
then syzbot was the first test system using KCSAN. Shortly after, Paul
added a KCSAN config for rcutorture with a laundry list of options to
make KCSAN "strict" (before we eventually added CONFIG_KCSAN_STRICT
which greatly simplified that).

While the strnstr(.., "rcu") rules are redundant when using the
stricter rules (at least CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n is
set), we're keeping the "rcu" special case around because there are
test robots and fuzzers that use the default KCSAN config (unlike
rcutorture). And because we know that RCU likes the stricter rules,
the "value change only" filter is ignored in all KCSAN configs for
rcu-related functions.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcsan/report.c?id=v5.8
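
Roughly, that special case amounts to the following (a simplified sketch of
the skip_report() logic, not the verbatim code):

	/* Reports without an observed value change are normally skipped
	 * under CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=y -- unless the top
	 * stack frame is in a function whose name contains "rcu". */
	if (IS_ENABLED(CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY) &&
	    value_change == KCSAN_VALUE_CHANGE_MAYBE) {
		char buf[64];
		int len = scnprintf(buf, sizeof(buf), "%ps", (void *)top_frame);

		if (!strnstr(buf, "rcu", len))
			return true;	/* skip this report */
	}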

Back then syzbot occasionally reported RCU data races, but these days
rcutorture probably finds all of them before syzbot ever gets its
hands on new code.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field
  2024-05-13 14:13             ` Marco Elver
@ 2024-05-14 16:54               ` Paul E. McKenney
  0 siblings, 0 replies; 70+ messages in thread
From: Paul E. McKenney @ 2024-05-14 16:54 UTC (permalink / raw)
  To: Marco Elver
  Cc: Oleg Nesterov, Uladzislau Rezki (Sony),
	RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, Joel Fernandes,
	LKML, Oleksiy Avramchenko, Frederic Weisbecker, Peter Zijlstra

On Mon, May 13, 2024 at 04:13:35PM +0200, Marco Elver wrote:
> On Fri, 10 May 2024 at 16:11, Paul E. McKenney <paulmck@kernel.org> wrote:
> [...]
> > > > Does this mean that KCSAN/etc treats the files in kernel/rcu/
> > > > differently than the "Rest of Kernel"? Or what?
> > > >
> > > > And how is it enforced?
> > >
> > > I can only find the strnstr(buf, "rcu") checks in skip_report(),
> > > but they only cover the KCSAN_REPORT_VALUE_CHANGE_ONLY case...
> >
> > Huh, new one on me!  When I run KCSAN, I set CONFIG_KCSAN_STRICT=y,
> > which implies CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n, which should
> > prevent skip_report() from even being invoked.
> 
> The strnstr hack goes back to the first version of KCSAN released in
> v5.8 [1]. It was added in response to Paul wanting the "stricter"
> treatment for RCU even while KCSAN was still in development, and back
> then syzbot was the first test system using KCSAN. Shortly after, Paul
> added a KCSAN config for rcutorture with a laundry list of options to
> make KCSAN "strict" (before we eventually added CONFIG_KCSAN_STRICT
> which greatly simplified that).
> 
> While the strnstr(.., "rcu") rules are redundant when using the
> stricter rules (at least CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n is
> set), we're keeping the "rcu" special case around because there are
> test robots and fuzzers that use the default KCSAN config (unlike
> rcutorture). And because we know that RCU likes the stricter rules,
> the "value change only" filter is ignored in all KCSAN configs for
> rcu-related functions.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/kcsan/report.c?id=v5.8

Thank you for the background information!

> Back then syzbot occasionally reported RCU data races, but these days
> rcutorture probably finds all of them before syzbot ever gets its
> hands on new code.

I do try my best.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2024-05-14 16:54 UTC | newest]

Thread overview: 70+ messages
2024-05-07  9:34 [PATCH 00/48] RCU changes for v6.10 Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 01/48] scftorture: Increase memory provided to guest OS Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 02/48] rcutorture: Disable tracing to permit Tasks Rude RCU testing Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 03/48] rcu: Add lockdep checks and kernel-doc header to rcu_softirq_qs() Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 04/48] rcutorture: Enable RCU priority boosting for TREE09 Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 05/48] rcutorture: Dump # online CPUs on insufficient cb-flood laundering Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 06/48] rcutorture: Dump GP kthread state " Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 07/48] rcutorture: ASSERT_EXCLUSIVE_WRITER() for ->rtort_pipe_count updates Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 08/48] rcu-tasks: Make Tasks RCU wait idly for grace-period delays Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 09/48] doc: Remove references to arrayRCU.rst Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 10/48] MAINTAINERS: Update Neeraj's email address Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 11/48] bpf: Select new NEED_TASKS_RCU Kconfig option Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 12/48] arch: " Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 13/48] rcu: Add data structures for synchronize_rcu() Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 14/48] tracing: Select new NEED_TASKS_RCU Kconfig option Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 15/48] bpf: Choose RCU Tasks based on TASKS_RCU rather than PREEMPTION Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 16/48] ftrace: " Uladzislau Rezki (Sony)
2024-05-07  9:34 ` [PATCH 17/48] rcu: Update lockdep while in RCU read-side critical section Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 18/48] rcu: Make TINY_RCU depend on !PREEMPT_RCU rather than !PREEMPTION Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 19/48] srcu: Make Tiny SRCU explicitly disable preemption Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 20/48] rcu: Create NEED_TASKS_RCU to factor out enablement logic Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 21/48] rcu: Remove redundant BH disabling in TINY_RCU Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 22/48] rcu: Make Tiny RCU explicitly disable preemption Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 23/48] rcu: Remove redundant READ_ONCE() of rcu_state.gp_flags in tree.c Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 24/48] rcu: Bring diagnostic read of rcu_state.gp_flags into alignment Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 25/48] rcu: Mark writes to rcu_sync ->gp_count field Uladzislau Rezki (Sony)
2024-05-07 14:54   ` Oleg Nesterov
2024-05-07 17:44     ` Paul E. McKenney
2024-05-09 15:13       ` Oleg Nesterov
2024-05-10  3:59         ` Paul E. McKenney
2024-05-10 11:31           ` Oleg Nesterov
2024-05-10 14:00             ` Alan Huang
2024-05-10 14:04             ` Paul E. McKenney
2024-05-12 10:53               ` Oleg Nesterov
2024-05-12 14:57                 ` Paul E. McKenney
2024-05-10 13:18       ` Oleg Nesterov
2024-05-10 13:50         ` Oleg Nesterov
2024-05-10 14:11           ` Paul E. McKenney
2024-05-12 10:41             ` Oleg Nesterov
2024-05-13 14:13             ` Marco Elver
2024-05-14 16:54               ` Paul E. McKenney
2024-05-10 13:58         ` Paul E. McKenney
2024-05-12 11:19   ` [PATCH] rcu/sync: don't read rcu_sync->gp_count lockless Oleg Nesterov
2024-05-12 15:10     ` Paul E. McKenney
2024-05-12 16:55       ` Oleg Nesterov
2024-05-12 17:11         ` Paul E. McKenney
2024-05-12 18:06           ` Oleg Nesterov
2024-05-07  9:35 ` [PATCH 26/48] rcu: Mark loads from rcu_state.n_online_cpus Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 27/48] rcu: Make hotplug operations track GP state, not flags Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 28/48] rcu: Inform KCSAN of one-byte cmpxchg() in rcu_trc_cmpxchg_need_qs() Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 29/48] rcu: Remove redundant CONFIG_PROVE_RCU #if condition Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 30/48] rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE() Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 31/48] rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 32/48] rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 33/48] rcu: Mollify sparse with RCU guard Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 34/48] rcu: Fix buffer overflow in print_cpu_stall_info() Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 35/48] rcu: Reduce synchronize_rcu() latency Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 36/48] rcu: Add a trace event for synchronize_rcu_normal() Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 37/48] rcu: Support direct wake-up of synchronize_rcu() users Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 38/48] rcu: Allocate WQ with WQ_MEM_RECLAIM bit set Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 39/48] rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE() Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 40/48] rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 41/48] rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 42/48] rcutorture: Use the gp_kthread_dbg operation specified by cur_ops Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 43/48] rcutorture: Make rcutorture support print rcu-tasks gp state Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 44/48] rcutorture: Removing redundant function pointer initialization Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 45/48] rcutorture: Make stall-tasks directly exit when rcutorture tests end Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 46/48] rcutorture: Fix invalid context warning when enable srcu barrier testing Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 47/48] torture: Scale --do-kvfree test time Uladzislau Rezki (Sony)
2024-05-07  9:35 ` [PATCH 48/48] rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test Uladzislau Rezki (Sony)
