* [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

This series merges the RCU dynticks counter and the context tracking
state updates into a single atomic operation. This may serve several
purposes:

1) Improve CPU isolation by deferring some disturbances until the
   sensitive userspace workload completes and enters the kernel. This
   can take several forms: for example smp_call_function_housekeeping()
   or on_each_housekeeping_cpu() could enqueue and execute work on all
   housekeeping CPUs, while an atomic operation on ct->state defers the
   work on nohz_full CPUs until they run in the kernel (or IPIs them if
   they are already in kernel mode). See this proposal by Peter:
   https://lore.kernel.org/all/20210929151723.162004989@infradead.org/#r
   and the rough sketch after this list.

2) Unearth sysidle (https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=sysidle.2017.05.11a&id=fe5ac724d81a3c7803e60c2232718f212f3f38d4)
   This feature allowed shutting down the tick on the last housekeeping
   CPU once the rest of the system was fully idle. It needed proper,
   fully ordered context tracking for that.
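
To illustrate 1), here is a rough sketch of the merged-state idea. The
names and bit layout below (CT_EQS_IDX, CT_WORK_FLAG, the *_sketch()
helpers) are illustrative assumptions, not this series' actual layout
or API. The point is that one fully ordered atomic_t carries both the
CONTEXT_* state and the extended quiescent state counter, so a remote
CPU can atomically decide between deferring work and sending an IPI:

	#include <linux/atomic.h>
	#include <linux/types.h>

	enum ctx_state { CONTEXT_KERNEL = 0, CONTEXT_USER = 1, CONTEXT_GUEST = 2 };

	#define CT_STATE_MASK	0x3	/* low bits: current CONTEXT_* state */
	#define CT_WORK_FLAG	0x4	/* hypothetical deferred work bit */
	#define CT_EQS_IDX	0x8	/* EQS counter increment */

	static atomic_t ct_combined = ATOMIC_INIT(CT_EQS_IDX);

	/* Local CPU entering userspace: state + counter in one RMW */
	static void ct_user_enter_sketch(void)
	{
		atomic_add(CT_EQS_IDX + CONTEXT_USER, &ct_combined);
	}

	/* Remote CPU: try to defer work instead of disturbing the target */
	static bool ct_try_defer_work(void)
	{
		int old = atomic_read(&ct_combined);

		if ((old & CT_STATE_MASK) != CONTEXT_USER)
			return false;	/* target in kernel: IPI it instead */

		/* Fails if the target re-entered the kernel concurrently */
		return atomic_cmpxchg(&ct_combined, old,
				      old | CT_WORK_FLAG) == old;
	}

	/* Local CPU back in the kernel: clear the state, bump the counter,
	 * then notice and run any work that was deferred meanwhile */
	static void ct_kernel_enter_sketch(void)
	{
		int old = atomic_fetch_add(CT_EQS_IDX - CONTEXT_USER,
					   &ct_combined);

		if (old & CT_WORK_FLAG) {
			atomic_andnot(CT_WORK_FLAG, &ct_combined);
			/* run the deferred work here */
		}
	}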

Inspired by Peterz: https://lore.kernel.org/all/20210929151723.162004989@infradead.org

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	rcu/context-tracking

HEAD: e4eaff86ec91c1cbde9a113cf5232dac9f897337

Thanks,
	Frederic
---

Frederic Weisbecker (19):
      context_tracking: Rename __context_tracking_enter/exit() to __ct_user_enter/exit()
      context_tracking: Rename context_tracking_user_enter/exit() to user_enter/exit_callable()
      context_tracking: Rename context_tracking_enter/exit() to ct_user_enter/exit()
      context_tracking: Rename context_tracking_cpu_set() to context_tracking_cpu_track_user()
      context_tracking: Split user tracking Kconfig
      context_tracking: Take idle eqs entrypoints over RCU
      context_tracking: Take IRQ eqs entrypoints over RCU
      context_tracking: Take NMI eqs entrypoints over RCU
      rcu/context-tracking: Remove rcu_irq_enter/exit()
      rcu/context_tracking: Move dynticks counter to context tracking
      rcu/context_tracking: Move dynticks_nesting to context tracking
      rcu/context_tracking: Move dynticks_nmi_nesting to context tracking
      rcu/context-tracking: Move deferred nocb resched to context tracking
      rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
      rcu/context-tracking: Remove unused and/or unnecessary middle functions
      context_tracking: Convert state to atomic_t
      rcu/context-tracking: Use accessor for dynticks counter value
      rcu/context_tracking: Merge dynticks counter and context tracking states
      context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking


 .../RCU/Design/Requirements/Requirements.rst       |  10 +-
 Documentation/RCU/stallwarn.rst                    |   6 +-
 .../time/context-tracking/arch-support.txt         |   6 +-
 arch/Kconfig                                       |   8 +-
 arch/arm/Kconfig                                   |   2 +-
 arch/arm/kernel/entry-common.S                     |   4 +-
 arch/arm/kernel/entry-header.S                     |  12 +-
 arch/arm/mach-imx/cpuidle-imx6q.c                  |   5 +-
 arch/arm64/Kconfig                                 |   2 +-
 arch/arm64/kernel/entry-common.c                   |  14 +-
 arch/csky/Kconfig                                  |   2 +-
 arch/csky/kernel/entry.S                           |   8 +-
 arch/mips/Kconfig                                  |   2 +-
 arch/powerpc/Kconfig                               |   2 +-
 arch/powerpc/include/asm/context_tracking.h        |   2 +-
 arch/riscv/Kconfig                                 |   2 +-
 arch/riscv/kernel/entry.S                          |  12 +-
 arch/sparc/Kconfig                                 |   2 +-
 arch/sparc/kernel/rtrap_64.S                       |   2 +-
 arch/x86/Kconfig                                   |   4 +-
 arch/x86/mm/fault.c                                |   2 +-
 drivers/acpi/processor_idle.c                      |   5 +-
 drivers/cpuidle/cpuidle-psci.c                     |   8 +-
 drivers/cpuidle/cpuidle.c                          |   9 +-
 include/linux/context_tracking.h                   |  81 ++--
 include/linux/context_tracking_irq.h               |  21 +
 include/linux/context_tracking_state.h             |  92 +++-
 include/linux/entry-common.h                       |  10 +-
 include/linux/hardirq.h                            |  12 +-
 include/linux/rcupdate.h                           |   7 +-
 include/linux/rcutiny.h                            |   6 -
 include/linux/rcutree.h                            |  15 +-
 include/linux/tracepoint.h                         |   4 +-
 init/Kconfig                                       |   4 +-
 kernel/context_tracking.c                          | 526 +++++++++++++++++++--
 kernel/cpu_pm.c                                    |   8 +-
 kernel/entry/common.c                              |  16 +-
 kernel/extable.c                                   |   4 +-
 kernel/locking/lockdep.c                           |   2 +-
 kernel/rcu/Kconfig                                 |   2 +
 kernel/rcu/rcu.h                                   |   4 -
 kernel/rcu/tree.c                                  | 478 ++-----------------
 kernel/rcu/tree.h                                  |   8 -
 kernel/rcu/tree_exp.h                              |   2 +-
 kernel/rcu/tree_plugin.h                           |  36 +-
 kernel/rcu/tree_stall.h                            |   7 +-
 kernel/rcu/update.c                                |   2 +-
 kernel/sched/core.c                                |   2 +-
 kernel/sched/idle.c                                |  11 +-
 kernel/softirq.c                                   |   4 +-
 kernel/time/Kconfig                                |  22 +-
 kernel/time/tick-sched.c                           |   2 +-
 kernel/trace/trace.c                               |   8 +-
 53 files changed, 793 insertions(+), 734 deletions(-)


* [PATCH 01/19] context_tracking: Rename __context_tracking_enter/exit() to __ct_user_enter/exit()
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

The context tracking namespace is going to expand and some new
functions will require even longer names. Start shrinking the
context_tracking prefix to "ct", as is already the case for some
existing macros; this will make the introduction of new functions
easier.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking.h | 12 ++++++------
 kernel/context_tracking.c        | 20 ++++++++++----------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 7a14807c9d1a..773035124bad 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -14,8 +14,8 @@
 extern void context_tracking_cpu_set(int cpu);
 
 /* Called with interrupts disabled.  */
-extern void __context_tracking_enter(enum ctx_state state);
-extern void __context_tracking_exit(enum ctx_state state);
+extern void __ct_user_enter(enum ctx_state state);
+extern void __ct_user_exit(enum ctx_state state);
 
 extern void context_tracking_enter(enum ctx_state state);
 extern void context_tracking_exit(enum ctx_state state);
@@ -38,13 +38,13 @@ static inline void user_exit(void)
 static __always_inline void user_enter_irqoff(void)
 {
 	if (context_tracking_enabled())
-		__context_tracking_enter(CONTEXT_USER);
+		__ct_user_enter(CONTEXT_USER);
 
 }
 static __always_inline void user_exit_irqoff(void)
 {
 	if (context_tracking_enabled())
-		__context_tracking_exit(CONTEXT_USER);
+		__ct_user_exit(CONTEXT_USER);
 }
 
 static inline enum ctx_state exception_enter(void)
@@ -74,7 +74,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 static __always_inline bool context_tracking_guest_enter(void)
 {
 	if (context_tracking_enabled())
-		__context_tracking_enter(CONTEXT_GUEST);
+		__ct_user_enter(CONTEXT_GUEST);
 
 	return context_tracking_enabled_this_cpu();
 }
@@ -82,7 +82,7 @@ static __always_inline bool context_tracking_guest_enter(void)
 static __always_inline void context_tracking_guest_exit(void)
 {
 	if (context_tracking_enabled())
-		__context_tracking_exit(CONTEXT_GUEST);
+		__ct_user_exit(CONTEXT_GUEST);
 }
 
 /**
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 36a98c48aedc..ad2a973393a6 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -51,15 +51,15 @@ static __always_inline void context_tracking_recursion_exit(void)
 }
 
 /**
- * context_tracking_enter - Inform the context tracking that the CPU is going
- *                          enter user or guest space mode.
+ * __ct_user_enter - Inform the context tracking that the CPU is going
+ *		     to enter user or guest space mode.
  *
  * This function must be called right before we switch from the kernel
  * to user or guest space, when it's guaranteed the remaining kernel
  * instructions to execute won't use any RCU read side critical section
  * because this function sets RCU in extended quiescent state.
  */
-void noinstr __context_tracking_enter(enum ctx_state state)
+void noinstr __ct_user_enter(enum ctx_state state)
 {
 	/* Kernel threads aren't supposed to go to userspace */
 	WARN_ON_ONCE(!current->mm);
@@ -101,7 +101,7 @@ void noinstr __context_tracking_enter(enum ctx_state state)
 	}
 	context_tracking_recursion_exit();
 }
-EXPORT_SYMBOL_GPL(__context_tracking_enter);
+EXPORT_SYMBOL_GPL(__ct_user_enter);
 
 void context_tracking_enter(enum ctx_state state)
 {
@@ -119,7 +119,7 @@ void context_tracking_enter(enum ctx_state state)
 		return;
 
 	local_irq_save(flags);
-	__context_tracking_enter(state);
+	__ct_user_enter(state);
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_enter);
@@ -132,8 +132,8 @@ void context_tracking_user_enter(void)
 NOKPROBE_SYMBOL(context_tracking_user_enter);
 
 /**
- * context_tracking_exit - Inform the context tracking that the CPU is
- *                         exiting user or guest mode and entering the kernel.
+ * __ct_user_exit - Inform the context tracking that the CPU is
+ * 		    exiting user or guest mode and entering the kernel.
  *
  * This function must be called after we entered the kernel from user or
  * guest space before any use of RCU read side critical section. This
@@ -143,7 +143,7 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
  * This call supports re-entrancy. This way it can be called from any exception
  * handler without needing to know if we came from userspace or not.
  */
-void noinstr __context_tracking_exit(enum ctx_state state)
+void noinstr __ct_user_exit(enum ctx_state state)
 {
 	if (!context_tracking_recursion_enter())
 		return;
@@ -166,7 +166,7 @@ void noinstr __context_tracking_exit(enum ctx_state state)
 	}
 	context_tracking_recursion_exit();
 }
-EXPORT_SYMBOL_GPL(__context_tracking_exit);
+EXPORT_SYMBOL_GPL(__ct_user_exit);
 
 void context_tracking_exit(enum ctx_state state)
 {
@@ -176,7 +176,7 @@ void context_tracking_exit(enum ctx_state state)
 		return;
 
 	local_irq_save(flags);
-	__context_tracking_exit(state);
+	__ct_user_exit(state);
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_exit);
-- 
2.25.1



* [PATCH 02/19] context_tracking: Rename context_tracking_user_enter/exit() to user_enter/exit_callable()
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

context_tracking_user_enter() and context_tracking_user_exit() are the
ASM-callable versions of user_enter() and user_exit(), for
architectures that didn't manage to check the context tracking static
key from ASM. Change those function names to better reflect their
purpose.
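
For context, a rough sketch of the difference these wrappers cover
(user_enter_irqoff() and __ct_user_enter() are from the previous
patch; arch_c_entry_to_user() is a hypothetical placeholder, not a
real arch hook): entry code written in C can test the context tracking
static key inline and skip the call entirely when tracking is off,
whereas ASM entry code that can't test the key must branch
unconditionally into C.

	#include <linux/context_tracking.h>

	/* C entry code: the static branch is patched at runtime, so the
	 * call is elided when context tracking is disabled */
	static __always_inline void arch_c_entry_to_user(void)
	{
		if (context_tracking_enabled())
			__ct_user_enter(CONTEXT_USER);
	}

The renamed user_enter_callable()/user_exit_callable() in the diff
below serve the second case: ASM does an unconditional "bl
user_enter_callable", and user_enter()/user_exit() re-check the static
key on the C side.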

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 arch/arm/kernel/entry-header.S   |  8 ++++----
 arch/csky/kernel/entry.S         |  4 ++--
 arch/riscv/kernel/entry.S        |  6 +++---
 include/linux/context_tracking.h |  4 ++--
 kernel/context_tracking.c        | 18 ++++++++++++++----
 5 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index ae24dd54e9ef..3af2a521e1d6 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -364,10 +364,10 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 	.if	\save
 	stmdb   sp!, {r0-r3, ip, lr}
-	bl	context_tracking_user_exit
+	bl	user_exit_callable
 	ldmia	sp!, {r0-r3, ip, lr}
 	.else
-	bl	context_tracking_user_exit
+	bl	user_exit_callable
 	.endif
 #endif
 	.endm
@@ -376,10 +376,10 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 	.if	\save
 	stmdb   sp!, {r0-r3, ip, lr}
-	bl	context_tracking_user_enter
+	bl	user_enter_callable
 	ldmia	sp!, {r0-r3, ip, lr}
 	.else
-	bl	context_tracking_user_enter
+	bl	user_enter_callable
 	.endif
 #endif
 	.endm
diff --git a/arch/csky/kernel/entry.S b/arch/csky/kernel/entry.S
index a4ababf25e24..bc734d17c16f 100644
--- a/arch/csky/kernel/entry.S
+++ b/arch/csky/kernel/entry.S
@@ -23,7 +23,7 @@
 	mfcr	a0, epsr
 	btsti	a0, 31
 	bt	1f
-	jbsr	context_tracking_user_exit
+	jbsr	user_exit_callable
 	ldw	a0, (sp, LSAVE_A0)
 	ldw	a1, (sp, LSAVE_A1)
 	ldw	a2, (sp, LSAVE_A2)
@@ -160,7 +160,7 @@ ret_from_exception:
 	cmpnei	r10, 0
 	bt	exit_work
 #ifdef CONFIG_CONTEXT_TRACKING
-	jbsr	context_tracking_user_enter
+	jbsr	user_enter_callable
 #endif
 1:
 #ifdef CONFIG_PREEMPTION
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index ed29e9c8f660..5fbaa7be18a2 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -112,11 +112,11 @@ _save_context:
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING
-	/* If previous state is in user mode, call context_tracking_user_exit. */
+	/* If previous state is in user mode, call user_exit_callable(). */
 	li   a0, SR_PP
 	and a0, s1, a0
 	bnez a0, skip_context_tracking
-	call context_tracking_user_exit
+	call user_exit_callable
 skip_context_tracking:
 #endif
 
@@ -252,7 +252,7 @@ resume_userspace:
 	bnez s1, work_pending
 
 #ifdef CONFIG_CONTEXT_TRACKING
-	call context_tracking_user_enter
+	call user_enter_callable
 #endif
 
 	/* Save unwound kernel stack pointer in thread_info */
diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 773035124bad..69532cd18f72 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -19,8 +19,8 @@ extern void __ct_user_exit(enum ctx_state state);
 
 extern void context_tracking_enter(enum ctx_state state);
 extern void context_tracking_exit(enum ctx_state state);
-extern void context_tracking_user_enter(void);
-extern void context_tracking_user_exit(void);
+extern void user_enter_callable(void);
+extern void user_exit_callable(void);
 
 static inline void user_enter(void)
 {
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index ad2a973393a6..83e050675b23 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -125,11 +125,16 @@ void context_tracking_enter(enum ctx_state state)
 NOKPROBE_SYMBOL(context_tracking_enter);
 EXPORT_SYMBOL_GPL(context_tracking_enter);
 
-void context_tracking_user_enter(void)
+/**
+ * user_enter_callable() - Unfortunate ASM callable version of user_enter() for
+ * 			   archs that didn't manage to check the context tracking
+ * 			   static key from low level code.
+ */
+void user_enter_callable(void)
 {
 	user_enter();
 }
-NOKPROBE_SYMBOL(context_tracking_user_enter);
+NOKPROBE_SYMBOL(user_enter_callable);
 
 /**
  * __ct_user_exit - Inform the context tracking that the CPU is
@@ -182,11 +187,16 @@ void context_tracking_exit(enum ctx_state state)
 NOKPROBE_SYMBOL(context_tracking_exit);
 EXPORT_SYMBOL_GPL(context_tracking_exit);
 
-void context_tracking_user_exit(void)
+/**
+ * user_exit_callable() - Unfortunate ASM callable version of user_exit() for
+ * 			  archs that didn't manage to check the context tracking
+ * 			  static key from low level code.
+ */
+void user_exit_callable(void)
 {
 	user_exit();
 }
-NOKPROBE_SYMBOL(context_tracking_user_exit);
+NOKPROBE_SYMBOL(user_exit_callable);
 
 void __init context_tracking_cpu_set(int cpu)
 {
-- 
2.25.1



* [PATCH 03/19] context_tracking: Rename context_tracking_enter/exit() to ct_user_enter/exit()
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

context_tracking_enter() and context_tracking_exit() have confusing
names that don't convey that they refer to user/guest state.

Use more self-explanatory names carrying the new shortened "ct"
context tracking prefix instead.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking.h | 13 +++++++------
 kernel/context_tracking.c        | 12 ++++++------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 69532cd18f72..7a5f04ae1758 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -17,21 +17,22 @@ extern void context_tracking_cpu_set(int cpu);
 extern void __ct_user_enter(enum ctx_state state);
 extern void __ct_user_exit(enum ctx_state state);
 
-extern void context_tracking_enter(enum ctx_state state);
-extern void context_tracking_exit(enum ctx_state state);
+extern void ct_user_enter(enum ctx_state state);
+extern void ct_user_exit(enum ctx_state state);
+
 extern void user_enter_callable(void);
 extern void user_exit_callable(void);
 
 static inline void user_enter(void)
 {
 	if (context_tracking_enabled())
-		context_tracking_enter(CONTEXT_USER);
+		ct_user_enter(CONTEXT_USER);
 
 }
 static inline void user_exit(void)
 {
 	if (context_tracking_enabled())
-		context_tracking_exit(CONTEXT_USER);
+		ct_user_exit(CONTEXT_USER);
 }
 
 /* Called with interrupts disabled.  */
@@ -57,7 +58,7 @@ static inline enum ctx_state exception_enter(void)
 
 	prev_ctx = this_cpu_read(context_tracking.state);
 	if (prev_ctx != CONTEXT_KERNEL)
-		context_tracking_exit(prev_ctx);
+		ct_user_exit(prev_ctx);
 
 	return prev_ctx;
 }
@@ -67,7 +68,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK) &&
 	    context_tracking_enabled()) {
 		if (prev_ctx != CONTEXT_KERNEL)
-			context_tracking_enter(prev_ctx);
+			ct_user_enter(prev_ctx);
 	}
 }
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 83e050675b23..e8e58c10f135 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -103,7 +103,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
 }
 EXPORT_SYMBOL_GPL(__ct_user_enter);
 
-void context_tracking_enter(enum ctx_state state)
+void ct_user_enter(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -122,8 +122,8 @@ void context_tracking_enter(enum ctx_state state)
 	__ct_user_enter(state);
 	local_irq_restore(flags);
 }
-NOKPROBE_SYMBOL(context_tracking_enter);
-EXPORT_SYMBOL_GPL(context_tracking_enter);
+NOKPROBE_SYMBOL(ct_user_enter);
+EXPORT_SYMBOL_GPL(ct_user_enter);
 
 /**
  * user_enter_callable() - Unfortunate ASM callable version of user_enter() for
@@ -173,7 +173,7 @@ void noinstr __ct_user_exit(enum ctx_state state)
 }
 EXPORT_SYMBOL_GPL(__ct_user_exit);
 
-void context_tracking_exit(enum ctx_state state)
+void ct_user_exit(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -184,8 +184,8 @@ void context_tracking_exit(enum ctx_state state)
 	__ct_user_exit(state);
 	local_irq_restore(flags);
 }
-NOKPROBE_SYMBOL(context_tracking_exit);
-EXPORT_SYMBOL_GPL(context_tracking_exit);
+NOKPROBE_SYMBOL(ct_user_exit);
+EXPORT_SYMBOL_GPL(ct_user_exit);
 
 /**
  * user_exit_callable() - Unfortunate ASM callable version of user_exit() for
-- 
2.25.1



* [PATCH 04/19] context_tracking: Rename context_tracking_cpu_set() to context_tracking_cpu_track_user()
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

context_tracking_cpu_set() is called in order to tell a CPU to track
user/kernel transitions. Since context tracking is going to expand to
also track transitions to and from idle/IRQ/NMI, this function's name
becomes too broad and needs to be made more specific.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking.h | 2 +-
 kernel/context_tracking.c        | 4 ++--
 kernel/time/tick-sched.c         | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 7a5f04ae1758..40badd62ad56 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -11,7 +11,7 @@
 
 
 #ifdef CONFIG_CONTEXT_TRACKING
-extern void context_tracking_cpu_set(int cpu);
+extern void context_tracking_cpu_track_user(int cpu);
 
 /* Called with interrupts disabled.  */
 extern void __ct_user_enter(enum ctx_state state);
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index e8e58c10f135..7b6643d2075d 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -198,7 +198,7 @@ void user_exit_callable(void)
 }
 NOKPROBE_SYMBOL(user_exit_callable);
 
-void __init context_tracking_cpu_set(int cpu)
+void __init context_tracking_cpu_track_user(int cpu)
 {
 	static __initdata bool initialized = false;
 
@@ -228,6 +228,6 @@ void __init context_tracking_init(void)
 	int cpu;
 
 	for_each_possible_cpu(cpu)
-		context_tracking_cpu_set(cpu);
+		context_tracking_cpu_track_user(cpu);
 }
 #endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2d76c91b85de..794410da3ee1 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -571,7 +571,7 @@ void __init tick_nohz_init(void)
 	}
 
 	for_each_cpu(cpu, tick_nohz_full_mask)
-		context_tracking_cpu_set(cpu);
+		context_tracking_cpu_track_user(cpu);
 
 	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
 					"kernel/nohz:predown", NULL,
-- 
2.25.1



* [PATCH 05/19] context_tracking: Split user tracking Kconfig
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Context tracking is going to be used not only to track user
transitions but also idle/IRQ/NMI transitions. The user tracking part
will then become a separate feature. Prepare the Kconfig for that.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 .../time/context-tracking/arch-support.txt    |  6 ++---
 arch/Kconfig                                  |  4 ++--
 arch/arm/Kconfig                              |  2 +-
 arch/arm/kernel/entry-common.S                |  4 ++--
 arch/arm/kernel/entry-header.S                |  4 ++--
 arch/arm64/Kconfig                            |  2 +-
 arch/csky/Kconfig                             |  2 +-
 arch/csky/kernel/entry.S                      |  4 ++--
 arch/mips/Kconfig                             |  2 +-
 arch/powerpc/Kconfig                          |  2 +-
 arch/powerpc/include/asm/context_tracking.h   |  2 +-
 arch/riscv/Kconfig                            |  2 +-
 arch/riscv/kernel/entry.S                     |  6 ++---
 arch/sparc/Kconfig                            |  2 +-
 arch/sparc/kernel/rtrap_64.S                  |  2 +-
 arch/x86/Kconfig                              |  4 ++--
 include/linux/context_tracking.h              | 12 +++++-----
 include/linux/context_tracking_state.h        |  4 ++--
 init/Kconfig                                  |  4 ++--
 kernel/context_tracking.c                     |  6 ++++-
 kernel/sched/core.c                           |  2 +-
 kernel/time/Kconfig                           | 22 +++++++++++--------
 22 files changed, 54 insertions(+), 46 deletions(-)

diff --git a/Documentation/features/time/context-tracking/arch-support.txt b/Documentation/features/time/context-tracking/arch-support.txt
index 4ed116c2ec39..0696fd08429e 100644
--- a/Documentation/features/time/context-tracking/arch-support.txt
+++ b/Documentation/features/time/context-tracking/arch-support.txt
@@ -1,7 +1,7 @@
 #
-# Feature name:          context-tracking
-#         Kconfig:       HAVE_CONTEXT_TRACKING
-#         description:   arch supports context tracking for NO_HZ_FULL
+# Feature name:          user-context-tracking
+#         Kconfig:       HAVE_CONTEXT_TRACKING_USER
+#         description:   arch supports user context tracking for NO_HZ_FULL
 #
     -----------------------
     |         arch |status|
diff --git a/arch/Kconfig b/arch/Kconfig
index 678a80713b21..1a3b79cfc9e3 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -762,7 +762,7 @@ config HAVE_ARCH_WITHIN_STACK_FRAMES
 	  and similar) by implementing an inline arch_within_stack_frames(),
 	  which is used by CONFIG_HARDENED_USERCOPY.
 
-config HAVE_CONTEXT_TRACKING
+config HAVE_CONTEXT_TRACKING_USER
 	bool
 	help
 	  Provide kernel/user boundaries probes necessary for subsystems
@@ -773,7 +773,7 @@ config HAVE_CONTEXT_TRACKING
 	  protected inside rcu_irq_enter/rcu_irq_exit() but preemption or signal
 	  handling on irq exit still need to be protected.
 
-config HAVE_CONTEXT_TRACKING_OFFSTACK
+config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
 	bool
 	help
 	  Architecture neither relies on exception_enter()/exception_exit()
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fabe39169b12..2c5688f20421 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -81,7 +81,7 @@ config ARM
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE if ARM_LPAE
 	select HAVE_ARM_SMCCC if CPU_V7
 	select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32
-	select HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL
 	select HAVE_DMA_CONTIGUOUS if MMU
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index ac86c34682bb..5be34b7fe41e 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -26,7 +26,7 @@
 #include "entry-header.S"
 
 saved_psr	.req	r8
-#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING)
+#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING_USER)
 saved_pc	.req	r9
 #define TRACE(x...) x
 #else
@@ -36,7 +36,7 @@ saved_pc	.req	lr
 
 	.section .entry.text,"ax",%progbits
 	.align	5
-#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING) || \
+#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING_USER) || \
 	IS_ENABLED(CONFIG_DEBUG_RSEQ))
 /*
  * This is the fast syscall return path.  We do as little as possible here,
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 3af2a521e1d6..cd1ce0a9c652 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -361,7 +361,7 @@
  * between user and kernel mode.
  */
 	.macro ct_user_exit, save = 1
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	.if	\save
 	stmdb   sp!, {r0-r3, ip, lr}
 	bl	user_exit_callable
@@ -373,7 +373,7 @@
 	.endm
 
 	.macro ct_user_enter, save = 1
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	.if	\save
 	stmdb   sp!, {r0-r3, ip, lr}
 	bl	user_enter_callable
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6978140edfa4..96e75d7fa0a3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -169,7 +169,7 @@ config ARM64
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_CMPXCHG_DOUBLE
 	select HAVE_CMPXCHG_LOCAL
-	select HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 132f43f12dd8..c94cc907b828 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -42,7 +42,7 @@ config CSKY
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_SECCOMP_FILTER
-	select HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select HAVE_DEBUG_BUGVERBOSE
 	select HAVE_DEBUG_KMEMLEAK
diff --git a/arch/csky/kernel/entry.S b/arch/csky/kernel/entry.S
index bc734d17c16f..547b4cd1b24b 100644
--- a/arch/csky/kernel/entry.S
+++ b/arch/csky/kernel/entry.S
@@ -19,7 +19,7 @@
 .endm
 
 .macro	context_tracking
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	mfcr	a0, epsr
 	btsti	a0, 31
 	bt	1f
@@ -159,7 +159,7 @@ ret_from_exception:
 	and	r10, r9
 	cmpnei	r10, 0
 	bt	exit_work
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	jbsr	user_enter_callable
 #endif
 1:
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 058446f01487..efcab39667ea 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -55,7 +55,7 @@ config MIPS
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE if CPU_SUPPORTS_HUGEPAGES
 	select HAVE_ASM_MODVERSIONS
-	select HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_TIF_NOHZ
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b779603978e1..9a889f919fed 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -192,7 +192,7 @@ config PPC
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ASM_MODVERSIONS
-	select HAVE_CONTEXT_TRACKING		if PPC64
+	select HAVE_CONTEXT_TRACKING_USER		if PPC64
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DEBUG_STACKOVERFLOW
diff --git a/arch/powerpc/include/asm/context_tracking.h b/arch/powerpc/include/asm/context_tracking.h
index f2682b28b050..4b63931c49e0 100644
--- a/arch/powerpc/include/asm/context_tracking.h
+++ b/arch/powerpc/include/asm/context_tracking.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_POWERPC_CONTEXT_TRACKING_H
 #define _ASM_POWERPC_CONTEXT_TRACKING_H
 
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 #define SCHEDULE_USER bl	schedule_user
 #else
 #define SCHEDULE_USER bl	schedule
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5adcbd9b5e88..36953ec26294 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -80,7 +80,7 @@ config RISCV
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_VMAP_STACK if MMU && 64BIT
 	select HAVE_ASM_MODVERSIONS
-	select HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS if MMU
 	select HAVE_EBPF_JIT if MMU
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 5fbaa7be18a2..a773526fb3cc 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -111,7 +111,7 @@ _save_context:
 	call trace_hardirqs_off
 #endif
 
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	/* If previous state is in user mode, call user_exit_callable(). */
 	li   a0, SR_PP
 	and a0, s1, a0
@@ -176,7 +176,7 @@ handle_syscall:
 	 */
 	csrs CSR_STATUS, SR_IE
 #endif
-#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING)
+#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING_USER)
 	/* Recover a0 - a7 for system calls */
 	REG_L a0, PT_A0(sp)
 	REG_L a1, PT_A1(sp)
@@ -251,7 +251,7 @@ resume_userspace:
 	andi s1, s0, _TIF_WORK_MASK
 	bnez s1, work_pending
 
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	call user_enter_callable
 #endif
 
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 1cab1b284f1a..e736120f4333 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -71,7 +71,7 @@ config SPARC64
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_SYSCALL_TRACEPOINTS
-	select HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER
 	select HAVE_TIF_NOHZ
 	select HAVE_DEBUG_KMEMLEAK
 	select IOMMU_HELPER
diff --git a/arch/sparc/kernel/rtrap_64.S b/arch/sparc/kernel/rtrap_64.S
index c5fd4b450d9b..eef102765a7e 100644
--- a/arch/sparc/kernel/rtrap_64.S
+++ b/arch/sparc/kernel/rtrap_64.S
@@ -15,7 +15,7 @@
 #include <asm/visasm.h>
 #include <asm/processor.h>
 
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 # define SCHEDULE_USER schedule_user
 #else
 # define SCHEDULE_USER schedule
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ebe8fc76949a..fbda20f6cf08 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -182,8 +182,8 @@ config X86
 	select HAVE_ASM_MODVERSIONS
 	select HAVE_CMPXCHG_DOUBLE
 	select HAVE_CMPXCHG_LOCAL
-	select HAVE_CONTEXT_TRACKING		if X86_64
-	select HAVE_CONTEXT_TRACKING_OFFSTACK	if HAVE_CONTEXT_TRACKING
+	select HAVE_CONTEXT_TRACKING_USER		if X86_64
+	select HAVE_CONTEXT_TRACKING_USER_OFFSTACK	if HAVE_CONTEXT_TRACKING_USER
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_OBJTOOL_MCOUNT		if STACK_VALIDATION
 	select HAVE_DEBUG_KMEMLEAK
diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 40badd62ad56..75738f20e111 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -10,7 +10,7 @@
 #include <asm/ptrace.h>
 
 
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 extern void context_tracking_cpu_track_user(int cpu);
 
 /* Called with interrupts disabled.  */
@@ -52,7 +52,7 @@ static inline enum ctx_state exception_enter(void)
 {
 	enum ctx_state prev_ctx;
 
-	if (IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK) ||
+	if (IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK) ||
 	    !context_tracking_enabled())
 		return 0;
 
@@ -65,7 +65,7 @@ static inline enum ctx_state exception_enter(void)
 
 static inline void exception_exit(enum ctx_state prev_ctx)
 {
-	if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK) &&
+	if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK) &&
 	    context_tracking_enabled()) {
 		if (prev_ctx != CONTEXT_KERNEL)
 			ct_user_enter(prev_ctx);
@@ -109,14 +109,14 @@ static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
 static __always_inline bool context_tracking_guest_enter(void) { return false; }
 static inline void context_tracking_guest_exit(void) { }
 
-#endif /* !CONFIG_CONTEXT_TRACKING */
+#endif /* !CONFIG_CONTEXT_TRACKING_USER */
 
 #define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled() && (cond))
 
-#ifdef CONFIG_CONTEXT_TRACKING_FORCE
+#ifdef CONFIG_CONTEXT_TRACKING_USER_FORCE
 extern void context_tracking_init(void);
 #else
 static inline void context_tracking_init(void) { }
-#endif /* CONFIG_CONTEXT_TRACKING_FORCE */
+#endif /* CONFIG_CONTEXT_TRACKING_USER_FORCE */
 
 #endif
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 65a60d3313b0..64dbbb880378 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -22,7 +22,7 @@ struct context_tracking {
 	} state;
 };
 
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 extern struct static_key_false context_tracking_key;
 DECLARE_PER_CPU(struct context_tracking, context_tracking);
 
@@ -50,6 +50,6 @@ static inline bool context_tracking_in_user(void) { return false; }
 static inline bool context_tracking_enabled(void) { return false; }
 static inline bool context_tracking_enabled_cpu(int cpu) { return false; }
 static inline bool context_tracking_enabled_this_cpu(void) { return false; }
-#endif /* CONFIG_CONTEXT_TRACKING */
+#endif /* CONFIG_CONTEXT_TRACKING_USER */
 
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..22525443de90 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -498,11 +498,11 @@ config VIRT_CPU_ACCOUNTING_NATIVE
 
 config VIRT_CPU_ACCOUNTING_GEN
 	bool "Full dynticks CPU time accounting"
-	depends on HAVE_CONTEXT_TRACKING
+	depends on HAVE_CONTEXT_TRACKING_USER
 	depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
 	depends on GENERIC_CLOCKEVENTS
 	select VIRT_CPU_ACCOUNTING
-	select CONTEXT_TRACKING
+	select CONTEXT_TRACKING_USER
 	help
 	  Select this option to enable task and CPU time accounting on full
 	  dynticks systems. This accounting is implemented by watching every
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 7b6643d2075d..42054841af3f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -22,6 +22,8 @@
 #include <linux/export.h>
 #include <linux/kprobes.h>
 
+#ifdef CONFIG_CONTEXT_TRACKING_USER
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/context_tracking.h>
 
@@ -222,7 +224,7 @@ void __init context_tracking_cpu_track_user(int cpu)
 	initialized = true;
 }
 
-#ifdef CONFIG_CONTEXT_TRACKING_FORCE
+#ifdef CONFIG_CONTEXT_TRACKING_USER_FORCE
 void __init context_tracking_init(void)
 {
 	int cpu;
@@ -231,3 +233,5 @@ void __init context_tracking_init(void)
 		context_tracking_cpu_track_user(cpu);
 }
 #endif
+
+#endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2e4ae00e52d1..e79485afb58c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6398,7 +6398,7 @@ void __sched schedule_idle(void)
 	} while (need_resched());
 }
 
-#if defined(CONFIG_CONTEXT_TRACKING) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK)
+#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
 asmlinkage __visible void __sched schedule_user(void)
 {
 	/*
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 27b7868b5c30..aad89cc96787 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -111,7 +111,7 @@ config NO_HZ_FULL
 	# NO_HZ_COMMON dependency
 	# We need at least one periodic CPU for timekeeping
 	depends on SMP
-	depends on HAVE_CONTEXT_TRACKING
+	depends on HAVE_CONTEXT_TRACKING_USER
 	# VIRT_CPU_ACCOUNTING_GEN dependency
 	depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select NO_HZ_COMMON
@@ -140,28 +140,32 @@ endchoice
 config CONTEXT_TRACKING
        bool
 
-config CONTEXT_TRACKING_FORCE
-	bool "Force context tracking"
-	depends on CONTEXT_TRACKING
+config CONTEXT_TRACKING_USER
+       select CONTEXT_TRACKING
+       bool
+
+config CONTEXT_TRACKING_USER_FORCE
+	bool "Force user context tracking"
+	depends on CONTEXT_TRACKING_USER
 	default y if !NO_HZ_FULL
 	help
 	  The major pre-requirement for full dynticks to work is to
-	  support the context tracking subsystem. But there are also
+	  support the user context tracking subsystem. But there are also
 	  other dependencies to provide in order to make the full
 	  dynticks working.
 
 	  This option stands for testing when an arch implements the
-	  context tracking backend but doesn't yet fulfill all the
+	  user context tracking backend but doesn't yet fulfill all the
 	  requirements to make the full dynticks feature working.
 	  Without the full dynticks, there is no way to test the support
-	  for context tracking and the subsystems that rely on it: RCU
+	  for user context tracking and the subsystems that rely on it: RCU
 	  userspace extended quiescent state and tickless cputime
 	  accounting. This option copes with the absence of the full
-	  dynticks subsystem by forcing the context tracking on all
+	  dynticks subsystem by forcing the user context tracking on all
 	  CPUs in the system.
 
 	  Say Y only if you're working on the development of an
-	  architecture backend for the context tracking.
+	  architecture backend for the user context tracking.
 
 	  Say N otherwise, this option brings an overhead that you
 	  don't want in production.
-- 
2.25.1



* [PATCH 06/19] context_tracking: Take idle eqs entrypoints over RCU
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E. McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

The RCU dynticks counter is going to be merged into the context
tracking subsystem. Start with moving the idle extended quiescent
state entrypoints to context tracking. For now those are dumb
redirections to the existing RCU calls.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 Documentation/RCU/stallwarn.rst   |  4 ++--
 arch/arm/mach-imx/cpuidle-imx6q.c |  5 +++--
 drivers/acpi/processor_idle.c     |  5 +++--
 drivers/cpuidle/cpuidle.c         |  9 +++++----
 include/linux/context_tracking.h  |  8 ++++++++
 include/linux/rcupdate.h          |  2 +-
 kernel/context_tracking.c         | 12 ++++++++++++
 kernel/locking/lockdep.c          |  2 +-
 kernel/rcu/Kconfig                |  2 ++
 kernel/rcu/tree.c                 |  2 --
 kernel/rcu/update.c               |  2 +-
 kernel/sched/idle.c               | 11 ++++++-----
 12 files changed, 44 insertions(+), 20 deletions(-)

diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index 1d863b04727c..bdd52b40f307 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -97,8 +97,8 @@ warnings:
 	which will include additional debugging information.
 
 -	A low-level kernel issue that either fails to invoke one of the
-	variants of rcu_user_enter(), rcu_user_exit(), rcu_idle_enter(),
-	rcu_idle_exit(), rcu_irq_enter(), or rcu_irq_exit() on the one
+	variants of rcu_user_enter(), rcu_user_exit(), ct_idle_enter(),
+	ct_idle_exit(), rcu_irq_enter(), or rcu_irq_exit() on the one
 	hand, or that invokes one of them too many times on the other.
 	Historically, the most frequent issue has been an omission
 	of either irq_enter() or irq_exit(), which in turn invoke
diff --git a/arch/arm/mach-imx/cpuidle-imx6q.c b/arch/arm/mach-imx/cpuidle-imx6q.c
index 094337dc1bc7..d086cbae09c3 100644
--- a/arch/arm/mach-imx/cpuidle-imx6q.c
+++ b/arch/arm/mach-imx/cpuidle-imx6q.c
@@ -3,6 +3,7 @@
  * Copyright (C) 2012 Freescale Semiconductor, Inc.
  */
 
+#include <linux/context_tracking.h>
 #include <linux/cpuidle.h>
 #include <linux/module.h>
 #include <asm/cpuidle.h>
@@ -24,9 +25,9 @@ static int imx6q_enter_wait(struct cpuidle_device *dev,
 		imx6_set_lpm(WAIT_UNCLOCKED);
 	raw_spin_unlock(&cpuidle_lock);
 
-	rcu_idle_enter();
+	ct_idle_enter();
 	cpu_do_idle();
-	rcu_idle_exit();
+	ct_idle_exit();
 
 	raw_spin_lock(&cpuidle_lock);
 	if (num_idle_cpus-- == num_online_cpus())
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 86560a28751b..ce310c11f6ae 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -22,6 +22,7 @@
 #include <linux/cpu.h>
 #include <linux/minmax.h>
 #include <acpi/processor.h>
+#include <linux/context_tracking.h>
 
 /*
  * Include the apic definitions for x86 to have the APIC timer related defines
@@ -643,11 +644,11 @@ static int acpi_idle_enter_bm(struct cpuidle_driver *drv,
 		raw_spin_unlock(&c3_lock);
 	}
 
-	rcu_idle_enter();
+	ct_idle_enter();
 
 	acpi_idle_do_entry(cx);
 
-	rcu_idle_exit();
+	ct_idle_exit();
 
 	/* Re-enable bus master arbitration */
 	if (dis_bm) {
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index ef2ea1b12cd8..62dd956025f3 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -23,6 +23,7 @@
 #include <linux/suspend.h>
 #include <linux/tick.h>
 #include <linux/mmu_context.h>
+#include <linux/context_tracking.h>
 #include <trace/events/power.h>
 
 #include "cpuidle.h"
@@ -150,12 +151,12 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv,
 	 */
 	stop_critical_timings();
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		rcu_idle_enter();
+		ct_idle_enter();
 	target_state->enter_s2idle(dev, drv, index);
 	if (WARN_ON_ONCE(!irqs_disabled()))
 		local_irq_disable();
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		rcu_idle_exit();
+		ct_idle_exit();
 	tick_unfreeze();
 	start_critical_timings();
 
@@ -233,10 +234,10 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 
 	stop_critical_timings();
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		rcu_idle_enter();
+		ct_idle_enter();
 	entered_state = target_state->enter(dev, drv, index);
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		rcu_idle_exit();
+		ct_idle_exit();
 	start_critical_timings();
 
 	sched_clock_idle_wakeup_event();
diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 75738f20e111..52a2e23d5107 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -119,4 +119,12 @@ extern void context_tracking_init(void);
 static inline void context_tracking_init(void) { }
 #endif /* CONFIG_CONTEXT_TRACKING_USER_FORCE */
 
+#ifdef CONFIG_CONTEXT_TRACKING
+extern void ct_idle_enter(void);
+extern void ct_idle_exit(void);
+#else
+static inline void ct_idle_enter(void) { }
+static inline void ct_idle_exit(void) { }
+#endif /* !CONFIG_CONTEXT_TRACKING */
+
 #endif
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e7c39c200e2b..38258542a6c3 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -128,7 +128,7 @@ static inline void rcu_nocb_flush_deferred_wakeup(void) { }
  * @a: Code that RCU needs to pay attention to.
  *
  * RCU read-side critical sections are forbidden in the inner idle loop,
- * that is, between the rcu_idle_enter() and the rcu_idle_exit() -- RCU
+ * that is, between the ct_idle_enter() and the ct_idle_exit() -- RCU
  * will happily ignore any such read-side critical sections.  However,
  * things like powertop need tracepoints in the inner idle loop.
  *
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 42054841af3f..3d479f363275 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -235,3 +235,15 @@ void __init context_tracking_init(void)
 #endif
 
 #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
+
+void ct_idle_enter(void)
+{
+	rcu_idle_enter();
+}
+EXPORT_SYMBOL_GPL(ct_idle_enter);
+
+void ct_idle_exit(void)
+{
+	rcu_idle_exit();
+}
+EXPORT_SYMBOL_GPL(ct_idle_exit);
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 4a882f83aeb9..c34b465ee9f5 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6539,7 +6539,7 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 
 	/*
 	 * If a CPU is in the RCU-free window in idle (ie: in the section
-	 * between rcu_idle_enter() and rcu_idle_exit(), then RCU
+	 * between ct_idle_enter() and ct_idle_exit(), then RCU
 	 * considers that CPU to be in an "extended quiescent state",
 	 * which means that RCU will be completely ignoring that CPU.
 	 * Therefore, rcu_read_lock() and friends have absolutely no
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index bf8e341e75b4..13a49b6cbe37 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -8,6 +8,8 @@ menu "RCU Subsystem"
 config TREE_RCU
 	bool
 	default y if SMP
+	# Dynticks-idle tracking
+	select CONTEXT_TRACKING
 	help
 	  This option selects the RCU implementation that is
 	  designed for very large SMP system with hundreds or
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9f36bd82ffd1..57110c583767 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -664,7 +664,6 @@ void rcu_idle_enter(void)
 	lockdep_assert_irqs_disabled();
 	rcu_eqs_enter(false);
 }
-EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
 
@@ -904,7 +903,6 @@ void rcu_idle_exit(void)
 	rcu_eqs_exit(false);
 	local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(rcu_idle_exit);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index fc7fef575606..147214b2cd68 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -85,7 +85,7 @@ module_param(rcu_normal_after_boot, int, 0444);
  * and while lockdep is disabled.
  *
  * Note that if the CPU is in the idle loop from an RCU point of view (ie:
- * that we are in the section between rcu_idle_enter() and rcu_idle_exit())
+ * that we are in the section between ct_idle_enter() and ct_idle_exit())
  * then rcu_read_lock_held() sets ``*ret`` to false even if the CPU did an
  * rcu_read_lock().  The reason for this is that RCU ignores CPUs that are
  * in such a section, considering these as in extended quiescent state,
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index d17b0a5ce6ac..421341d6a74b 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -7,6 +7,7 @@
  *        tasks which are handled in sched/fair.c )
  */
 #include "sched.h"
+#include <linux/context_tracking.h>
 
 #include <trace/events/power.h>
 
@@ -56,14 +57,14 @@ static noinline int __cpuidle cpu_idle_poll(void)
 {
 	trace_cpu_idle(0, smp_processor_id());
 	stop_critical_timings();
-	rcu_idle_enter();
+	ct_idle_enter();
 	local_irq_enable();
 
 	while (!tif_need_resched() &&
 	       (cpu_idle_force_poll || tick_check_broadcast_expired()))
 		cpu_relax();
 
-	rcu_idle_exit();
+	ct_idle_exit();
 	start_critical_timings();
 	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
 
@@ -101,12 +102,12 @@ void __cpuidle default_idle_call(void)
 		 *
 		 * Trace IRQs enable here, then switch off RCU, and have
 		 * arch_cpu_idle() use raw_local_irq_enable(). Note that
-		 * rcu_idle_enter() relies on lockdep IRQ state, so switch that
+		 * ct_idle_enter() relies on lockdep IRQ state, so switch that
 		 * last -- this is very similar to the entry code.
 		 */
 		trace_hardirqs_on_prepare();
 		lockdep_hardirqs_on_prepare(_THIS_IP_);
-		rcu_idle_enter();
+		ct_idle_enter();
 		lockdep_hardirqs_on(_THIS_IP_);
 
 		arch_cpu_idle();
@@ -119,7 +120,7 @@ void __cpuidle default_idle_call(void)
 		 */
 		raw_local_irq_disable();
 		lockdep_hardirqs_off(_THIS_IP_);
-		rcu_idle_exit();
+		ct_idle_exit();
 		lockdep_hardirqs_on(_THIS_IP_);
 		raw_local_irq_enable();
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 07/19] context_tracking: Take IRQ eqs entrypoints over RCU
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2022-03-02 15:47 ` [PATCH 06/19] context_tracking: Take idle eqs entrypoints over RCU Frederic Weisbecker
@ 2022-03-02 15:47 ` Frederic Weisbecker
  2022-03-10 19:46   ` Paul E. McKenney
  2022-03-02 15:47 ` [PATCH 08/19] context_tracking: Take NMI " Frederic Weisbecker
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

The RCU dynticks counter is going to be merged into the context tracking
subsystem. Prepare for that by moving the IRQ extended quiescent state
entrypoints to context tracking. For now these are dumb redirections to
the existing RCU calls.
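
For illustration, an interrupt path then brackets its work as in the
sketch below. This is only a sketch with a made-up function name; the
real callers are irq_enter()/irq_exit() and the generic entry code, as
converted in the hunks that follow:

	/* Illustrative only: the caller must have interrupts disabled. */
	static void example_handle_irq(void)
	{
		ct_irq_enter();	/* exit the RCU extended quiescent state */
		/* ... dispatch handlers; read-side critical sections are safe ... */
		ct_irq_exit();	/* possibly re-enter the extended quiescent state */
	}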

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 .../RCU/Design/Requirements/Requirements.rst  | 10 ++++----
 Documentation/RCU/stallwarn.rst               |  4 ++--
 arch/Kconfig                                  |  2 +-
 arch/arm64/kernel/entry-common.c              |  6 ++---
 arch/x86/mm/fault.c                           |  2 +-
 drivers/cpuidle/cpuidle-psci.c                |  8 +++----
 include/linux/context_tracking_irq.h          | 17 +++++++++++++
 include/linux/context_tracking_state.h        |  1 +
 include/linux/entry-common.h                  | 10 ++++----
 include/linux/rcupdate.h                      |  5 ++--
 include/linux/tracepoint.h                    |  4 ++--
 kernel/context_tracking.c                     | 24 +++++++++++++++++--
 kernel/cpu_pm.c                               |  8 +++----
 kernel/entry/common.c                         | 12 +++++-----
 kernel/softirq.c                              |  4 ++--
 kernel/trace/trace.c                          |  6 ++---
 16 files changed, 81 insertions(+), 42 deletions(-)
 create mode 100644 include/linux/context_tracking_irq.h

diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index ff2be1ac54c4..e3dd5d71c798 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -1844,10 +1844,10 @@ that meets this requirement.
 
 Furthermore, NMI handlers can be interrupted by what appear to RCU to be
 normal interrupts. One way that this can happen is for code that
-directly invokes rcu_irq_enter() and rcu_irq_exit() to be called
+directly invokes ct_irq_enter() and ct_irq_exit() to be called
 from an NMI handler. This astonishing fact of life prompted the current
-code structure, which has rcu_irq_enter() invoking
-rcu_nmi_enter() and rcu_irq_exit() invoking rcu_nmi_exit().
+code structure, which has ct_irq_enter() invoking
+rcu_nmi_enter() and ct_irq_exit() invoking rcu_nmi_exit().
 And yes, I also learned of this requirement the hard way.
 
 Loadable Modules
@@ -2195,7 +2195,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
    sections, and RCU believes this CPU to be idle, no problem. This
    sort of thing is used by some architectures for light-weight
    exception handlers, which can then avoid the overhead of
-   rcu_irq_enter() and rcu_irq_exit() at exception entry and
+   ct_irq_enter() and ct_irq_exit() at exception entry and
    exit, respectively. Some go further and avoid the entireties of
    irq_enter() and irq_exit().
    Just make very sure you are running some of your tests with
@@ -2226,7 +2226,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
 +-----------------------------------------------------------------------+
 | **Answer**:                                                           |
 +-----------------------------------------------------------------------+
-| One approach is to do ``rcu_irq_exit();rcu_irq_enter();`` every so    |
+| One approach is to do ``ct_irq_exit();ct_irq_enter();`` every so      |
 | often. But given that long-running interrupt handlers can cause other |
 | problems, not least for response time, shouldn't you work to keep     |
 | your interrupt handler's runtime within reasonable bounds?            |
diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index bdd52b40f307..7858c3afa1f4 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -98,11 +98,11 @@ warnings:
 
 -	A low-level kernel issue that either fails to invoke one of the
 	variants of rcu_user_enter(), rcu_user_exit(), ct_idle_enter(),
-	ct_idle_exit(), rcu_irq_enter(), or rcu_irq_exit() on the one
+	ct_idle_exit(), ct_irq_enter(), or ct_irq_exit() on the one
 	hand, or that invokes one of them too many times on the other.
 	Historically, the most frequent issue has been an omission
 	of either irq_enter() or irq_exit(), which in turn invoke
-	rcu_irq_enter() or rcu_irq_exit(), respectively.  Building your
+	ct_irq_enter() or ct_irq_exit(), respectively.  Building your
 	kernel with CONFIG_RCU_EQS_DEBUG=y can help track down these types
 	of issues, which sometimes arise in architecture-specific code.
 
diff --git a/arch/Kconfig b/arch/Kconfig
index 1a3b79cfc9e3..66b2b6d4717b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -770,7 +770,7 @@ config HAVE_CONTEXT_TRACKING_USER
 	  Syscalls need to be wrapped inside user_exit()-user_enter(), either
 	  optimized behind static key or through the slow path using TIF_NOHZ
 	  flag. Exceptions handlers must be wrapped as well. Irqs are already
-	  protected inside rcu_irq_enter/rcu_irq_exit() but preemption or signal
+	  protected inside ct_irq_enter/ct_irq_exit() but preemption or signal
 	  handling on irq exit still need to be protected.
 
 config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index ef7fcefb96bd..43ca8cf4e1dd 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -40,7 +40,7 @@ static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
 
 	if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
 		lockdep_hardirqs_off(CALLER_ADDR0);
-		rcu_irq_enter();
+		ct_irq_enter();
 		trace_hardirqs_off_finish();
 
 		regs->exit_rcu = true;
@@ -74,7 +74,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
 		if (regs->exit_rcu) {
 			trace_hardirqs_on_prepare();
 			lockdep_hardirqs_on_prepare(CALLER_ADDR0);
-			rcu_irq_exit();
+			ct_irq_exit();
 			lockdep_hardirqs_on(CALLER_ADDR0);
 			return;
 		}
@@ -82,7 +82,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
 		trace_hardirqs_on();
 	} else {
 		if (regs->exit_rcu)
-			rcu_irq_exit();
+			ct_irq_exit();
 	}
 }
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index d0074c6ed31a..b781785b1ff3 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1526,7 +1526,7 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
 
 	/*
 	 * Entry handling for valid #PF from kernel mode is slightly
-	 * different: RCU is already watching and rcu_irq_enter() must not
+	 * different: RCU is already watching and ct_irq_enter() must not
 	 * be invoked because a kernel fault on a user space address might
 	 * sleep.
 	 *
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index b51b5df08450..fe31b2d522b3 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -68,12 +68,12 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
 		return -1;
 
 	/* Do runtime PM to manage a hierarchical CPU topology. */
-	rcu_irq_enter_irqson();
+	ct_irq_enter_irqson();
 	if (s2idle)
 		dev_pm_genpd_suspend(pd_dev);
 	else
 		pm_runtime_put_sync_suspend(pd_dev);
-	rcu_irq_exit_irqson();
+	ct_irq_exit_irqson();
 
 	state = psci_get_domain_state();
 	if (!state)
@@ -81,12 +81,12 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
 
 	ret = psci_cpu_suspend_enter(state) ? -1 : idx;
 
-	rcu_irq_enter_irqson();
+	ct_irq_enter_irqson();
 	if (s2idle)
 		dev_pm_genpd_resume(pd_dev);
 	else
 		pm_runtime_get_sync(pd_dev);
-	rcu_irq_exit_irqson();
+	ct_irq_exit_irqson();
 
 	cpu_pm_exit();
 
diff --git a/include/linux/context_tracking_irq.h b/include/linux/context_tracking_irq.h
new file mode 100644
index 000000000000..60e3ed15a04e
--- /dev/null
+++ b/include/linux/context_tracking_irq.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CONTEXT_TRACKING_IRQ_H
+#define _LINUX_CONTEXT_TRACKING_IRQ_H
+
+#ifdef CONFIG_CONTEXT_TRACKING
+void ct_irq_enter(void);
+void ct_irq_exit(void);
+void ct_irq_enter_irqson(void);
+void ct_irq_exit_irqson(void);
+#else
+static inline void ct_irq_enter(void) { }
+static inline void ct_irq_exit(void) { }
+static inline void ct_irq_enter_irqson(void) { }
+static inline void ct_irq_exit_irqson(void) { }
+#endif
+
+#endif
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 64dbbb880378..cdc692caa01d 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -4,6 +4,7 @@
 
 #include <linux/percpu.h>
 #include <linux/static_key.h>
+#include <linux/context_tracking_irq.h>
 
 struct context_tracking {
 	/*
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 2e2b8d6140ed..7c6b1d864448 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -396,7 +396,7 @@ void irqentry_exit_to_user_mode(struct pt_regs *regs);
 /**
  * struct irqentry_state - Opaque object for exception state storage
  * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the
- *            exit path has to invoke rcu_irq_exit().
+ *            exit path has to invoke ct_irq_exit().
  * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that
  *           lockdep state is restored correctly on exit from nmi.
  *
@@ -434,12 +434,12 @@ typedef struct irqentry_state {
  *
  * For kernel mode entries RCU handling is done conditional. If RCU is
  * watching then the only RCU requirement is to check whether the tick has
- * to be restarted. If RCU is not watching then rcu_irq_enter() has to be
- * invoked on entry and rcu_irq_exit() on exit.
+ * to be restarted. If RCU is not watching then ct_irq_enter() has to be
+ * invoked on entry and ct_irq_exit() on exit.
  *
- * Avoiding the rcu_irq_enter/exit() calls is an optimization but also
+ * Avoiding the ct_irq_enter/exit() calls is an optimization but also
  * solves the problem of kernel mode pagefaults which can schedule, which
- * is not possible after invoking rcu_irq_enter() without undoing it.
+ * is not possible after invoking ct_irq_enter() without undoing it.
  *
  * For user mode entries irqentry_enter_from_user_mode() is invoked to
  * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 38258542a6c3..5efba2bfa689 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -29,6 +29,7 @@
 #include <linux/lockdep.h>
 #include <asm/processor.h>
 #include <linux/cpumask.h>
+#include <linux/context_tracking_irq.h>
 
 #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
 #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
@@ -143,9 +144,9 @@ static inline void rcu_nocb_flush_deferred_wakeup(void) { }
  */
 #define RCU_NONIDLE(a) \
 	do { \
-		rcu_irq_enter_irqson(); \
+		ct_irq_enter_irqson(); \
 		do { a; } while (0); \
-		rcu_irq_exit_irqson(); \
+		ct_irq_exit_irqson(); \
 	} while (0)
 
 /*
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 28031b15f878..55717a2eda08 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -200,13 +200,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		 */							\
 		if (rcuidle) {						\
 			__idx = srcu_read_lock_notrace(&tracepoint_srcu);\
-			rcu_irq_enter_irqson();				\
+			ct_irq_enter_irqson();				\
 		}							\
 									\
 		__DO_TRACE_CALL(name, TP_ARGS(args));			\
 									\
 		if (rcuidle) {						\
-			rcu_irq_exit_irqson();				\
+			ct_irq_exit_irqson();				\
 			srcu_read_unlock_notrace(&tracepoint_srcu, __idx);\
 		}							\
 									\
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 3d479f363275..b63ff851472e 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -75,7 +75,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 * At this stage, only low level arch entry code remains and
 			 * then we'll run in userspace. We can assume there won't be
 			 * any RCU read-side critical section until the next call to
-			 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
+			 * user_exit() or ct_irq_enter(). Let's remove RCU's dependency
 			 * on the tick.
 			 */
 			if (state == CONTEXT_USER) {
@@ -112,7 +112,7 @@ void ct_user_enter(enum ctx_state state)
 	/*
 	 * Some contexts may involve an exception occurring in an irq,
 	 * leading to that nesting:
-	 * rcu_irq_enter() rcu_user_exit() rcu_user_exit() rcu_irq_exit()
+	 * ct_irq_enter() rcu_user_exit() rcu_user_exit() ct_irq_exit()
 	 * This would mess up the dyntick_nesting count though. And rcu_irq_*()
 	 * helpers are enough to protect RCU uses inside the exception. So
 	 * just return immediately if we detect we are in an IRQ.
@@ -247,3 +247,23 @@ void ct_idle_exit(void)
 	rcu_idle_exit();
 }
 EXPORT_SYMBOL_GPL(ct_idle_exit);
+
+noinstr void ct_irq_enter(void)
+{
+	rcu_irq_enter();
+}
+
+noinstr void ct_irq_exit(void)
+{
+	rcu_irq_exit();
+}
+
+void ct_irq_enter_irqson(void)
+{
+	rcu_irq_enter_irqson();
+}
+
+void ct_irq_exit_irqson(void)
+{
+	rcu_irq_exit_irqson();
+}
diff --git a/kernel/cpu_pm.c b/kernel/cpu_pm.c
index 246efc74e3f3..ba4ba71facf9 100644
--- a/kernel/cpu_pm.c
+++ b/kernel/cpu_pm.c
@@ -35,11 +35,11 @@ static int cpu_pm_notify(enum cpu_pm_event event)
 	 * dysfunctional in cpu idle. Copy RCU_NONIDLE code to let RCU know
 	 * this.
 	 */
-	rcu_irq_enter_irqson();
+	ct_irq_enter_irqson();
 	rcu_read_lock();
 	ret = raw_notifier_call_chain(&cpu_pm_notifier.chain, event, NULL);
 	rcu_read_unlock();
-	rcu_irq_exit_irqson();
+	ct_irq_exit_irqson();
 
 	return notifier_to_errno(ret);
 }
@@ -49,11 +49,11 @@ static int cpu_pm_notify_robust(enum cpu_pm_event event_up, enum cpu_pm_event ev
 	unsigned long flags;
 	int ret;
 
-	rcu_irq_enter_irqson();
+	ct_irq_enter_irqson();
 	raw_spin_lock_irqsave(&cpu_pm_notifier.lock, flags);
 	ret = raw_notifier_call_chain_robust(&cpu_pm_notifier.chain, event_up, event_down, NULL);
 	raw_spin_unlock_irqrestore(&cpu_pm_notifier.lock, flags);
-	rcu_irq_exit_irqson();
+	ct_irq_exit_irqson();
 
 	return notifier_to_errno(ret);
 }
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index bad713684c2e..cebc98b8adc6 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -327,7 +327,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 	}
 
 	/*
-	 * If this entry hit the idle task invoke rcu_irq_enter() whether
+	 * If this entry hit the idle task invoke ct_irq_enter() whether
 	 * RCU is watching or not.
 	 *
 	 * Interrupts can nest when the first interrupt invokes softirq
@@ -338,12 +338,12 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 	 * not nested into another interrupt.
 	 *
 	 * Checking for rcu_is_watching() here would prevent the nesting
-	 * interrupt to invoke rcu_irq_enter(). If that nested interrupt is
+	 * interrupt to invoke ct_irq_enter(). If that nested interrupt is
 	 * the tick then rcu_flavor_sched_clock_irq() would wrongfully
 	 * assume that it is the first interrupt and eventually claim
 	 * quiescent state and end grace periods prematurely.
 	 *
-	 * Unconditionally invoke rcu_irq_enter() so RCU state stays
+	 * Unconditionally invoke ct_irq_enter() so RCU state stays
 	 * consistent.
 	 *
 	 * TINY_RCU does not support EQS, so let the compiler eliminate
@@ -356,7 +356,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 		 * as in irqentry_enter_from_user_mode().
 		 */
 		lockdep_hardirqs_off(CALLER_ADDR0);
-		rcu_irq_enter();
+		ct_irq_enter();
 		instrumentation_begin();
 		trace_hardirqs_off_finish();
 		instrumentation_end();
@@ -414,7 +414,7 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
 			trace_hardirqs_on_prepare();
 			lockdep_hardirqs_on_prepare(CALLER_ADDR0);
 			instrumentation_end();
-			rcu_irq_exit();
+			ct_irq_exit();
 			lockdep_hardirqs_on(CALLER_ADDR0);
 			return;
 		}
@@ -436,7 +436,7 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
 		 * was not watching on entry.
 		 */
 		if (state.exit_rcu)
-			rcu_irq_exit();
+			ct_irq_exit();
 	}
 }
 
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 41f470929e99..7b6761c1a0f3 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -607,7 +607,7 @@ void irq_enter_rcu(void)
  */
 void irq_enter(void)
 {
-	rcu_irq_enter();
+	ct_irq_enter();
 	irq_enter_rcu();
 }
 
@@ -659,7 +659,7 @@ void irq_exit_rcu(void)
 void irq_exit(void)
 {
 	__irq_exit_rcu();
-	rcu_irq_exit();
+	ct_irq_exit();
 	 /* must be last! */
 	lockdep_hardirq_exit();
 }
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a569a0cb81ee..7c500c708180 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3088,15 +3088,15 @@ void __trace_stack(struct trace_array *tr, unsigned int trace_ctx,
 	/*
 	 * When an NMI triggers, RCU is enabled via rcu_nmi_enter(),
 	 * but if the above rcu_is_watching() failed, then the NMI
-	 * triggered someplace critical, and rcu_irq_enter() should
+	 * triggered someplace critical, and ct_irq_enter() should
 	 * not be called from NMI.
 	 */
 	if (unlikely(in_nmi()))
 		return;
 
-	rcu_irq_enter_irqson();
+	ct_irq_enter_irqson();
 	__ftrace_trace_stack(buffer, trace_ctx, skip, NULL);
-	rcu_irq_exit_irqson();
+	ct_irq_exit_irqson();
 }
 
 /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 08/19] context_tracking: Take NMI eqs entrypoints over RCU
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2022-03-02 15:47 ` [PATCH 07/19] context_tracking: Take IRQ " Frederic Weisbecker
@ 2022-03-02 15:47 ` Frederic Weisbecker
  2022-03-10 19:47   ` Paul E. McKenney
  2022-03-02 15:48 ` [PATCH 09/19] rcu/context-tracking: Remove rcu_irq_enter/exit() Frederic Weisbecker
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:47 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

The RCU dynticks counter is going to be merged into the context tracking
subsystem. Prepare for that by moving the NMI extended quiescent state
entrypoints to context tracking. For now these are dumb redirections to
the existing RCU calls.
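
For illustration, an NMI handler then nests as in the sketch below.
Again a made-up function name; the real users are the nmi_enter() and
nmi_exit() macros converted in the hardirq.h hunk:

	/* Illustrative only: ct_nmi_enter() must run before any RCU use. */
	static void example_nmi_handler(void)
	{
		ct_nmi_enter();	/* make RCU watch this CPU again */
		/* ... NMI work; rcu_read_lock()/rcu_read_unlock() are safe ... */
		ct_nmi_exit();	/* restore the interrupted RCU state */
	}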

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 Documentation/RCU/Design/Requirements/Requirements.rst |  2 +-
 arch/Kconfig                                           |  2 +-
 arch/arm64/kernel/entry-common.c                       |  8 ++++----
 include/linux/context_tracking_irq.h                   |  4 ++++
 include/linux/hardirq.h                                |  4 ++--
 kernel/context_tracking.c                              | 10 ++++++++++
 kernel/entry/common.c                                  |  4 ++--
 kernel/extable.c                                       |  4 ++--
 kernel/trace/trace.c                                   |  2 +-
 9 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index e3dd5d71c798..256cf260e864 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -1847,7 +1847,7 @@ normal interrupts. One way that this can happen is for code that
 directly invokes ct_irq_enter() and ct_irq_exit() to be called
 from an NMI handler. This astonishing fact of life prompted the current
 code structure, which has ct_irq_enter() invoking
-rcu_nmi_enter() and ct_irq_exit() invoking rcu_nmi_exit().
+ct_nmi_enter() and ct_irq_exit() invoking ct_nmi_exit().
 And yes, I also learned of this requirement the hard way.
 
 Loadable Modules
diff --git a/arch/Kconfig b/arch/Kconfig
index 66b2b6d4717b..c22b8ca0eb01 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -785,7 +785,7 @@ config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
 
 	  - Critical entry code isn't preemptible (or better yet:
 	    not interruptible).
-	  - No use of RCU read side critical sections, unless rcu_nmi_enter()
+	  - No use of RCU read side critical sections, unless ct_nmi_enter()
 	    got called.
 	  - No use of instrumentation, unless instrumentation_begin() got
 	    called.
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 43ca8cf4e1dd..6a1ea28731c8 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -158,7 +158,7 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
 	__nmi_enter();
 	lockdep_hardirqs_off(CALLER_ADDR0);
 	lockdep_hardirq_enter();
-	rcu_nmi_enter();
+	ct_nmi_enter();
 
 	trace_hardirqs_off_finish();
 	ftrace_nmi_enter();
@@ -179,7 +179,7 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs)
 		lockdep_hardirqs_on_prepare(CALLER_ADDR0);
 	}
 
-	rcu_nmi_exit();
+	ct_nmi_exit();
 	lockdep_hardirq_exit();
 	if (restore)
 		lockdep_hardirqs_on(CALLER_ADDR0);
@@ -196,7 +196,7 @@ static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
 	regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
 
 	lockdep_hardirqs_off(CALLER_ADDR0);
-	rcu_nmi_enter();
+	ct_nmi_enter();
 
 	trace_hardirqs_off_finish();
 }
@@ -215,7 +215,7 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
 		lockdep_hardirqs_on_prepare(CALLER_ADDR0);
 	}
 
-	rcu_nmi_exit();
+	ct_nmi_exit();
 	if (restore)
 		lockdep_hardirqs_on(CALLER_ADDR0);
 }
diff --git a/include/linux/context_tracking_irq.h b/include/linux/context_tracking_irq.h
index 60e3ed15a04e..11043bf724b7 100644
--- a/include/linux/context_tracking_irq.h
+++ b/include/linux/context_tracking_irq.h
@@ -7,11 +7,15 @@ void ct_irq_enter(void);
 void ct_irq_exit(void);
 void ct_irq_enter_irqson(void);
 void ct_irq_exit_irqson(void);
+void ct_nmi_enter(void);
+void ct_nmi_exit(void);
 #else
 static inline void ct_irq_enter(void) { }
 static inline void ct_irq_exit(void) { }
 static inline void ct_irq_enter_irqson(void) { }
 static inline void ct_irq_exit_irqson(void) { }
+static inline void ct_nmi_enter(void) { }
+static inline void ct_nmi_exit(void) { }
 #endif
 
 #endif
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 76878b357ffa..345cdbe9c1b7 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -124,7 +124,7 @@ extern void rcu_nmi_exit(void);
 	do {							\
 		__nmi_enter();					\
 		lockdep_hardirq_enter();			\
-		rcu_nmi_enter();				\
+		ct_nmi_enter();				\
 		instrumentation_begin();			\
 		ftrace_nmi_enter();				\
 		instrumentation_end();				\
@@ -143,7 +143,7 @@ extern void rcu_nmi_exit(void);
 		instrumentation_begin();			\
 		ftrace_nmi_exit();				\
 		instrumentation_end();				\
-		rcu_nmi_exit();					\
+		ct_nmi_exit();					\
 		lockdep_hardirq_exit();				\
 		__nmi_exit();					\
 	} while (0)
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index b63ff851472e..1686cd528966 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -267,3 +267,13 @@ void ct_irq_exit_irqson(void)
 {
 	rcu_irq_exit_irqson();
 }
+
+noinstr void ct_nmi_enter(void)
+{
+	rcu_nmi_enter();
+}
+
+noinstr void ct_nmi_exit(void)
+{
+	rcu_nmi_exit();
+}
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index cebc98b8adc6..08230507793f 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -449,7 +449,7 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
 	__nmi_enter();
 	lockdep_hardirqs_off(CALLER_ADDR0);
 	lockdep_hardirq_enter();
-	rcu_nmi_enter();
+	ct_nmi_enter();
 
 	instrumentation_begin();
 	trace_hardirqs_off_finish();
@@ -469,7 +469,7 @@ void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state)
 	}
 	instrumentation_end();
 
-	rcu_nmi_exit();
+	ct_nmi_exit();
 	lockdep_hardirq_exit();
 	if (irq_state.lockdep)
 		lockdep_hardirqs_on(CALLER_ADDR0);
diff --git a/kernel/extable.c b/kernel/extable.c
index b6f330f0fe74..88d4d739c5a1 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -113,7 +113,7 @@ int kernel_text_address(unsigned long addr)
 
 	/* Treat this like an NMI as it can happen anywhere */
 	if (no_rcu)
-		rcu_nmi_enter();
+		ct_nmi_enter();
 
 	if (is_module_text_address(addr))
 		goto out;
@@ -126,7 +126,7 @@ int kernel_text_address(unsigned long addr)
 	ret = 0;
 out:
 	if (no_rcu)
-		rcu_nmi_exit();
+		ct_nmi_exit();
 
 	return ret;
 }
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 7c500c708180..9434da82af8a 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3086,7 +3086,7 @@ void __trace_stack(struct trace_array *tr, unsigned int trace_ctx,
 	}
 
 	/*
-	 * When an NMI triggers, RCU is enabled via rcu_nmi_enter(),
+	 * When an NMI triggers, RCU is enabled via ct_nmi_enter(),
 	 * but if the above rcu_is_watching() failed, then the NMI
 	 * triggered someplace critical, and ct_irq_enter() should
 	 * not be called from NMI.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 09/19] rcu/context-tracking: Remove rcu_irq_enter/exit()
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2022-03-02 15:47 ` [PATCH 08/19] context_tracking: Take NMI " Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-05 14:16   ` Peter Zijlstra
  2022-03-02 15:48 ` [PATCH 10/19] rcu/context_tracking: Move dynticks counter to context tracking Frederic Weisbecker
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Now rcu_irq_enter/exit() is an unnecessary middle call between
ct_irq_enter/exit() and rcu_nmi_enter/exit(). Take this opportunity
to remove the former functions and move the comments above them to the
new entrypoints.
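
The resulting call chains, sketched:

	Before:	ct_irq_enter() -> rcu_irq_enter() -> rcu_nmi_enter()
	After:	ct_irq_enter() -> ct_nmi_enter()  -> rcu_nmi_enter()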

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/rcutiny.h   |  4 --
 include/linux/rcutree.h   |  4 --
 kernel/context_tracking.c | 59 ++++++++++++++++++++++++++--
 kernel/rcu/tree.c         | 83 ---------------------------------------
 4 files changed, 55 insertions(+), 95 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 5ebac609c984..9e07f8d9d544 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -98,10 +98,6 @@ static inline void rcu_cpu_stall_reset(void) { }
 static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle_exit(void) { }
-static inline void rcu_irq_enter(void) { }
-static inline void rcu_irq_exit_irqson(void) { }
-static inline void rcu_irq_enter_irqson(void) { }
-static inline void rcu_irq_exit(void) { }
 static inline void rcu_irq_exit_check_preempt(void) { }
 #define rcu_is_idle_cpu(cpu) \
 	(is_idle_task(current) && !in_nmi() && !in_hardirq() && !in_serving_softirq())
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 127bcfb6f7e9..e05334c4c3d1 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -51,10 +51,6 @@ void cond_synchronize_rcu(unsigned long oldstate);
 
 void rcu_idle_enter(void);
 void rcu_idle_exit(void);
-void rcu_irq_enter(void);
-void rcu_irq_exit(void);
-void rcu_irq_enter_irqson(void);
-void rcu_irq_exit_irqson(void);
 bool rcu_is_idle_cpu(int cpu);
 
 #ifdef CONFIG_PROVE_RCU
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 1686cd528966..ea22eb04750f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -248,24 +248,75 @@ void ct_idle_exit(void)
 }
 EXPORT_SYMBOL_GPL(ct_idle_exit);
 
+/**
+ * ct_irq_enter - inform RCU that current CPU is entering irq away from idle
+ *
+ * Enter an interrupt handler, which might possibly result in exiting
+ * idle mode, in other words, entering the mode in which read-side critical
+ * sections can occur.  The caller must have disabled interrupts.
+ *
+ * Note that the Linux kernel is fully capable of entering an interrupt
+ * handler that it never exits, for example when doing upcalls to user mode!
+ * This code assumes that the idle loop never does upcalls to user mode.
+ * If your architecture's idle loop does do upcalls to user mode (or does
+ * anything else that results in unbalanced calls to the irq_enter() and
+ * irq_exit() functions), RCU will give you what you deserve, good and hard.
+ * But very infrequently and irreproducibly.
+ *
+ * Use things like work queues to work around this limitation.
+ *
+ * You have been warned.
+ *
+ * If you add or remove a call to ct_irq_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
 noinstr void ct_irq_enter(void)
 {
-	rcu_irq_enter();
+	lockdep_assert_irqs_disabled();
+	ct_nmi_enter();
 }
 
+/**
+ * ct_irq_exit - inform RCU that current CPU is exiting irq towards idle
+ *
+ * Exit from an interrupt handler, which might possibly result in entering
+ * idle mode, in other words, leaving the mode in which read-side critical
+ * sections can occur.  The caller must have disabled interrupts.
+ *
+ * This code assumes that the idle loop never does anything that might
+ * result in unbalanced calls to irq_enter() and irq_exit().  If your
+ * architecture's idle loop violates this assumption, RCU will give you what
+ * you deserve, good and hard.  But very infrequently and irreproducibly.
+ *
+ * Use things like work queues to work around this limitation.
+ *
+ * You have been warned.
+ *
+ * If you add or remove a call to ct_irq_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
 noinstr void ct_irq_exit(void)
 {
-	rcu_irq_exit();
+	lockdep_assert_irqs_disabled();
+	ct_nmi_exit();
 }
 
 void ct_irq_enter_irqson(void)
 {
-	rcu_irq_enter_irqson();
+	unsigned long flags;
+
+	local_irq_save(flags);
+	ct_irq_enter();
+	local_irq_restore(flags);
 }
 
 void ct_irq_exit_irqson(void)
 {
-	rcu_irq_exit_irqson();
+	unsigned long flags;
+
+	local_irq_save(flags);
+	ct_irq_exit();
+	local_irq_restore(flags);
 }
 
 noinstr void ct_nmi_enter(void)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 57110c583767..cadf5f5a4700 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -789,31 +789,6 @@ noinstr void rcu_nmi_exit(void)
 		rcu_dynticks_task_enter();
 }
 
-/**
- * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
- *
- * Exit from an interrupt handler, which might possibly result in entering
- * idle mode, in other words, leaving the mode in which read-side critical
- * sections can occur.  The caller must have disabled interrupts.
- *
- * This code assumes that the idle loop never does anything that might
- * result in unbalanced calls to irq_enter() and irq_exit().  If your
- * architecture's idle loop violates this assumption, RCU will give you what
- * you deserve, good and hard.  But very infrequently and irreproducibly.
- *
- * Use things like work queues to work around this limitation.
- *
- * You have been warned.
- *
- * If you add or remove a call to rcu_irq_exit(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void noinstr rcu_irq_exit(void)
-{
-	lockdep_assert_irqs_disabled();
-	rcu_nmi_exit();
-}
-
 #ifdef CONFIG_PROVE_RCU
 /**
  * rcu_irq_exit_check_preempt - Validate that scheduling is possible
@@ -832,21 +807,6 @@ void rcu_irq_exit_check_preempt(void)
 }
 #endif /* #ifdef CONFIG_PROVE_RCU */
 
-/*
- * Wrapper for rcu_irq_exit() where interrupts are enabled.
- *
- * If you add or remove a call to rcu_irq_exit_irqson(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_irq_exit_irqson(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	rcu_irq_exit();
-	local_irq_restore(flags);
-}
-
 /*
  * Exit an RCU extended quiescent state, which can be either the
  * idle loop or adaptive-tickless usermode execution.
@@ -1041,49 +1001,6 @@ noinstr void rcu_nmi_enter(void)
 	barrier();
 }
 
-/**
- * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
- *
- * Enter an interrupt handler, which might possibly result in exiting
- * idle mode, in other words, entering the mode in which read-side critical
- * sections can occur.  The caller must have disabled interrupts.
- *
- * Note that the Linux kernel is fully capable of entering an interrupt
- * handler that it never exits, for example when doing upcalls to user mode!
- * This code assumes that the idle loop never does upcalls to user mode.
- * If your architecture's idle loop does do upcalls to user mode (or does
- * anything else that results in unbalanced calls to the irq_enter() and
- * irq_exit() functions), RCU will give you what you deserve, good and hard.
- * But very infrequently and irreproducibly.
- *
- * Use things like work queues to work around this limitation.
- *
- * You have been warned.
- *
- * If you add or remove a call to rcu_irq_enter(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-noinstr void rcu_irq_enter(void)
-{
-	lockdep_assert_irqs_disabled();
-	rcu_nmi_enter();
-}
-
-/*
- * Wrapper for rcu_irq_enter() where interrupts are enabled.
- *
- * If you add or remove a call to rcu_irq_enter_irqson(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_irq_enter_irqson(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	rcu_irq_enter();
-	local_irq_restore(flags);
-}
-
 /*
  * Check to see if any future non-offloaded RCU-related work will need
  * to be done by the current CPU, even if none need be done immediately,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 10/19] rcu/context_tracking: Move dynticks counter to context tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 09/19] rcu/context-tracking: Remove rcu_irq_enter/exit() Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:00   ` Paul E. McKenney
  2022-03-02 15:48 ` [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting " Frederic Weisbecker
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

In order to prepare for merging the RCU dynticks counter into the
context tracking state, move rcu_data's dynticks field to the context
tracking structure. It will later be folded into the context tracking
state itself.
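
As a result, per-CPU accesses go through the context tracking structure
instead of rcu_data. Condensed from the rcu_dynticks_snap() conversion
in the hunk below:

	static int rcu_dynticks_snap(int cpu)
	{
		struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);

		smp_mb();  // Fundamental RCU ordering guarantee.
		return atomic_read_acquire(&ct->dynticks);
	}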

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking_state.h | 10 ++++-
 kernel/context_tracking.c              |  9 ++--
 kernel/rcu/tree.c                      | 59 ++++++++++++++------------
 kernel/rcu/tree.h                      |  1 -
 kernel/rcu/tree_exp.h                  |  2 +-
 kernel/rcu/tree_stall.h                |  4 +-
 6 files changed, 48 insertions(+), 37 deletions(-)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index cdc692caa01d..5ad0e481c5a3 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -7,6 +7,7 @@
 #include <linux/context_tracking_irq.h>
 
 struct context_tracking {
+#ifdef CONFIG_CONTEXT_TRACKING_USER
 	/*
 	 * When active is false, probes are unset in order
 	 * to minimize overhead: TIF flags are cleared
@@ -21,11 +22,16 @@ struct context_tracking {
 		CONTEXT_USER,
 		CONTEXT_GUEST,
 	} state;
+#endif
+	atomic_t dynticks;		/* Even value for idle, else odd. */
 };
 
-#ifdef CONFIG_CONTEXT_TRACKING_USER
-extern struct static_key_false context_tracking_key;
+#ifdef CONFIG_CONTEXT_TRACKING
 DECLARE_PER_CPU(struct context_tracking, context_tracking);
+#endif
+
+#ifdef CONFIG_CONTEXT_TRACKING_USER
+extern struct static_key_false context_tracking_key;
 
 static __always_inline bool context_tracking_enabled(void)
 {
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index ea22eb04750f..77b61a7c9890 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -30,9 +30,6 @@
 DEFINE_STATIC_KEY_FALSE(context_tracking_key);
 EXPORT_SYMBOL_GPL(context_tracking_key);
 
-DEFINE_PER_CPU(struct context_tracking, context_tracking);
-EXPORT_SYMBOL_GPL(context_tracking);
-
 static noinstr bool context_tracking_recursion_enter(void)
 {
 	int recursion;
@@ -236,6 +233,12 @@ void __init context_tracking_init(void)
 
 #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
 
+DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
+		.dynticks = ATOMIC_INIT(1),
+};
+EXPORT_SYMBOL_GPL(context_tracking);
+
+
 void ct_idle_enter(void)
 {
 	rcu_idle_enter();
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index cadf5f5a4700..96eb8503f28e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -77,7 +77,6 @@
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
 	.dynticks_nesting = 1,
 	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
-	.dynticks = ATOMIC_INIT(1),
 #ifdef CONFIG_RCU_NOCB_CPU
 	.cblist.flags = SEGCBLIST_RCU_CORE,
 #endif
@@ -268,7 +267,7 @@ void rcu_softirq_qs(void)
  */
 static noinline noinstr unsigned long rcu_dynticks_inc(int incby)
 {
-	return arch_atomic_add_return(incby, this_cpu_ptr(&rcu_data.dynticks));
+	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
 }
 
 /*
@@ -324,9 +323,9 @@ static noinstr void rcu_dynticks_eqs_exit(void)
  */
 static void rcu_dynticks_eqs_online(void)
 {
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
-	if (atomic_read(&rdp->dynticks) & 0x1)
+	if (atomic_read(&ct->dynticks) & 0x1)
 		return;
 	rcu_dynticks_inc(1);
 }
@@ -338,17 +337,19 @@ static void rcu_dynticks_eqs_online(void)
  */
 static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
 {
-	return !(arch_atomic_read(this_cpu_ptr(&rcu_data.dynticks)) & 0x1);
+	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
 }
 
 /*
  * Snapshot the ->dynticks counter with full ordering so as to allow
  * stable comparison of this counter with past and future snapshots.
  */
-static int rcu_dynticks_snap(struct rcu_data *rdp)
+static int rcu_dynticks_snap(int cpu)
 {
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+
 	smp_mb();  // Fundamental RCU ordering guarantee.
-	return atomic_read_acquire(&rdp->dynticks);
+	return atomic_read_acquire(&ct->dynticks);
 }
 
 /*
@@ -363,9 +364,7 @@ static bool rcu_dynticks_in_eqs(int snap)
 /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
 bool rcu_is_idle_cpu(int cpu)
 {
-	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-
-	return rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp));
+	return rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
 }
 
 /*
@@ -375,7 +374,7 @@ bool rcu_is_idle_cpu(int cpu)
  */
 static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
 {
-	return snap != rcu_dynticks_snap(rdp);
+	return snap != rcu_dynticks_snap(rdp->cpu);
 }
 
 /*
@@ -384,11 +383,11 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
  */
 bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
 {
-	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 	int snap;
 
 	// If not quiescent, force back to earlier extended quiescent state.
-	snap = atomic_read(&rdp->dynticks) & ~0x1;
+	snap = atomic_read(&ct->dynticks) & ~0x1;
 
 	smp_rmb(); // Order ->dynticks and *vp reads.
 	if (READ_ONCE(*vp))
@@ -396,7 +395,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
 	smp_rmb(); // Order *vp read and ->dynticks re-read.
 
 	// If still in the same extended quiescent state, we are good!
-	return snap == atomic_read(&rdp->dynticks);
+	return snap == atomic_read(&ct->dynticks);
 }
 
 /*
@@ -620,6 +619,7 @@ EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
 static noinstr void rcu_eqs_enter(bool user)
 {
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
 	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
 	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
@@ -633,12 +633,12 @@ static noinstr void rcu_eqs_enter(bool user)
 
 	lockdep_assert_irqs_disabled();
 	instrumentation_begin();
-	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
+	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&ct->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
 
 	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
-	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
+	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 
 	instrumentation_end();
 	WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
@@ -740,7 +740,7 @@ noinstr void rcu_user_enter(void)
  * rcu_nmi_exit - inform RCU of exit from NMI context
  *
  * If we are returning from the outermost NMI handler that interrupted an
- * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
+ * RCU-idle period, update ct->dynticks and rdp->dynticks_nmi_nesting
  * to let the RCU grace-period handling know that the CPU is back to
  * being RCU-idle.
  *
@@ -749,6 +749,7 @@ noinstr void rcu_user_enter(void)
  */
 noinstr void rcu_nmi_exit(void)
 {
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 
 	instrumentation_begin();
@@ -766,7 +767,7 @@ noinstr void rcu_nmi_exit(void)
 	 */
 	if (rdp->dynticks_nmi_nesting != 1) {
 		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2,
-				  atomic_read(&rdp->dynticks));
+				  atomic_read(&ct->dynticks));
 		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
 			   rdp->dynticks_nmi_nesting - 2);
 		instrumentation_end();
@@ -774,11 +775,11 @@ noinstr void rcu_nmi_exit(void)
 	}
 
 	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
-	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
+	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
 	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
 
 	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
-	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
+	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 	instrumentation_end();
 
 	// RCU is watching here ...
@@ -817,6 +818,7 @@ void rcu_irq_exit_check_preempt(void)
  */
 static void noinstr rcu_eqs_exit(bool user)
 {
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 	struct rcu_data *rdp;
 	long oldval;
 
@@ -836,9 +838,9 @@ static void noinstr rcu_eqs_exit(bool user)
 	instrumentation_begin();
 
 	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
-	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
+	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 
-	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&rdp->dynticks));
+	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&ct->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	WRITE_ONCE(rdp->dynticks_nesting, 1);
 	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
@@ -944,7 +946,7 @@ void __rcu_irq_enter_check_tick(void)
 /**
  * rcu_nmi_enter - inform RCU of entry to NMI context
  *
- * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
+ * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
  * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know
  * that the CPU is active.  This implementation permits nested NMIs, as
  * long as the nesting level does not overflow an int.  (You will probably
@@ -957,6 +959,7 @@ noinstr void rcu_nmi_enter(void)
 {
 	long incby = 2;
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
 	/* Complain about underflow. */
 	WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0);
@@ -980,9 +983,9 @@ noinstr void rcu_nmi_enter(void)
 
 		instrumentation_begin();
 		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
-		instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));
+		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
 		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
-		instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
+		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 
 		incby = 1;
 	} else if (!in_nmi()) {
@@ -994,7 +997,7 @@ noinstr void rcu_nmi_enter(void)
 
 	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
 			  rdp->dynticks_nmi_nesting,
-			  rdp->dynticks_nmi_nesting + incby, atomic_read(&rdp->dynticks));
+			  rdp->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
 	instrumentation_end();
 	WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */
 		   rdp->dynticks_nmi_nesting + incby);
@@ -1138,7 +1141,7 @@ static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp)
  */
 static int dyntick_save_progress_counter(struct rcu_data *rdp)
 {
-	rdp->dynticks_snap = rcu_dynticks_snap(rdp);
+	rdp->dynticks_snap = rcu_dynticks_snap(rdp->cpu);
 	if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
 		trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
 		rcu_gpnum_ovf(rdp->mynode, rdp);
@@ -4125,7 +4128,7 @@ rcu_boot_init_percpu_data(int cpu)
 	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
 	INIT_WORK(&rdp->strict_work, strict_work_handler);
 	WARN_ON_ONCE(rdp->dynticks_nesting != 1);
-	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp)));
+	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu)));
 	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
 	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
 	rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED;
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index b8d07bf92d29..15246a3f0734 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -188,7 +188,6 @@ struct rcu_data {
 	int dynticks_snap;		/* Per-GP tracking for dynticks. */
 	long dynticks_nesting;		/* Track process nesting level. */
 	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
-	atomic_t dynticks;		/* Even value for idle, else odd. */
 	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
 	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
 	bool rcu_forced_tick;		/* Forced tick to provide QS. */
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index d5f30085b0cf..2210110990f4 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -358,7 +358,7 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
 		    !(rnp->qsmaskinitnext & mask)) {
 			mask_ofl_test |= mask;
 		} else {
-			snap = rcu_dynticks_snap(rdp);
+			snap = rcu_dynticks_snap(cpu);
 			if (rcu_dynticks_in_eqs(snap))
 				mask_ofl_test |= mask;
 			else
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 84b812a3ab44..202129b1c7e4 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -448,7 +448,7 @@ static void print_cpu_stall_info(int cpu)
 	}
 	delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
 	falsepositive = rcu_is_gp_kthread_starving(NULL) &&
-			rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp));
+			rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
 	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
 	       cpu,
 	       "O."[!!cpu_online(cpu)],
@@ -458,7 +458,7 @@ static void print_cpu_stall_info(int cpu)
 			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
 				"!."[!delta],
 	       ticks_value, ticks_title,
-	       rcu_dynticks_snap(rdp) & 0xfff,
+	       rcu_dynticks_snap(cpu) & 0xfff,
 	       rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
 	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
 	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting to context tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 10/19] rcu/context_tracking: Move dynticks counter to context tracking Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:01   ` Paul E. McKenney
  2022-03-12 23:23   ` Peter Zijlstra
  2022-03-02 15:48 ` [PATCH 12/19] rcu/context_tracking: Move dynticks_nmi_nesting " Frederic Weisbecker
                   ` (8 subsequent siblings)
  19 siblings, 2 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

RCU extended quiescent state (eqs) tracking is going to be performed by
the context tracking subsystem. The related nesting counters thus need
to be moved to the context tracking structure.
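
Condensed from the rcu_eqs_enter() hunk below, the process-level nesting
count is now read and updated through the context tracking state:

	struct context_tracking *ct = this_cpu_ptr(&context_tracking);

	if (ct->dynticks_nesting != 1) {
		/* Nested eqs entry: RCU keeps watching, just do accounting. */
		ct->dynticks_nesting--;
		return;
	}
	...
	WRITE_ONCE(ct->dynticks_nesting, 0); /* Outermost: stop watching. */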

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking_state.h |  1 +
 kernel/context_tracking.c              |  1 +
 kernel/rcu/tree.c                      | 31 +++++++++++++-------------
 kernel/rcu/tree.h                      |  1 -
 kernel/rcu/tree_stall.h                |  3 ++-
 5 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 5ad0e481c5a3..bcb942945265 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -24,6 +24,7 @@ struct context_tracking {
 	} state;
 #endif
 	atomic_t dynticks;		/* Even value for idle, else odd. */
+	long dynticks_nesting;		/* Track process nesting level. */
 };
 
 #ifdef CONFIG_CONTEXT_TRACKING
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 77b61a7c9890..09a77884a4e3 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -234,6 +234,7 @@ void __init context_tracking_init(void)
 #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
 
 DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
+		.dynticks_nesting = 1,
 		.dynticks = ATOMIC_INIT(1),
 };
 EXPORT_SYMBOL_GPL(context_tracking);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 96eb8503f28e..8708d1a99565 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -75,7 +75,6 @@
 /* Data structures. */
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
-	.dynticks_nesting = 1,
 	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
 #ifdef CONFIG_RCU_NOCB_CPU
 	.cblist.flags = SEGCBLIST_RCU_CORE,
@@ -441,7 +440,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
 	lockdep_assert_irqs_disabled();
 
 	/* Check for counter underflows */
-	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) < 0,
+	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) < 0,
 			 "RCU dynticks_nesting counter underflow!");
 	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 0,
 			 "RCU dynticks_nmi_nesting counter underflow/zero!");
@@ -457,7 +456,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
 	WARN_ON_ONCE(!nesting && !is_idle_task(current));
 
 	/* Does CPU appear to be idle from an RCU standpoint? */
-	return __this_cpu_read(rcu_data.dynticks_nesting) == 0;
+	return __this_cpu_read(context_tracking.dynticks_nesting) == 0;
 }
 
 #define DEFAULT_RCU_BLIMIT (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) ? 1000 : 10)
@@ -624,16 +623,16 @@ static noinstr void rcu_eqs_enter(bool user)
 	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
 	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     rdp->dynticks_nesting == 0);
-	if (rdp->dynticks_nesting != 1) {
+		     ct->dynticks_nesting == 0);
+	if (ct->dynticks_nesting != 1) {
 		// RCU will still be watching, so just do accounting and leave.
-		rdp->dynticks_nesting--;
+		ct->dynticks_nesting--;
 		return;
 	}
 
 	lockdep_assert_irqs_disabled();
 	instrumentation_begin();
-	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&ct->dynticks));
+	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
 
@@ -641,7 +640,7 @@ static noinstr void rcu_eqs_enter(bool user)
 	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 
 	instrumentation_end();
-	WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
+	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
 	// RCU is watching here ...
 	rcu_dynticks_eqs_enter();
 	// ... but is no longer watching here.
@@ -798,7 +797,7 @@ void rcu_irq_exit_check_preempt(void)
 {
 	lockdep_assert_irqs_disabled();
 
-	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) <= 0,
+	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) <= 0,
 			 "RCU dynticks_nesting counter underflow/zero!");
 	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
 			 DYNTICK_IRQ_NONIDLE,
@@ -824,11 +823,11 @@ static void noinstr rcu_eqs_exit(bool user)
 
 	lockdep_assert_irqs_disabled();
 	rdp = this_cpu_ptr(&rcu_data);
-	oldval = rdp->dynticks_nesting;
+	oldval = ct->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
 	if (oldval) {
 		// RCU was already watching, so just do accounting and leave.
-		rdp->dynticks_nesting++;
+		ct->dynticks_nesting++;
 		return;
 	}
 	rcu_dynticks_task_exit();
@@ -840,9 +839,9 @@ static void noinstr rcu_eqs_exit(bool user)
 	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
 	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 
-	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&ct->dynticks));
+	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
-	WRITE_ONCE(rdp->dynticks_nesting, 1);
+	WRITE_ONCE(ct->dynticks_nesting, 1);
 	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
 	WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
 	instrumentation_end();
@@ -4122,12 +4121,13 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf)
 static void __init
 rcu_boot_init_percpu_data(int cpu)
 {
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 
 	/* Set up local state, ensuring consistent view of global state. */
 	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
 	INIT_WORK(&rdp->strict_work, strict_work_handler);
-	WARN_ON_ONCE(rdp->dynticks_nesting != 1);
+	WARN_ON_ONCE(ct->dynticks_nesting != 1);
 	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu)));
 	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
 	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
@@ -4152,6 +4152,7 @@ rcu_boot_init_percpu_data(int cpu)
 int rcutree_prepare_cpu(unsigned int cpu)
 {
 	unsigned long flags;
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	struct rcu_node *rnp = rcu_get_root();
 
@@ -4160,7 +4161,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->qlen_last_fqs_check = 0;
 	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
 	rdp->blimit = blimit;
-	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
+	ct->dynticks_nesting = 1;	/* CPU not up, no tearing. */
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
 
 	/*
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 15246a3f0734..8050bab08f39 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -186,7 +186,6 @@ struct rcu_data {
 
 	/* 3) dynticks interface. */
 	int dynticks_snap;		/* Per-GP tracking for dynticks. */
-	long dynticks_nesting;		/* Track process nesting level. */
 	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
 	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
 	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 202129b1c7e4..30a5e0a8ddb3 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -429,6 +429,7 @@ static void print_cpu_stall_info(int cpu)
 {
 	unsigned long delta;
 	bool falsepositive;
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
 	char *ticks_title;
 	unsigned long ticks_value;
@@ -459,7 +460,7 @@ static void print_cpu_stall_info(int cpu)
 				"!."[!delta],
 	       ticks_value, ticks_title,
 	       rcu_dynticks_snap(cpu) & 0xfff,
-	       rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
+	       ct->dynticks_nesting, rdp->dynticks_nmi_nesting,
 	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
 	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
 	       falsepositive ? " (false positive?)" : "");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 12/19] rcu/context_tracking: Move dynticks_nmi_nesting to context tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (10 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting " Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:02   ` Paul E. McKenney
  2022-03-02 15:48 ` [PATCH 13/19] rcu/context-tracking: Move deferred nocb resched " Frederic Weisbecker
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

The RCU eqs tracking is going to be performed by the context tracking
subsystem. The related nesting counters thus need to be moved to the
context tracking structure.
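
For illustration only (not part of this patch), a standalone userspace
model of the dynticks_nmi_nesting update rule that the moved counter
implements: +1 when an NMI/irq interrupts an RCU-idle CPU, +2
otherwise, so a value of exactly 1 identifies the outermost handler
that interrupted an RCU-idle period. All names below are hypothetical:

	#include <assert.h>
	#include <stdio.h>

	static long nmi_nesting;	/* 0: CPU is RCU-idle */

	static void model_nmi_enter(int was_idle)
	{
		nmi_nesting += was_idle ? 1 : 2;
	}

	static void model_nmi_exit(void)
	{
		nmi_nesting -= (nmi_nesting == 1) ? 1 : 2;
	}

	int main(void)
	{
		model_nmi_enter(1);	/* idle -> active: 0 -> 1 */
		model_nmi_enter(0);	/* nested:          1 -> 3 */
		model_nmi_exit();	/*                  3 -> 1 */
		assert(nmi_nesting == 1);
		model_nmi_exit();	/* back to idle:    1 -> 0 */
		assert(nmi_nesting == 0);
		printf("ok\n");
		return 0;
	}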

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking_state.h |  4 +++
 kernel/context_tracking.c              |  1 +
 kernel/rcu/rcu.h                       |  4 ---
 kernel/rcu/tree.c                      | 48 +++++++++++---------------
 kernel/rcu/tree.h                      |  1 -
 kernel/rcu/tree_stall.h                |  2 +-
 6 files changed, 27 insertions(+), 33 deletions(-)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index bcb942945265..4efb97fe6518 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -6,6 +6,9 @@
 #include <linux/static_key.h>
 #include <linux/context_tracking_irq.h>
 
+/* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
+#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
+
 struct context_tracking {
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 	/*
@@ -25,6 +28,7 @@ struct context_tracking {
 #endif
 	atomic_t dynticks;		/* Even value for idle, else odd. */
 	long dynticks_nesting;		/* Track process nesting level. */
+	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
 };
 
 #ifdef CONFIG_CONTEXT_TRACKING
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 09a77884a4e3..155534c409fc 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -234,6 +234,7 @@ void __init context_tracking_init(void)
 #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
 
 DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
+		.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
 		.dynticks_nesting = 1,
 		.dynticks = ATOMIC_INIT(1),
 };
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index eccbdbdaa02e..d3cd9e7d11fa 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -12,10 +12,6 @@
 
 #include <trace/events/rcu.h>
 
-/* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
-#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
-
-
 /*
  * Grace-period counter management.
  */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8708d1a99565..c2528e65de0c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -75,7 +75,6 @@
 /* Data structures. */
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
-	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
 #ifdef CONFIG_RCU_NOCB_CPU
 	.cblist.flags = SEGCBLIST_RCU_CORE,
 #endif
@@ -442,11 +441,11 @@ static int rcu_is_cpu_rrupt_from_idle(void)
 	/* Check for counter underflows */
 	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) < 0,
 			 "RCU dynticks_nesting counter underflow!");
-	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 0,
+	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nmi_nesting) <= 0,
 			 "RCU dynticks_nmi_nesting counter underflow/zero!");
 
 	/* Are we at first interrupt nesting level? */
-	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
+	nesting = __this_cpu_read(context_tracking.dynticks_nmi_nesting);
 	if (nesting > 1)
 		return false;
 
@@ -617,11 +616,10 @@ EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
  */
 static noinstr void rcu_eqs_enter(bool user)
 {
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
-	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
-	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
+	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 		     ct->dynticks_nesting == 0);
 	if (ct->dynticks_nesting != 1) {
@@ -739,7 +737,7 @@ noinstr void rcu_user_enter(void)
  * rcu_nmi_exit - inform RCU of exit from NMI context
  *
  * If we are returning from the outermost NMI handler that interrupted an
- * RCU-idle period, update ct->dynticks and rdp->dynticks_nmi_nesting
+ * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
  * to let the RCU grace-period handling know that the CPU is back to
  * being RCU-idle.
  *
@@ -749,7 +747,6 @@ noinstr void rcu_user_enter(void)
 noinstr void rcu_nmi_exit(void)
 {
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 
 	instrumentation_begin();
 	/*
@@ -757,25 +754,25 @@ noinstr void rcu_nmi_exit(void)
 	 * (We are exiting an NMI handler, so RCU better be paying attention
 	 * to us!)
 	 */
-	WARN_ON_ONCE(rdp->dynticks_nmi_nesting <= 0);
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting <= 0);
 	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
 
 	/*
 	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
 	 * leave it in non-RCU-idle state.
 	 */
-	if (rdp->dynticks_nmi_nesting != 1) {
-		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2,
+	if (ct->dynticks_nmi_nesting != 1) {
+		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
 				  atomic_read(&ct->dynticks));
-		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
-			   rdp->dynticks_nmi_nesting - 2);
+		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
+			   ct->dynticks_nmi_nesting - 2);
 		instrumentation_end();
 		return;
 	}
 
 	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
-	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
-	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
+	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
+	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
 
 	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
 	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
@@ -799,7 +796,7 @@ void rcu_irq_exit_check_preempt(void)
 
 	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) <= 0,
 			 "RCU dynticks_nesting counter underflow/zero!");
-	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
+	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nmi_nesting) !=
 			 DYNTICK_IRQ_NONIDLE,
 			 "Bad RCU  dynticks_nmi_nesting counter\n");
 	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
@@ -818,11 +815,9 @@ void rcu_irq_exit_check_preempt(void)
 static void noinstr rcu_eqs_exit(bool user)
 {
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-	struct rcu_data *rdp;
 	long oldval;
 
 	lockdep_assert_irqs_disabled();
-	rdp = this_cpu_ptr(&rcu_data);
 	oldval = ct->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
 	if (oldval) {
@@ -842,8 +837,8 @@ static void noinstr rcu_eqs_exit(bool user)
 	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	WRITE_ONCE(ct->dynticks_nesting, 1);
-	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
-	WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
+	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
 	instrumentation_end();
 }
 
@@ -946,7 +941,7 @@ void __rcu_irq_enter_check_tick(void)
  * rcu_nmi_enter - inform RCU of entry to NMI context
  *
  * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
- * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know
+ * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
  * that the CPU is active.  This implementation permits nested NMIs, as
  * long as the nesting level does not overflow an int.  (You will probably
  * run out of stack space first.)
@@ -957,11 +952,10 @@ void __rcu_irq_enter_check_tick(void)
 noinstr void rcu_nmi_enter(void)
 {
 	long incby = 2;
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
 	/* Complain about underflow. */
-	WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0);
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting < 0);
 
 	/*
 	 * If idle from RCU viewpoint, atomically increment ->dynticks
@@ -995,11 +989,11 @@ noinstr void rcu_nmi_enter(void)
 	}
 
 	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
-			  rdp->dynticks_nmi_nesting,
-			  rdp->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
+			  ct->dynticks_nmi_nesting,
+			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
 	instrumentation_end();
-	WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */
-		   rdp->dynticks_nmi_nesting + incby);
+	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
+		   ct->dynticks_nmi_nesting + incby);
 	barrier();
 }
 
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 8050bab08f39..56d38568292b 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -186,7 +186,6 @@ struct rcu_data {
 
 	/* 3) dynticks interface. */
 	int dynticks_snap;		/* Per-GP tracking for dynticks. */
-	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
 	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
 	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
 	bool rcu_forced_tick;		/* Forced tick to provide QS. */
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 30a5e0a8ddb3..9bf5cc79d5eb 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -460,7 +460,7 @@ static void print_cpu_stall_info(int cpu)
 				"!."[!delta],
 	       ticks_value, ticks_title,
 	       rcu_dynticks_snap(cpu) & 0xfff,
-	       ct->dynticks_nesting, rdp->dynticks_nmi_nesting,
+	       ct->dynticks_nesting, ct->dynticks_nmi_nesting,
 	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
 	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
 	       falsepositive ? " (false positive?)" : "");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 13/19] rcu/context-tracking: Move deferred nocb resched to context tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (11 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 12/19] rcu/context_tracking: Move dynticks_nmi_nesting " Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:04   ` Paul E. McKenney
  2022-03-02 15:48 ` [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking Frederic Weisbecker
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

To prepare for migrating the RCU eqs accounting code to context tracking,
split the last-resort deferred nocb resched out of rcu_user_enter() and
move it into a separate call issued directly from context tracking.
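
For illustration only (not part of this patch), a standalone userspace
model of the last-resort mechanism being split out: a reschedule
request that arrives too late for the entry code is recorded as
pending self-notification work, which then fires once interrupts are
re-enabled. All names below are hypothetical:

	#include <stdbool.h>
	#include <stdio.h>

	static bool irq_work_pending;

	static void model_irq_work_queue(void)
	{
		irq_work_pending = true;	/* would raise a self-IPI */
	}

	static void model_user_enter(bool need_resched)
	{
		if (need_resched)
			model_irq_work_queue();
		/* ... then enter the RCU extended quiescent state ... */
	}

	static void model_local_irq_enable(void)
	{
		if (irq_work_pending) {
			irq_work_pending = false;
			printf("deferred reschedule fires\n");
		}
	}

	int main(void)
	{
		model_user_enter(true);
		model_local_irq_enable();
		return 0;
	}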

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/rcutree.h   |  6 ++++++
 kernel/context_tracking.c |  8 ++++++++
 kernel/rcu/tree.c         | 15 ++-------------
 3 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index e05334c4c3d1..6d111a3c0cc0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -78,4 +78,10 @@ int rcutree_dead_cpu(unsigned int cpu);
 int rcutree_dying_cpu(unsigned int cpu);
 void rcu_cpu_starting(unsigned int cpu);
 
+#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK))
+void rcu_irq_work_resched(void);
+#else
+static inline void rcu_irq_work_resched(void) { }
+#endif
+
 #endif /* __LINUX_RCUTREE_H */
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 155534c409fc..7be7a2044d3a 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -60,6 +60,8 @@ static __always_inline void context_tracking_recursion_exit(void)
  */
 void noinstr __ct_user_enter(enum ctx_state state)
 {
+	lockdep_assert_irqs_disabled();
+
 	/* Kernel threads aren't supposed to go to userspace */
 	WARN_ON_ONCE(!current->mm);
 
@@ -81,6 +83,12 @@ void noinstr __ct_user_enter(enum ctx_state state)
 				vtime_user_enter(current);
 				instrumentation_end();
 			}
+			/*
+			 * Other than generic entry implementation, we may be past the last
+			 * rescheduling opportunity in the entry code. Trigger a self IPI
+			 * that will fire and reschedule once we resume in user/guest mode.
+			 */
+			rcu_irq_work_resched();
 			rcu_user_enter();
 		}
 		/*
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c2528e65de0c..938537958c27 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -686,7 +686,7 @@ static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
  * last resort is to fire a local irq_work that will trigger a reschedule once IRQs
  * get re-enabled again.
  */
-noinstr static void rcu_irq_work_resched(void)
+noinstr void rcu_irq_work_resched(void)
 {
 	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
 
@@ -702,10 +702,7 @@ noinstr static void rcu_irq_work_resched(void)
 	}
 	instrumentation_end();
 }
-
-#else
-static inline void rcu_irq_work_resched(void) { }
-#endif
+#endif /* #if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK) */
 
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
@@ -720,14 +717,6 @@ static inline void rcu_irq_work_resched(void) { }
  */
 noinstr void rcu_user_enter(void)
 {
-	lockdep_assert_irqs_disabled();
-
-	/*
-	 * Other than generic entry implementation, we may be past the last
-	 * rescheduling opportunity in the entry code. Trigger a self IPI
-	 * that will fire and reschedule once we resume in user/guest mode.
-	 */
-	rcu_irq_work_resched();
 	rcu_eqs_enter(true);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (12 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 13/19] rcu/context-tracking: Move deferred nocb resched " Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:07   ` Paul E. McKenney
  2022-03-12 23:10   ` Peter Zijlstra
  2022-03-02 15:48 ` [PATCH 15/19] rcu/context-tracking: Remove unused and/or unnecessary middle functions Frederic Weisbecker
                   ` (5 subsequent siblings)
  19 siblings, 2 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Move the core RCU eqs/dynticks functions to context tracking so that
all of this code can later be merged into the context tracking
subsystem.
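
For illustration only (not part of this patch), a standalone userspace
model of the ->dynticks parity rule implemented by the functions being
moved: the counter is even while the CPU is in an extended quiescent
state and odd while RCU is watching, and rcu_dynticks_inc() flips the
parity with full ordering. All names below are hypothetical:

	#include <assert.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_long dynticks = 1;	/* CPUs boot active (odd) */

	static long model_dynticks_inc(int incby)
	{
		/* Models arch_atomic_add_return(): returns the new value. */
		return atomic_fetch_add(&dynticks, incby) + incby;
	}

	int main(void)
	{
		long seq;

		seq = model_dynticks_inc(1);	/* enter EQS: odd -> even */
		assert(!(seq & 0x1));
		seq = model_dynticks_inc(1);	/* exit EQS: even -> odd */
		assert(seq & 0x1);
		printf("ok\n");
		return 0;
	}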

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking.h |  12 ++
 include/linux/rcutree.h          |   3 +
 kernel/context_tracking.c        | 347 +++++++++++++++++++++++++++++++
 kernel/rcu/tree.c                | 326 +----------------------------
 kernel/rcu/tree.h                |   5 -
 kernel/rcu/tree_plugin.h         |  36 +---
 6 files changed, 366 insertions(+), 363 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 52a2e23d5107..086546569d14 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -122,6 +122,18 @@ static inline void context_tracking_init(void) { }
 #ifdef CONFIG_CONTEXT_TRACKING
 extern void ct_idle_enter(void);
 extern void ct_idle_exit(void);
+extern unsigned long rcu_dynticks_inc(int incby);
+
+/*
+ * Is the current CPU in an extended quiescent state?
+ *
+ * No ordering, as we are sampling CPU-local information.
+ */
+static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
+{
+	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
+}
+
 #else
 static inline void ct_idle_enter(void) { }
 static inline void ct_idle_exit(void) { }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 6d111a3c0cc0..408435ff7a06 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -59,6 +59,9 @@ void rcu_irq_exit_check_preempt(void);
 static inline void rcu_irq_exit_check_preempt(void) { }
 #endif
 
+struct task_struct;
+void rcu_preempt_deferred_qs(struct task_struct *t);
+
 void exit_rcu(void);
 
 void rcu_scheduler_starting(void);
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 7be7a2044d3a..dc24a9782bbd 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -21,6 +21,353 @@
 #include <linux/hardirq.h>
 #include <linux/export.h>
 #include <linux/kprobes.h>
+#include <trace/events/rcu.h>
+
+#define TPS(x)  tracepoint_string(x)
+
+/* Record the current task on dyntick-idle entry. */
+static __always_inline void rcu_dynticks_task_enter(void)
+{
+#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
+	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
+#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
+}
+
+/* Record no current task on dyntick-idle exit. */
+static __always_inline void rcu_dynticks_task_exit(void)
+{
+#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
+	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
+#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
+}
+
+/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
+static __always_inline void rcu_dynticks_task_trace_enter(void)
+{
+#ifdef CONFIG_TASKS_TRACE_RCU
+	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
+		current->trc_reader_special.b.need_mb = true;
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
+}
+
+/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
+static __always_inline void rcu_dynticks_task_trace_exit(void)
+{
+#ifdef CONFIG_TASKS_TRACE_RCU
+	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
+		current->trc_reader_special.b.need_mb = false;
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
+}
+
+/*
+ * Increment the current CPU's context_tracking structure's ->dynticks field
+ * with ordering.  Return the new value.
+ */
+noinstr unsigned long rcu_dynticks_inc(int incby)
+{
+	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
+}
+
+/*
+ * Record entry into an extended quiescent state.  This is only to be
+ * called when not already in an extended quiescent state, that is,
+ * RCU is watching prior to the call to this function and is no longer
+ * watching upon return.
+ */
+static noinstr void rcu_dynticks_eqs_enter(void)
+{
+	int seq;
+
+	/*
+	 * CPUs seeing atomic_add_return() must see prior RCU read-side
+	 * critical sections, and we also must force ordering with the
+	 * next idle sojourn.
+	 */
+	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
+	seq = rcu_dynticks_inc(1);
+	// RCU is no longer watching.  Better be in extended quiescent state!
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
+}
+
+/*
+ * Record exit from an extended quiescent state.  This is only to be
+ * called from an extended quiescent state, that is, RCU is not watching
+ * prior to the call to this function and is watching upon return.
+ */
+static noinstr void rcu_dynticks_eqs_exit(void)
+{
+	int seq;
+
+	/*
+	 * CPUs seeing atomic_add_return() must see prior idle sojourns,
+	 * and we also must force ordering with the next RCU read-side
+	 * critical section.
+	 */
+	seq = rcu_dynticks_inc(1);
+	// RCU is now watching.  Better not be in an extended quiescent state!
+	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
+}
+
+/*
+ * Enter an RCU extended quiescent state, which can be either the
+ * idle loop or adaptive-tickless usermode execution.
+ *
+ * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
+ * the possibility of usermode upcalls having messed up our count
+ * of interrupt nesting level during the prior busy period.
+ */
+static noinstr void rcu_eqs_enter(bool user)
+{
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
+
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
+	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+		     ct->dynticks_nesting == 0);
+	if (ct->dynticks_nesting != 1) {
+		// RCU will still be watching, so just do accounting and leave.
+		ct->dynticks_nesting--;
+		return;
+	}
+
+	lockdep_assert_irqs_disabled();
+	instrumentation_begin();
+	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
+	rcu_preempt_deferred_qs(current);
+
+	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
+	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+
+	instrumentation_end();
+	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
+	// RCU is watching here ...
+	rcu_dynticks_eqs_enter();
+	// ... but is no longer watching here.
+	rcu_dynticks_task_enter();
+}
+
+/**
+ * rcu_idle_enter - inform RCU that current CPU is entering idle
+ *
+ * Enter idle mode, in other words, -leave- the mode in which RCU
+ * read-side critical sections can occur.  (Though RCU read-side
+ * critical sections can occur in irq handlers in idle, a possibility
+ * handled by irq_enter() and irq_exit().)
+ *
+ * If you add or remove a call to rcu_idle_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_idle_enter(void)
+{
+	lockdep_assert_irqs_disabled();
+	rcu_eqs_enter(false);
+}
+
+#ifdef CONFIG_NO_HZ_FULL
+/**
+ * rcu_user_enter - inform RCU that we are resuming userspace.
+ *
+ * Enter RCU idle mode right before resuming userspace.  No use of RCU
+ * is permitted between this call and rcu_user_exit(). This way the
+ * CPU doesn't need to maintain the tick for RCU maintenance purposes
+ * when the CPU runs in userspace.
+ *
+ * If you add or remove a call to rcu_user_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+noinstr void rcu_user_enter(void)
+{
+	rcu_eqs_enter(true);
+}
+#endif /* CONFIG_NO_HZ_FULL */
+
+/**
+ * rcu_nmi_exit - inform RCU of exit from NMI context
+ *
+ * If we are returning from the outermost NMI handler that interrupted an
+ * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
+ * to let the RCU grace-period handling know that the CPU is back to
+ * being RCU-idle.
+ *
+ * If you add or remove a call to rcu_nmi_exit(), be sure to test
+ * with CONFIG_RCU_EQS_DEBUG=y.
+ */
+noinstr void rcu_nmi_exit(void)
+{
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
+
+	instrumentation_begin();
+	/*
+	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
+	 * (We are exiting an NMI handler, so RCU better be paying attention
+	 * to us!)
+	 */
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting <= 0);
+	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
+
+	/*
+	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
+	 * leave it in non-RCU-idle state.
+	 */
+	if (ct->dynticks_nmi_nesting != 1) {
+		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
+				  atomic_read(&ct->dynticks));
+		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
+			   ct->dynticks_nmi_nesting - 2);
+		instrumentation_end();
+		return;
+	}
+
+	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
+	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
+	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
+
+	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
+	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+	instrumentation_end();
+
+	// RCU is watching here ...
+	rcu_dynticks_eqs_enter();
+	// ... but is no longer watching here.
+
+	if (!in_nmi())
+		rcu_dynticks_task_enter();
+}
+
+/*
+ * Exit an RCU extended quiescent state, which can be either the
+ * idle loop or adaptive-tickless usermode execution.
+ *
+ * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
+ * allow for the possibility of usermode upcalls messing up our count of
+ * interrupt nesting level during the busy period that is just now starting.
+ */
+static void noinstr rcu_eqs_exit(bool user)
+{
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
+	long oldval;
+
+	lockdep_assert_irqs_disabled();
+	oldval = ct->dynticks_nesting;
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
+	if (oldval) {
+		// RCU was already watching, so just do accounting and leave.
+		ct->dynticks_nesting++;
+		return;
+	}
+	rcu_dynticks_task_exit();
+	// RCU is not watching here ...
+	rcu_dynticks_eqs_exit();
+	// ... but is watching here.
+	instrumentation_begin();
+
+	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
+	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+
+	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
+	WRITE_ONCE(ct->dynticks_nesting, 1);
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
+	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
+	instrumentation_end();
+}
+
+/**
+ * rcu_idle_exit - inform RCU that current CPU is leaving idle
+ *
+ * Exit idle mode, in other words, -enter- the mode in which RCU
+ * read-side critical sections can occur.
+ *
+ * If you add or remove a call to rcu_idle_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_idle_exit(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	rcu_eqs_exit(false);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(rcu_idle_exit);
+
+#ifdef CONFIG_NO_HZ_FULL
+/**
+ * rcu_user_exit - inform RCU that we are exiting userspace.
+ *
+ * Exit RCU idle mode while entering the kernel because it can
+ * run a RCU read side critical section anytime.
+ *
+ * If you add or remove a call to rcu_user_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
+void noinstr rcu_user_exit(void)
+{
+	rcu_eqs_exit(true);
+}
+#endif /* ifdef CONFIG_NO_HZ_FULL */
+
+/**
+ * rcu_nmi_enter - inform RCU of entry to NMI context
+ *
+ * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
+ * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
+ * that the CPU is active.  This implementation permits nested NMIs, as
+ * long as the nesting level does not overflow an int.  (You will probably
+ * run out of stack space first.)
+ *
+ * If you add or remove a call to rcu_nmi_enter(), be sure to test
+ * with CONFIG_RCU_EQS_DEBUG=y.
+ */
+noinstr void rcu_nmi_enter(void)
+{
+	long incby = 2;
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
+
+	/* Complain about underflow. */
+	WARN_ON_ONCE(ct->dynticks_nmi_nesting < 0);
+
+	/*
+	 * If idle from RCU viewpoint, atomically increment ->dynticks
+	 * to mark non-idle and increment ->dynticks_nmi_nesting by one.
+	 * Otherwise, increment ->dynticks_nmi_nesting by two.  This means
+	 * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
+	 * to be in the outermost NMI handler that interrupted an RCU-idle
+	 * period (observation due to Andy Lutomirski).
+	 */
+	if (rcu_dynticks_curr_cpu_in_eqs()) {
+
+		if (!in_nmi())
+			rcu_dynticks_task_exit();
+
+		// RCU is not watching here ...
+		rcu_dynticks_eqs_exit();
+		// ... but is watching here.
+
+		instrumentation_begin();
+		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
+		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
+		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
+		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+
+		incby = 1;
+	} else if (!in_nmi()) {
+		instrumentation_begin();
+		rcu_irq_enter_check_tick();
+	} else  {
+		instrumentation_begin();
+	}
+
+	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
+			  ct->dynticks_nmi_nesting,
+			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
+	instrumentation_end();
+	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
+		   ct->dynticks_nmi_nesting + incby);
+	barrier();
+}
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 938537958c27..e55a44ed19b6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -62,6 +62,7 @@
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
 #include <linux/kasan.h>
+#include <linux/context_tracking.h>
 #include "../time/tick-internal.h"
 
 #include "tree.h"
@@ -259,56 +260,6 @@ void rcu_softirq_qs(void)
 	rcu_tasks_qs(current, false);
 }
 
-/*
- * Increment the current CPU's rcu_data structure's ->dynticks field
- * with ordering.  Return the new value.
- */
-static noinline noinstr unsigned long rcu_dynticks_inc(int incby)
-{
-	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
-}
-
-/*
- * Record entry into an extended quiescent state.  This is only to be
- * called when not already in an extended quiescent state, that is,
- * RCU is watching prior to the call to this function and is no longer
- * watching upon return.
- */
-static noinstr void rcu_dynticks_eqs_enter(void)
-{
-	int seq;
-
-	/*
-	 * CPUs seeing atomic_add_return() must see prior RCU read-side
-	 * critical sections, and we also must force ordering with the
-	 * next idle sojourn.
-	 */
-	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
-	seq = rcu_dynticks_inc(1);
-	// RCU is no longer watching.  Better be in extended quiescent state!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
-}
-
-/*
- * Record exit from an extended quiescent state.  This is only to be
- * called from an extended quiescent state, that is, RCU is not watching
- * prior to the call to this function and is watching upon return.
- */
-static noinstr void rcu_dynticks_eqs_exit(void)
-{
-	int seq;
-
-	/*
-	 * CPUs seeing atomic_add_return() must see prior idle sojourns,
-	 * and we also must force ordering with the next RCU read-side
-	 * critical section.
-	 */
-	seq = rcu_dynticks_inc(1);
-	// RCU is now watching.  Better not be in an extended quiescent state!
-	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
-}
-
 /*
  * Reset the current CPU's ->dynticks counter to indicate that the
  * newly onlined CPU is no longer in an extended quiescent state.
@@ -328,16 +279,6 @@ static void rcu_dynticks_eqs_online(void)
 	rcu_dynticks_inc(1);
 }
 
-/*
- * Is the current CPU in an extended quiescent state?
- *
- * No ordering, as we are sampling CPU-local information.
- */
-static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
-{
-	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
-}
-
 /*
  * Snapshot the ->dynticks counter with full ordering so as to allow
  * stable comparison of this counter with past and future snapshots.
@@ -606,65 +547,7 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
 }
 EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
 
-/*
- * Enter an RCU extended quiescent state, which can be either the
- * idle loop or adaptive-tickless usermode execution.
- *
- * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
- * the possibility of usermode upcalls having messed up our count
- * of interrupt nesting level during the prior busy period.
- */
-static noinstr void rcu_eqs_enter(bool user)
-{
-	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-
-	WARN_ON_ONCE(ct->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
-	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     ct->dynticks_nesting == 0);
-	if (ct->dynticks_nesting != 1) {
-		// RCU will still be watching, so just do accounting and leave.
-		ct->dynticks_nesting--;
-		return;
-	}
-
-	lockdep_assert_irqs_disabled();
-	instrumentation_begin();
-	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
-	rcu_preempt_deferred_qs(current);
-
-	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
-	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
-
-	instrumentation_end();
-	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
-	// RCU is watching here ...
-	rcu_dynticks_eqs_enter();
-	// ... but is no longer watching here.
-	rcu_dynticks_task_enter();
-}
-
-/**
- * rcu_idle_enter - inform RCU that current CPU is entering idle
- *
- * Enter idle mode, in other words, -leave- the mode in which RCU
- * read-side critical sections can occur.  (Though RCU read-side
- * critical sections can occur in irq handlers in idle, a possibility
- * handled by irq_enter() and irq_exit().)
- *
- * If you add or remove a call to rcu_idle_enter(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_idle_enter(void)
-{
-	lockdep_assert_irqs_disabled();
-	rcu_eqs_enter(false);
-}
-
-#ifdef CONFIG_NO_HZ_FULL
-
-#if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)
+#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK))
 /*
  * An empty function that will trigger a reschedule on
  * IRQ tail once IRQs get re-enabled on userspace/guest resume.
@@ -702,78 +585,7 @@ noinstr void rcu_irq_work_resched(void)
 	}
 	instrumentation_end();
 }
-#endif /* #if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK) */
-
-/**
- * rcu_user_enter - inform RCU that we are resuming userspace.
- *
- * Enter RCU idle mode right before resuming userspace.  No use of RCU
- * is permitted between this call and rcu_user_exit(). This way the
- * CPU doesn't need to maintain the tick for RCU maintenance purposes
- * when the CPU runs in userspace.
- *
- * If you add or remove a call to rcu_user_enter(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-noinstr void rcu_user_enter(void)
-{
-	rcu_eqs_enter(true);
-}
-
-#endif /* CONFIG_NO_HZ_FULL */
-
-/**
- * rcu_nmi_exit - inform RCU of exit from NMI context
- *
- * If we are returning from the outermost NMI handler that interrupted an
- * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
- * to let the RCU grace-period handling know that the CPU is back to
- * being RCU-idle.
- *
- * If you add or remove a call to rcu_nmi_exit(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-noinstr void rcu_nmi_exit(void)
-{
-	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-
-	instrumentation_begin();
-	/*
-	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
-	 * (We are exiting an NMI handler, so RCU better be paying attention
-	 * to us!)
-	 */
-	WARN_ON_ONCE(ct->dynticks_nmi_nesting <= 0);
-	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
-
-	/*
-	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
-	 * leave it in non-RCU-idle state.
-	 */
-	if (ct->dynticks_nmi_nesting != 1) {
-		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
-				  atomic_read(&ct->dynticks));
-		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
-			   ct->dynticks_nmi_nesting - 2);
-		instrumentation_end();
-		return;
-	}
-
-	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
-	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
-	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
-
-	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
-	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
-	instrumentation_end();
-
-	// RCU is watching here ...
-	rcu_dynticks_eqs_enter();
-	// ... but is no longer watching here.
-
-	if (!in_nmi())
-		rcu_dynticks_task_enter();
-}
+#endif /* #if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)) */
 
 #ifdef CONFIG_PROVE_RCU
 /**
@@ -793,77 +605,6 @@ void rcu_irq_exit_check_preempt(void)
 }
 #endif /* #ifdef CONFIG_PROVE_RCU */
 
-/*
- * Exit an RCU extended quiescent state, which can be either the
- * idle loop or adaptive-tickless usermode execution.
- *
- * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
- * allow for the possibility of usermode upcalls messing up our count of
- * interrupt nesting level during the busy period that is just now starting.
- */
-static void noinstr rcu_eqs_exit(bool user)
-{
-	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-	long oldval;
-
-	lockdep_assert_irqs_disabled();
-	oldval = ct->dynticks_nesting;
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
-	if (oldval) {
-		// RCU was already watching, so just do accounting and leave.
-		ct->dynticks_nesting++;
-		return;
-	}
-	rcu_dynticks_task_exit();
-	// RCU is not watching here ...
-	rcu_dynticks_eqs_exit();
-	// ... but is watching here.
-	instrumentation_begin();
-
-	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
-	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
-
-	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
-	WRITE_ONCE(ct->dynticks_nesting, 1);
-	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
-	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
-	instrumentation_end();
-}
-
-/**
- * rcu_idle_exit - inform RCU that current CPU is leaving idle
- *
- * Exit idle mode, in other words, -enter- the mode in which RCU
- * read-side critical sections can occur.
- *
- * If you add or remove a call to rcu_idle_exit(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_idle_exit(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	rcu_eqs_exit(false);
-	local_irq_restore(flags);
-}
-
-#ifdef CONFIG_NO_HZ_FULL
-/**
- * rcu_user_exit - inform RCU that we are exiting userspace.
- *
- * Exit RCU idle mode while entering the kernel because it can
- * run a RCU read side critical section anytime.
- *
- * If you add or remove a call to rcu_user_exit(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void noinstr rcu_user_exit(void)
-{
-	rcu_eqs_exit(true);
-}
-
 /**
  * __rcu_irq_enter_check_tick - Enable scheduler tick on CPU if RCU needs it.
  *
@@ -924,67 +665,6 @@ void __rcu_irq_enter_check_tick(void)
 	}
 	raw_spin_unlock_rcu_node(rdp->mynode);
 }
-#endif /* CONFIG_NO_HZ_FULL */
-
-/**
- * rcu_nmi_enter - inform RCU of entry to NMI context
- *
- * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
- * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
- * that the CPU is active.  This implementation permits nested NMIs, as
- * long as the nesting level does not overflow an int.  (You will probably
- * run out of stack space first.)
- *
- * If you add or remove a call to rcu_nmi_enter(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-noinstr void rcu_nmi_enter(void)
-{
-	long incby = 2;
-	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-
-	/* Complain about underflow. */
-	WARN_ON_ONCE(ct->dynticks_nmi_nesting < 0);
-
-	/*
-	 * If idle from RCU viewpoint, atomically increment ->dynticks
-	 * to mark non-idle and increment ->dynticks_nmi_nesting by one.
-	 * Otherwise, increment ->dynticks_nmi_nesting by two.  This means
-	 * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
-	 * to be in the outermost NMI handler that interrupted an RCU-idle
-	 * period (observation due to Andy Lutomirski).
-	 */
-	if (rcu_dynticks_curr_cpu_in_eqs()) {
-
-		if (!in_nmi())
-			rcu_dynticks_task_exit();
-
-		// RCU is not watching here ...
-		rcu_dynticks_eqs_exit();
-		// ... but is watching here.
-
-		instrumentation_begin();
-		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
-		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
-		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
-		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
-
-		incby = 1;
-	} else if (!in_nmi()) {
-		instrumentation_begin();
-		rcu_irq_enter_check_tick();
-	} else  {
-		instrumentation_begin();
-	}
-
-	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
-			  ct->dynticks_nmi_nesting,
-			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
-	instrumentation_end();
-	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
-		   ct->dynticks_nmi_nesting + incby);
-	barrier();
-}
 
 /*
  * Check to see if any future non-offloaded RCU-related work will need
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 56d38568292b..a42c2a737e24 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -426,7 +426,6 @@ static void rcu_cpu_kthread_setup(unsigned int cpu);
 static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp);
 static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
 static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
-static void rcu_preempt_deferred_qs(struct task_struct *t);
 static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
@@ -466,10 +465,6 @@ do {								\
 
 static void rcu_bind_gp_kthread(void);
 static bool rcu_nohz_full_cpu(void);
-static void rcu_dynticks_task_enter(void);
-static void rcu_dynticks_task_exit(void);
-static void rcu_dynticks_task_trace_enter(void);
-static void rcu_dynticks_task_trace_exit(void);
 
 /* Forward declarations for tree_stall.h */
 static void record_gp_stall_check_time(void);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 6b9bcd45c7b2..be4b74b46109 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -595,7 +595,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
  * evaluate safety in terms of interrupt, softirq, and preemption
  * disabling.
  */
-static void rcu_preempt_deferred_qs(struct task_struct *t)
+void rcu_preempt_deferred_qs(struct task_struct *t)
 {
 	unsigned long flags;
 
@@ -1283,37 +1283,3 @@ static void rcu_bind_gp_kthread(void)
 		return;
 	housekeeping_affine(current, HK_FLAG_RCU);
 }
-
-/* Record the current task on dyntick-idle entry. */
-static __always_inline void rcu_dynticks_task_enter(void)
-{
-#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
-	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
-#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
-}
-
-/* Record no current task on dyntick-idle exit. */
-static __always_inline void rcu_dynticks_task_exit(void)
-{
-#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
-	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
-#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
-}
-
-/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
-static __always_inline void rcu_dynticks_task_trace_enter(void)
-{
-#ifdef CONFIG_TASKS_TRACE_RCU
-	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
-		current->trc_reader_special.b.need_mb = true;
-#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
-}
-
-/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
-static __always_inline void rcu_dynticks_task_trace_exit(void)
-{
-#ifdef CONFIG_TASKS_TRACE_RCU
-	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
-		current->trc_reader_special.b.need_mb = false;
-#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
-}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 15/19] rcu/context-tracking: Remove unused and/or unnecessary middle functions
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (13 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-09 16:40   ` nicolas saenz julienne
  2022-03-02 15:48 ` [PATCH 16/19] context_tracking: Convert state to atomic_t Frederic Weisbecker
                   ` (4 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Some eqs functions are now only used internally by context tracking, so
their public declarations can be removed.

Also, middle functions such as rcu_user_*() and rcu_idle_*(), which
now reduce to direct calls to rcu_eqs_enter() and rcu_eqs_exit(), can
be removed as well.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/hardirq.h   |   8 ---
 include/linux/rcutiny.h   |   2 -
 include/linux/rcutree.h   |   2 -
 kernel/context_tracking.c | 137 ++++++++++++--------------------------
 4 files changed, 44 insertions(+), 105 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 345cdbe9c1b7..d57cab4d4c06 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -92,14 +92,6 @@ void irq_exit_rcu(void);
 #define arch_nmi_exit()		do { } while (0)
 #endif
 
-#ifdef CONFIG_TINY_RCU
-static inline void rcu_nmi_enter(void) { }
-static inline void rcu_nmi_exit(void) { }
-#else
-extern void rcu_nmi_enter(void);
-extern void rcu_nmi_exit(void);
-#endif
-
 /*
  * NMI vs Tracing
  * --------------
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 9e07f8d9d544..f3326d88693a 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -96,8 +96,6 @@ static inline int rcu_needs_cpu(void)
 static inline void rcu_virt_note_context_switch(int cpu) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; }
-static inline void rcu_idle_enter(void) { }
-static inline void rcu_idle_exit(void) { }
 static inline void rcu_irq_exit_check_preempt(void) { }
 #define rcu_is_idle_cpu(cpu) \
 	(is_idle_task(current) && !in_nmi() && !in_hardirq() && !in_serving_softirq())
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 408435ff7a06..54e84020e7e0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -49,8 +49,6 @@ unsigned long start_poll_synchronize_rcu(void);
 bool poll_state_synchronize_rcu(unsigned long oldstate);
 void cond_synchronize_rcu(unsigned long oldstate);
 
-void rcu_idle_enter(void);
-void rcu_idle_exit(void);
 bool rcu_is_idle_cpu(int cpu);
 
 #ifdef CONFIG_PROVE_RCU
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index dc24a9782bbd..de247e758767 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -149,52 +149,17 @@ static noinstr void rcu_eqs_enter(bool user)
 }
 
 /**
- * rcu_idle_enter - inform RCU that current CPU is entering idle
- *
- * Enter idle mode, in other words, -leave- the mode in which RCU
- * read-side critical sections can occur.  (Though RCU read-side
- * critical sections can occur in irq handlers in idle, a possibility
- * handled by irq_enter() and irq_exit().)
- *
- * If you add or remove a call to rcu_idle_enter(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_idle_enter(void)
-{
-	lockdep_assert_irqs_disabled();
-	rcu_eqs_enter(false);
-}
-
-#ifdef CONFIG_NO_HZ_FULL
-/**
- * rcu_user_enter - inform RCU that we are resuming userspace.
- *
- * Enter RCU idle mode right before resuming userspace.  No use of RCU
- * is permitted between this call and rcu_user_exit(). This way the
- * CPU doesn't need to maintain the tick for RCU maintenance purposes
- * when the CPU runs in userspace.
- *
- * If you add or remove a call to rcu_user_enter(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-noinstr void rcu_user_enter(void)
-{
-	rcu_eqs_enter(true);
-}
-#endif /* CONFIG_NO_HZ_FULL */
-
-/**
- * rcu_nmi_exit - inform RCU of exit from NMI context
+ * ct_nmi_exit - inform RCU of exit from NMI context
  *
  * If we are returning from the outermost NMI handler that interrupted an
  * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
  * to let the RCU grace-period handling know that the CPU is back to
  * being RCU-idle.
  *
- * If you add or remove a call to rcu_nmi_exit(), be sure to test
+ * If you add or remove a call to ct_nmi_exit(), be sure to test
  * with CONFIG_RCU_EQS_DEBUG=y.
  */
-noinstr void rcu_nmi_exit(void)
+noinstr void ct_nmi_exit(void)
 {
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
@@ -275,42 +240,7 @@ static void noinstr rcu_eqs_exit(bool user)
 }
 
 /**
- * rcu_idle_exit - inform RCU that current CPU is leaving idle
- *
- * Exit idle mode, in other words, -enter- the mode in which RCU
- * read-side critical sections can occur.
- *
- * If you add or remove a call to rcu_idle_exit(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_idle_exit(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	rcu_eqs_exit(false);
-	local_irq_restore(flags);
-}
-EXPORT_SYMBOL_GPL(rcu_idle_exit);
-
-#ifdef CONFIG_NO_HZ_FULL
-/**
- * rcu_user_exit - inform RCU that we are exiting userspace.
- *
- * Exit RCU idle mode while entering the kernel because it can
- * run a RCU read side critical section anytime.
- *
- * If you add or remove a call to rcu_user_exit(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void noinstr rcu_user_exit(void)
-{
-	rcu_eqs_exit(true);
-}
-#endif /* ifdef CONFIG_NO_HZ_FULL */
-
-/**
- * rcu_nmi_enter - inform RCU of entry to NMI context
+ * ct_nmi_enter - inform RCU of entry to NMI context
  *
  * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
  * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
@@ -318,10 +248,10 @@ void noinstr rcu_user_exit(void)
  * long as the nesting level does not overflow an int.  (You will probably
  * run out of stack space first.)
  *
- * If you add or remove a call to rcu_nmi_enter(), be sure to test
+ * If you add or remove a call to ct_nmi_enter(), be sure to test
  * with CONFIG_RCU_EQS_DEBUG=y.
  */
-noinstr void rcu_nmi_enter(void)
+noinstr void ct_nmi_enter(void)
 {
 	long incby = 2;
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
@@ -436,7 +366,13 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 * that will fire and reschedule once we resume in user/guest mode.
 			 */
 			rcu_irq_work_resched();
-			rcu_user_enter();
+			/*
+			 * Enter RCU idle mode right before resuming userspace.  No use of RCU
+			 * is permitted between this call and rcu_eqs_exit(). This way the
+			 * CPU doesn't need to maintain the tick for RCU maintenance purposes
+			 * when the CPU runs in userspace.
+			 */
+			rcu_eqs_enter(true);
 		}
 		/*
 		 * Even if context tracking is disabled on this CPU, because it's outside
@@ -510,10 +446,10 @@ void noinstr __ct_user_exit(enum ctx_state state)
 	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
-			 * We are going to run code that may use RCU. Inform
-			 * RCU core about that (ie: we may need the tick again).
+			 * Exit RCU idle mode while entering the kernel because it can
+			 * run a RCU read side critical section anytime.
 			 */
-			rcu_user_exit();
+			rcu_eqs_exit(true);
 			if (state == CONTEXT_USER) {
 				instrumentation_begin();
 				vtime_user_exit(current);
@@ -595,16 +531,41 @@ DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 };
 EXPORT_SYMBOL_GPL(context_tracking);
 
-
+/**
+ * ct_idle_enter - inform RCU that current CPU is entering idle
+ *
+ * Enter idle mode, in other words, -leave- the mode in which RCU
+ * read-side critical sections can occur.  (Though RCU read-side
+ * critical sections can occur in irq handlers in idle, a possibility
+ * handled by irq_enter() and irq_exit().)
+ *
+ * If you add or remove a call to ct_idle_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
 void ct_idle_enter(void)
 {
-	rcu_idle_enter();
+	lockdep_assert_irqs_disabled();
+	rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(ct_idle_enter);
 
+/**
+ * ct_idle_exit - inform RCU that current CPU is leaving idle
+ *
+ * Exit idle mode, in other words, -enter- the mode in which RCU
+ * read-side critical sections can occur.
+ *
+ * If you add or remove a call to ct_idle_exit(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
+ */
 void ct_idle_exit(void)
 {
-	rcu_idle_exit();
+	unsigned long flags;
+
+	local_irq_save(flags);
+	rcu_eqs_exit(false);
+	local_irq_restore(flags);
+
 }
 EXPORT_SYMBOL_GPL(ct_idle_exit);
 
@@ -678,13 +639,3 @@ void ct_irq_exit_irqson(void)
 	ct_irq_exit();
 	local_irq_restore(flags);
 }
-
-noinstr void ct_nmi_enter(void)
-{
-	rcu_nmi_enter();
-}
-
-noinstr void ct_nmi_exit(void)
-{
-	rcu_nmi_exit();
-}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 16/19] context_tracking: Convert state to atomic_t
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (14 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 15/19] rcu/context-tracking: Remove unused and/or unnecessary middle functions Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-09 17:17   ` nicolas saenz julienne
  2022-03-12 22:54   ` Peter Zijlstra
  2022-03-02 15:48 ` [PATCH 17/19] rcu/context-tracking: Use accessor for dynticks counter value Frederic Weisbecker
                   ` (3 subsequent siblings)
  19 siblings, 2 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Context tracking's state and dynticks counter are going to be merged
into a single field so that both updates can happen atomically, at the
same time. Prepare for that by converting the state into an atomic_t.
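
For context, here is a sketch of where this is heading. The names below
(ct_state_inc(), RCU_DYNTICKS_IDX) only appear later in this series, so
treat this as illustration rather than part of this patch:

	/* Today: the tracking state and the eqs counter are updated by
	 * two separate operations that other CPUs may observe apart.
	 */
	atomic_set(&ct->state, state);
	rcu_dynticks_inc(1);

	/* End of the series: one fully ordered atomic RMW covers both
	 * the state bits and the counter bits at once.
	 */
	ct_state_inc(RCU_DYNTICKS_IDX + state);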

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking.h       | 18 ++---------
 include/linux/context_tracking_state.h | 42 +++++++++++++++++++++-----
 kernel/context_tracking.c              | 15 +++++----
 3 files changed, 47 insertions(+), 28 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 086546569d14..63343c34ab4e 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -56,7 +56,7 @@ static inline enum ctx_state exception_enter(void)
 	    !context_tracking_enabled())
 		return 0;
 
-	prev_ctx = this_cpu_read(context_tracking.state);
+	prev_ctx = __ct_state();
 	if (prev_ctx != CONTEXT_KERNEL)
 		ct_user_exit(prev_ctx);
 
@@ -86,26 +86,14 @@ static __always_inline void context_tracking_guest_exit(void)
 		__ct_user_exit(CONTEXT_GUEST);
 }
 
-/**
- * ct_state() - return the current context tracking state if known
- *
- * Returns the current cpu's context tracking state if context tracking
- * is enabled.  If context tracking is disabled, returns
- * CONTEXT_DISABLED.  This should be used primarily for debugging.
- */
-static __always_inline enum ctx_state ct_state(void)
-{
-	return context_tracking_enabled() ?
-		this_cpu_read(context_tracking.state) : CONTEXT_DISABLED;
-}
 #else
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline void user_enter_irqoff(void) { }
 static inline void user_exit_irqoff(void) { }
-static inline enum ctx_state exception_enter(void) { return 0; }
+static inline int exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
-static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
+static inline int ct_state(void) { return -1; }
 static __always_inline bool context_tracking_guest_enter(void) { return false; }
 static inline void context_tracking_guest_exit(void) { }
 
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 4efb97fe6518..3da44987dc71 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -9,6 +9,13 @@
 /* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
 #define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
 
+enum ctx_state {
+	CONTEXT_DISABLED = -1,	/* returned by ct_state() if unknown */
+	CONTEXT_KERNEL = 0,
+	CONTEXT_USER,
+	CONTEXT_GUEST,
+};
+
 struct context_tracking {
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 	/*
@@ -19,12 +26,7 @@ struct context_tracking {
 	 */
 	bool active;
 	int recursion;
-	enum ctx_state {
-		CONTEXT_DISABLED = -1,	/* returned by ct_state() if unknown */
-		CONTEXT_KERNEL = 0,
-		CONTEXT_USER,
-		CONTEXT_GUEST,
-	} state;
+	atomic_t state;
 #endif
 	atomic_t dynticks;		/* Even value for idle, else odd. */
 	long dynticks_nesting;		/* Track process nesting level. */
@@ -33,6 +35,11 @@ struct context_tracking {
 
 #ifdef CONFIG_CONTEXT_TRACKING
 DECLARE_PER_CPU(struct context_tracking, context_tracking);
+
+static __always_inline int __ct_state(void)
+{
+	return atomic_read(this_cpu_ptr(&context_tracking.state));
+}
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
@@ -53,9 +60,30 @@ static inline bool context_tracking_enabled_this_cpu(void)
 	return context_tracking_enabled() && __this_cpu_read(context_tracking.active);
 }
 
+/**
+ * ct_state() - return the current context tracking state if known
+ *
+ * Returns the current cpu's context tracking state if context tracking
+ * is enabled.  If context tracking is disabled, returns
+ * CONTEXT_DISABLED.  This should be used primarily for debugging.
+ */
+static __always_inline int ct_state(void)
+{
+	int ret;
+
+	if (!context_tracking_enabled())
+		return CONTEXT_DISABLED;
+
+	preempt_disable();
+	ret = __ct_state();
+	preempt_enable();
+
+	return ret;
+}
+
 static __always_inline bool context_tracking_in_user(void)
 {
-	return __this_cpu_read(context_tracking.state) == CONTEXT_USER;
+	return __ct_state() == CONTEXT_USER;
 }
 #else
 static inline bool context_tracking_in_user(void) { return false; }
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index de247e758767..69db43548768 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -337,6 +337,7 @@ static __always_inline void context_tracking_recursion_exit(void)
  */
 void noinstr __ct_user_enter(enum ctx_state state)
 {
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 	lockdep_assert_irqs_disabled();
 
 	/* Kernel threads aren't supposed to go to userspace */
@@ -345,8 +346,8 @@ void noinstr __ct_user_enter(enum ctx_state state)
 	if (!context_tracking_recursion_enter())
 		return;
 
-	if ( __this_cpu_read(context_tracking.state) != state) {
-		if (__this_cpu_read(context_tracking.active)) {
+	if (__ct_state() != state) {
+		if (ct->active) {
 			/*
 			 * At this stage, only low level arch entry code remains and
 			 * then we'll run in userspace. We can assume there won't be
@@ -387,7 +388,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
 		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
 		 * is false because we know that CPU is not tickless.
 		 */
-		__this_cpu_write(context_tracking.state, state);
+		atomic_set(&ct->state, state);
 	}
 	context_tracking_recursion_exit();
 }
@@ -440,11 +441,13 @@ NOKPROBE_SYMBOL(user_enter_callable);
  */
 void noinstr __ct_user_exit(enum ctx_state state)
 {
+	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
+
 	if (!context_tracking_recursion_enter())
 		return;
 
-	if (__this_cpu_read(context_tracking.state) == state) {
-		if (__this_cpu_read(context_tracking.active)) {
+	if (__ct_state() == state) {
+		if (ct->active) {
 			/*
 			 * Exit RCU idle mode while entering the kernel because it can
 			 * run a RCU read side critical section anytime.
@@ -457,7 +460,7 @@ void noinstr __ct_user_exit(enum ctx_state state)
 				instrumentation_end();
 			}
 		}
-		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
+		atomic_set(&ct->state, CONTEXT_KERNEL);
 	}
 	context_tracking_recursion_exit();
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 17/19] rcu/context-tracking: Use accessor for dynticks counter value
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (15 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 16/19] context_tracking: Convert state to atomic_t Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:08   ` Paul E. McKenney
  2022-03-02 15:48 ` [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states Frederic Weisbecker
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

The dynticks counter value is going to join the context tracking state
in a single field. Use an accessor for this value to make the transition
easier for all readers.
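
For context, the accessors introduced below simply wrap the open-coded
reads they replace, e.g. (names as in this patch):

	/* before */
	atomic_read(&per_cpu_ptr(&context_tracking, cpu)->dynticks);
	/* after */
	ct_dynticks_cpu(cpu);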

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking_state.h | 17 +++++++++++++++++
 kernel/context_tracking.c              | 10 +++++-----
 kernel/rcu/tree.c                      | 13 ++++---------
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 3da44987dc71..bca0d3e0bd3d 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -40,6 +40,23 @@ static __always_inline int __ct_state(void)
 {
 	return atomic_read(this_cpu_ptr(&context_tracking.state));
 }
+
+static __always_inline int ct_dynticks(void)
+{
+	return atomic_read(this_cpu_ptr(&context_tracking.dynticks));
+}
+
+static __always_inline int ct_dynticks_cpu(int cpu)
+{
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+	return atomic_read(&ct->dynticks);
+}
+
+static __always_inline int ct_dynticks_cpu_acquire(int cpu)
+{
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+	return atomic_read_acquire(&ct->dynticks);
+}
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 69db43548768..fe9066fdfaab 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -133,7 +133,7 @@ static noinstr void rcu_eqs_enter(bool user)
 
 	lockdep_assert_irqs_disabled();
 	instrumentation_begin();
-	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
+	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, ct_dynticks());
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
 
@@ -178,7 +178,7 @@ noinstr void ct_nmi_exit(void)
 	 */
 	if (ct->dynticks_nmi_nesting != 1) {
 		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
-				  atomic_read(&ct->dynticks));
+				  ct_dynticks());
 		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
 			   ct->dynticks_nmi_nesting - 2);
 		instrumentation_end();
@@ -186,7 +186,7 @@ noinstr void ct_nmi_exit(void)
 	}
 
 	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
-	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
+	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, ct_dynticks());
 	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
 
 	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
@@ -231,7 +231,7 @@ static void noinstr rcu_eqs_exit(bool user)
 	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
 	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
 
-	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
+	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, ct_dynticks());
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	WRITE_ONCE(ct->dynticks_nesting, 1);
 	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
@@ -292,7 +292,7 @@ noinstr void ct_nmi_enter(void)
 
 	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
 			  ct->dynticks_nmi_nesting,
-			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
+			  ct->dynticks_nmi_nesting + incby, ct_dynticks());
 	instrumentation_end();
 	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
 		   ct->dynticks_nmi_nesting + incby);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e55a44ed19b6..90a22dd2189d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -272,9 +272,7 @@ void rcu_softirq_qs(void)
  */
 static void rcu_dynticks_eqs_online(void)
 {
-	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
-
-	if (atomic_read(&ct->dynticks) & 0x1)
+	if (ct_dynticks() & 0x1)
 		return;
 	rcu_dynticks_inc(1);
 }
@@ -285,10 +283,8 @@ static void rcu_dynticks_eqs_online(void)
  */
 static int rcu_dynticks_snap(int cpu)
 {
-	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
-
 	smp_mb();  // Fundamental RCU ordering guarantee.
-	return atomic_read_acquire(&ct->dynticks);
+	return ct_dynticks_cpu_acquire(cpu);
 }
 
 /*
@@ -322,11 +318,10 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
  */
 bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
 {
-	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 	int snap;
 
 	// If not quiescent, force back to earlier extended quiescent state.
-	snap = atomic_read(&ct->dynticks) & ~0x1;
+	snap = ct_dynticks_cpu(cpu) & ~0x1;
 
 	smp_rmb(); // Order ->dynticks and *vp reads.
 	if (READ_ONCE(*vp))
@@ -334,7 +329,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
 	smp_rmb(); // Order *vp read and ->dynticks re-read.
 
 	// If still in the same extended quiescent state, we are good!
-	return snap == atomic_read(&ct->dynticks);
+	return snap == ct_dynticks_cpu(cpu);
 }
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (16 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 17/19] rcu/context-tracking: Use accessor for dynticks counter value Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-10 20:32   ` Paul E. McKenney
  2022-03-02 15:48 ` [PATCH 19/19] context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking Frederic Weisbecker
  2022-03-11 11:37 ` [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking nicolas saenz julienne
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Updating the context tracking state and the RCU dynticks counter
atomically, in a single operation, is a first step towards improving
CPU isolation. This makes the context tracking state updates fully
ordered and therefore allows for later enhancements, such as postponing
some work while a task is running isolated in userspace until it comes
back to the kernel.

The state field is now divided into two parts:

1) Lower bits for the context tracking state:

	CONTEXT_IDLE = 1,
	CONTEXT_USER = 2,
	CONTEXT_GUEST = 4,

   A value of 0 means we are in CONTEXT_KERNEL.

2) Higher bits for RCU eqs dynticks counting:

	RCU_DYNTICKS_IDX = 8

   The dynticks count is always incremented by this value, so
   (state & RCU_DYNTICKS_IDX) being set means we are NOT in an extended
   quiescent state. This makes a collision between two RCU dynticks
   snapshots more likely, but wrapping 24 bits of eqs dynticks
   increments still takes some bad luck (also, rdp.dynticks_snap could
   be converted from int to long?). See the sketch below.
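
   For illustration, a minimal standalone sketch of this layout (the
   mask names mirror CT_STATE_MASK/CT_DYNTICKS_MASK from the patch; the
   demo program itself is an assumption, not kernel code):

	#include <stdio.h>

	#define RCU_DYNTICKS_IDX	8	/* first bit above the state bits */
	#define CT_STATE_MASK		(RCU_DYNTICKS_IDX - 1)
	#define CT_DYNTICKS_MASK	(~CT_STATE_MASK)

	int main(void)
	{
		/* Boot value: eqs counter == 1 (RCU watching), CONTEXT_KERNEL */
		int state = RCU_DYNTICKS_IDX;

		/* Enter userspace: one add updates both parts at once */
		state += RCU_DYNTICKS_IDX + 2 /* CONTEXT_USER */;

		printf("ctx=%d counter=%d in_eqs=%d\n",
		       state & CT_STATE_MASK,		/* 2 == CONTEXT_USER */
		       (state & CT_DYNTICKS_MASK) / RCU_DYNTICKS_IDX, /* 2 */
		       !(state & RCU_DYNTICKS_IDX));	/* 1 == in eqs */
		return 0;
	}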

Some RCU eqs functions have been renamed to better reflect their broader
scope, which now includes the context tracking state.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 include/linux/context_tracking.h       |  4 +-
 include/linux/context_tracking_state.h | 33 ++++++---
 kernel/context_tracking.c              | 92 +++++++++++++-------------
 kernel/rcu/tree.c                      | 13 ++--
 kernel/rcu/tree_stall.h                |  2 +-
 5 files changed, 81 insertions(+), 63 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 63343c34ab4e..aa0e6fbf6946 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -110,7 +110,7 @@ static inline void context_tracking_init(void) { }
 #ifdef CONFIG_CONTEXT_TRACKING
 extern void ct_idle_enter(void);
 extern void ct_idle_exit(void);
-extern unsigned long rcu_dynticks_inc(int incby);
+extern unsigned long ct_state_inc(int incby);
 
 /*
  * Is the current CPU in an extended quiescent state?
@@ -119,7 +119,7 @@ extern unsigned long rcu_dynticks_inc(int incby);
  */
 static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
 {
-	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
+	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.state)) & RCU_DYNTICKS_IDX);
 }
 
 #else
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index bca0d3e0bd3d..b8a309532c18 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -9,13 +9,29 @@
 /* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
 #define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
 
+enum {
+	CONTEXT_IDLE_OFFSET = 0,
+	CONTEXT_USER_OFFSET,
+	CONTEXT_GUEST_OFFSET,
+	CONTEXT_MAX_OFFSET,
+};
+
 enum ctx_state {
 	CONTEXT_DISABLED = -1,	/* returned by ct_state() if unknown */
 	CONTEXT_KERNEL = 0,
-	CONTEXT_USER,
-	CONTEXT_GUEST,
+	CONTEXT_IDLE = BIT(CONTEXT_IDLE_OFFSET),
+	CONTEXT_USER = BIT(CONTEXT_USER_OFFSET),
+	CONTEXT_GUEST = BIT(CONTEXT_GUEST_OFFSET),
+	CONTEXT_MAX = BIT(CONTEXT_MAX_OFFSET),
 };
 
+/* Even value for idle, else odd. */
+#define RCU_DYNTICKS_SHIFT CONTEXT_MAX_OFFSET
+#define RCU_DYNTICKS_IDX CONTEXT_MAX
+
+#define CT_STATE_MASK (CONTEXT_MAX - 1)
+#define CT_DYNTICKS_MASK (~CT_STATE_MASK)
+
 struct context_tracking {
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 	/*
@@ -26,9 +42,8 @@ struct context_tracking {
 	 */
 	bool active;
 	int recursion;
+#endif
 	atomic_t state;
-#endif
-	atomic_t dynticks;		/* Even value for idle, else odd. */
 	long dynticks_nesting;		/* Track process nesting level. */
 	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
 };
@@ -38,24 +53,26 @@ DECLARE_PER_CPU(struct context_tracking, context_tracking);
 
 static __always_inline int __ct_state(void)
 {
-	return atomic_read(this_cpu_ptr(&context_tracking.state));
+	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
 }
 
 static __always_inline int ct_dynticks(void)
 {
-	return atomic_read(this_cpu_ptr(&context_tracking.dynticks));
+	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_DYNTICKS_MASK;
 }
 
 static __always_inline int ct_dynticks_cpu(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
-	return atomic_read(&ct->dynticks);
+
+	return atomic_read(&ct->state) & CT_DYNTICKS_MASK;
 }
 
 static __always_inline int ct_dynticks_cpu_acquire(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
-	return atomic_read_acquire(&ct->dynticks);
+
+	return atomic_read_acquire(&ct->state) & CT_DYNTICKS_MASK;
 }
 #endif
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index fe9066fdfaab..87e7b748791c 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -63,9 +63,9 @@ static __always_inline void rcu_dynticks_task_trace_exit(void)
  * Increment the current CPU's context_tracking structure's ->dynticks field
  * with ordering.  Return the new value.
  */
-noinstr unsigned long rcu_dynticks_inc(int incby)
+noinstr unsigned long ct_state_inc(int incby)
 {
-	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
+	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.state));
 }
 
 /*
@@ -74,7 +74,7 @@ noinstr unsigned long rcu_dynticks_inc(int incby)
  * RCU is watching prior to the call to this function and is no longer
  * watching upon return.
  */
-static noinstr void rcu_dynticks_eqs_enter(void)
+static noinstr void ct_kernel_exit_state(int offset)
 {
 	int seq;
 
@@ -84,9 +84,9 @@ static noinstr void rcu_dynticks_eqs_enter(void)
 	 * next idle sojourn.
 	 */
 	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
-	seq = rcu_dynticks_inc(1);
+	seq = ct_state_inc(offset);
 	// RCU is no longer watching.  Better be in extended quiescent state!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & RCU_DYNTICKS_IDX));
 }
 
 /*
@@ -94,7 +94,7 @@ static noinstr void rcu_dynticks_eqs_enter(void)
  * called from an extended quiescent state, that is, RCU is not watching
  * prior to the call to this function and is watching upon return.
  */
-static noinstr void rcu_dynticks_eqs_exit(void)
+static noinstr void ct_kernel_enter_state(int offset)
 {
 	int seq;
 
@@ -103,10 +103,10 @@ static noinstr void rcu_dynticks_eqs_exit(void)
 	 * and we also must force ordering with the next RCU read-side
 	 * critical section.
 	 */
-	seq = rcu_dynticks_inc(1);
+	seq = ct_state_inc(offset);
 	// RCU is now watching.  Better not be in an extended quiescent state!
 	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & RCU_DYNTICKS_IDX));
 }
 
 /*
@@ -117,7 +117,7 @@ static noinstr void rcu_dynticks_eqs_exit(void)
  * the possibility of usermode upcalls having messed up our count
  * of interrupt nesting level during the prior busy period.
  */
-static noinstr void rcu_eqs_enter(bool user)
+static noinstr void ct_kernel_exit(bool user, int offset)
 {
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 
@@ -137,13 +137,13 @@ static noinstr void rcu_eqs_enter(bool user)
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rcu_preempt_deferred_qs(current);
 
-	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
-	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+	// instrumentation for the noinstr ct_kernel_exit_state()
+	instrument_atomic_write(&ct->state, sizeof(ct->state));
 
 	instrumentation_end();
 	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
 	// RCU is watching here ...
-	rcu_dynticks_eqs_enter();
+	ct_kernel_exit_state(offset);
 	// ... but is no longer watching here.
 	rcu_dynticks_task_enter();
 }
@@ -152,7 +152,7 @@ static noinstr void rcu_eqs_enter(bool user)
  * ct_nmi_exit - inform RCU of exit from NMI context
  *
  * If we are returning from the outermost NMI handler that interrupted an
- * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
+ * RCU-idle period, update ct->state and ct->dynticks_nmi_nesting
  * to let the RCU grace-period handling know that the CPU is back to
  * being RCU-idle.
  *
@@ -189,12 +189,12 @@ noinstr void ct_nmi_exit(void)
 	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, ct_dynticks());
 	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
 
-	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
-	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+	// instrumentation for the noinstr ct_kernel_exit_state()
+	instrument_atomic_write(&ct->state, sizeof(ct->state));
 	instrumentation_end();
 
 	// RCU is watching here ...
-	rcu_dynticks_eqs_enter();
+	ct_kernel_exit_state(RCU_DYNTICKS_IDX);
 	// ... but is no longer watching here.
 
 	if (!in_nmi())
@@ -209,7 +209,7 @@ noinstr void ct_nmi_exit(void)
  * allow for the possibility of usermode upcalls messing up our count of
  * interrupt nesting level during the busy period that is just now starting.
  */
-static void noinstr rcu_eqs_exit(bool user)
+static void noinstr ct_kernel_enter(bool user, int offset)
 {
 	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
 	long oldval;
@@ -224,12 +224,12 @@ static void noinstr rcu_eqs_exit(bool user)
 	}
 	rcu_dynticks_task_exit();
 	// RCU is not watching here ...
-	rcu_dynticks_eqs_exit();
+	ct_kernel_enter_state(offset);
 	// ... but is watching here.
 	instrumentation_begin();
 
-	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
-	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+	// instrumentation for the noinstr ct_kernel_enter_state()
+	instrument_atomic_write(&ct->state, sizeof(ct->state));
 
 	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, ct_dynticks());
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
@@ -242,7 +242,7 @@ static void noinstr rcu_eqs_exit(bool user)
 /**
  * ct_nmi_enter - inform RCU of entry to NMI context
  *
- * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
+ * If the CPU was idle from RCU's viewpoint, update ct->state and
  * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
  * that the CPU is active.  This implementation permits nested NMIs, as
  * long as the nesting level does not overflow an int.  (You will probably
@@ -273,14 +273,14 @@ noinstr void ct_nmi_enter(void)
 			rcu_dynticks_task_exit();
 
 		// RCU is not watching here ...
-		rcu_dynticks_eqs_exit();
+		ct_kernel_enter_state(RCU_DYNTICKS_IDX);
 		// ... but is watching here.
 
 		instrumentation_begin();
 		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
-		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
-		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
-		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
+		instrument_atomic_read(&ct->state, sizeof(ct->state));
+		// instrumentation for the noinstr ct_kernel_enter_state()
+		instrument_atomic_write(&ct->state, sizeof(ct->state));
 
 		incby = 1;
 	} else if (!in_nmi()) {
@@ -373,22 +373,23 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 * CPU doesn't need to maintain the tick for RCU maintenance purposes
 			 * when the CPU runs in userspace.
 			 */
-			rcu_eqs_enter(true);
+			ct_kernel_exit(true, RCU_DYNTICKS_IDX + state);
+		} else {
+			/*
+			 * Even if context tracking is disabled on this CPU, because it's outside
+			 * the full dynticks mask for example, we still have to keep track of the
+			 * context transitions and states to prevent inconsistency on those of
+			 * other CPUs.
+			 * If a task triggers an exception in userspace, sleep on the exception
+			 * handler and then migrate to another CPU, that new CPU must know where
+			 * the exception returns by the time we call exception_exit().
+			 * This information can only be provided by the previous CPU when it called
+			 * exception_enter().
+			 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
+			 * is false because we know that CPU is not tickless.
+			 */
+			atomic_add(state, &ct->state);
 		}
-		/*
-		 * Even if context tracking is disabled on this CPU, because it's outside
-		 * the full dynticks mask for example, we still have to keep track of the
-		 * context transitions and states to prevent inconsistency on those of
-		 * other CPUs.
-		 * If a task triggers an exception in userspace, sleep on the exception
-		 * handler and then migrate to another CPU, that new CPU must know where
-		 * the exception returns by the time we call exception_exit().
-		 * This information can only be provided by the previous CPU when it called
-		 * exception_enter().
-		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
-		 * is false because we know that CPU is not tickless.
-		 */
-		atomic_set(&ct->state, state);
 	}
 	context_tracking_recursion_exit();
 }
@@ -452,15 +453,16 @@ void noinstr __ct_user_exit(enum ctx_state state)
 			 * Exit RCU idle mode while entering the kernel because it can
 			 * run a RCU read side critical section anytime.
 			 */
-			rcu_eqs_exit(true);
+			ct_kernel_enter(true, RCU_DYNTICKS_IDX - state);
 			if (state == CONTEXT_USER) {
 				instrumentation_begin();
 				vtime_user_exit(current);
 				trace_user_exit(0);
 				instrumentation_end();
 			}
+		} else {
+			atomic_sub(state, &ct->state);
 		}
-		atomic_set(&ct->state, CONTEXT_KERNEL);
 	}
 	context_tracking_recursion_exit();
 }
@@ -530,7 +532,7 @@ void __init context_tracking_init(void)
 DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 		.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
 		.dynticks_nesting = 1,
-		.dynticks = ATOMIC_INIT(1),
+		.state = ATOMIC_INIT(RCU_DYNTICKS_IDX),
 };
 EXPORT_SYMBOL_GPL(context_tracking);
 
@@ -548,7 +550,7 @@ EXPORT_SYMBOL_GPL(context_tracking);
 void ct_idle_enter(void)
 {
 	lockdep_assert_irqs_disabled();
-	rcu_eqs_enter(false);
+	ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE);
 }
 EXPORT_SYMBOL_GPL(ct_idle_enter);
 
@@ -566,7 +568,7 @@ void ct_idle_exit(void)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	rcu_eqs_exit(false);
+	ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
 	local_irq_restore(flags);
 
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 90a22dd2189d..98fac3d327c9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -272,9 +272,9 @@ void rcu_softirq_qs(void)
  */
 static void rcu_dynticks_eqs_online(void)
 {
-	if (ct_dynticks() & 0x1)
+	if (ct_dynticks() & RCU_DYNTICKS_IDX)
 		return;
-	rcu_dynticks_inc(1);
+	ct_state_inc(RCU_DYNTICKS_IDX);
 }
 
 /*
@@ -293,7 +293,7 @@ static int rcu_dynticks_snap(int cpu)
  */
 static bool rcu_dynticks_in_eqs(int snap)
 {
-	return !(snap & 0x1);
+	return !(snap & RCU_DYNTICKS_IDX);
 }
 
 /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
@@ -321,8 +321,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
 	int snap;
 
 	// If not quiescent, force back to earlier extended quiescent state.
-	snap = ct_dynticks_cpu(cpu) & ~0x1;
-
+	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;
 	smp_rmb(); // Order ->dynticks and *vp reads.
 	if (READ_ONCE(*vp))
 		return false;  // Non-zero, so report failure;
@@ -348,9 +347,9 @@ notrace void rcu_momentary_dyntick_idle(void)
 	int seq;
 
 	raw_cpu_write(rcu_data.rcu_need_heavy_qs, false);
-	seq = rcu_dynticks_inc(2);
+	seq = ct_state_inc(2 * RCU_DYNTICKS_IDX);
 	/* It is illegal to call this from idle state. */
-	WARN_ON_ONCE(!(seq & 0x1));
+	WARN_ON_ONCE(!(seq & RCU_DYNTICKS_IDX));
 	rcu_preempt_deferred_qs(current);
 }
 EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 9bf5cc79d5eb..1ac48c804006 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -459,7 +459,7 @@ static void print_cpu_stall_info(int cpu)
 			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
 				"!."[!delta],
 	       ticks_value, ticks_title,
-	       rcu_dynticks_snap(cpu) & 0xfff,
+	       (rcu_dynticks_snap(cpu) >> RCU_DYNTICKS_SHIFT) & 0xfff,
 	       ct->dynticks_nesting, ct->dynticks_nmi_nesting,
 	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
 	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 19/19] context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (17 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states Frederic Weisbecker
@ 2022-03-02 15:48 ` Frederic Weisbecker
  2022-03-08 16:15   ` nicolas saenz julienne
  2022-03-11 11:37 ` [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking nicolas saenz julienne
  19 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-02 15:48 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes

Since a CPU may save the state of the context tracking using
exception_enter() before calling into schedule(), we need all CPUs in
the system to track user <-> kernel transitions and not just those that
really need it (nohz_full CPUs).

The following illustrates the issue that could otherwise happen:

     CPU 0 (not tracking)                       CPU 1 (tracking)
     -------------------                       --------------------
     // we are past user_enter()
     // but this CPU is always in
     // CONTEXT_KERNEL
     // because it doesn't track user <-> kernel

     ctx = exception_enter(); //ctx == CONTEXT_KERNEL
     schedule();
     ===========================================>
                                               return from schedule();
                                               exception_exit(ctx);
                                               //go to user in CONTEXT_KERNEL

However CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK doesn't play those
games because schedule() can't be called between user_enter() and
user_exit() under such a config. In this situation we can spare context
tracking on the CPUs that don't need it.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
---
 kernel/context_tracking.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 87e7b748791c..b1934264f77f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -374,7 +374,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 * when the CPU runs in userspace.
 			 */
 			ct_kernel_exit(true, RCU_DYNTICKS_IDX + state);
-		} else {
+		} else if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)) {
 			/*
 			 * Even if context tracking is disabled on this CPU, because it's outside
 			 * the full dynticks mask for example, we still have to keep track of the
@@ -384,7 +384,8 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 * handler and then migrate to another CPU, that new CPU must know where
 			 * the exception returns by the time we call exception_exit().
 			 * This information can only be provided by the previous CPU when it called
-			 * exception_enter().
+			 * exception_enter(). CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK is
+			 * excused though because it doesn't use exception_enter().
 			 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
 			 * is false because we know that CPU is not tickless.
 			 */
@@ -460,7 +461,7 @@ void noinstr __ct_user_exit(enum ctx_state state)
 				trace_user_exit(0);
 				instrumentation_end();
 			}
-		} else {
+		} else if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)) {
 			atomic_sub(state, &ct->state);
 		}
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/19] context_tracking: Rename context_tracking_user_enter/exit() to user_enter/exit_callable()
  2022-03-02 15:47 ` [PATCH 02/19] context_tracking: Rename context_tracking_user_enter/exit() to user_enter/exit_callable() Frederic Weisbecker
@ 2022-03-05 13:59   ` Peter Zijlstra
  2022-03-09 20:53     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-05 13:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:53PM +0100, Frederic Weisbecker wrote:
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index ad2a973393a6..83e050675b23 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -125,11 +125,16 @@ void context_tracking_enter(enum ctx_state state)
>  NOKPROBE_SYMBOL(context_tracking_enter);
>  EXPORT_SYMBOL_GPL(context_tracking_enter);
>  
> -void context_tracking_user_enter(void)
> +/**
> + * user_enter_callable() - Unfortunate ASM callable version of user_enter() for
> + * 			   archs that didn't manage to check the context tracking
> + * 			   static key from low level code.
> + */
> +void user_enter_callable(void)
>  {
>  	user_enter();
>  }
> -NOKPROBE_SYMBOL(context_tracking_user_enter);
> +NOKPROBE_SYMBOL(user_enter_callable);
>  
>  /**
>   * __ct_user_exit - Inform the context tracking that the CPU is
> @@ -182,11 +187,16 @@ void context_tracking_exit(enum ctx_state state)
>  NOKPROBE_SYMBOL(context_tracking_exit);
>  EXPORT_SYMBOL_GPL(context_tracking_exit);
>  
> -void context_tracking_user_exit(void)
> +/**
> + * user_exit_callable() - Unfortunate ASM callable version of user_exit() for
> + * 			  archs that didn't manage to check the context tracking
> + * 			  static key from low level code.
> + */
> +void user_exit_callable(void)
>  {
>  	user_exit();
>  }
> -NOKPROBE_SYMBOL(context_tracking_user_exit);
> +NOKPROBE_SYMBOL(user_exit_callable);

I'm thinking all this wants to be noinstr instead of NOKPROBE_SYMBOL..
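
(For illustration, a sketch of that suggestion; this is an assumption
about the intent, not part of the posted series:)

	/* Annotate the function itself rather than blacklisting it after
	 * the fact: noinstr places it in .noinstr.text, which the kprobes
	 * blacklist already covers, so NOKPROBE_SYMBOL() goes away.
	 */
	noinstr void user_enter_callable(void)
	{
		user_enter();
	}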

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/19] context_tracking: Rename context_tracking_enter/exit() to ct_user_enter/exit()
  2022-03-02 15:47 ` [PATCH 03/19] context_tracking: Rename context_tracking_enter/exit() to ct_user_enter/exit() Frederic Weisbecker
@ 2022-03-05 14:02   ` Peter Zijlstra
  2022-03-09 21:21     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-05 14:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:54PM +0100, Frederic Weisbecker wrote:
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 83e050675b23..e8e58c10f135 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -103,7 +103,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
>  }
>  EXPORT_SYMBOL_GPL(__ct_user_enter);
>  
> -void context_tracking_enter(enum ctx_state state)
> +void ct_user_enter(enum ctx_state state)
>  {
>  	unsigned long flags;
>  
> @@ -122,8 +122,8 @@ void context_tracking_enter(enum ctx_state state)
>  	__ct_user_enter(state);
>  	local_irq_restore(flags);
>  }
> -NOKPROBE_SYMBOL(context_tracking_enter);
> -EXPORT_SYMBOL_GPL(context_tracking_enter);
> +NOKPROBE_SYMBOL(ct_user_enter);
> +EXPORT_SYMBOL_GPL(ct_user_enter);
>  
>  /**
>   * user_enter_callable() - Unfortunate ASM callable version of user_enter() for
> @@ -173,7 +173,7 @@ void noinstr __ct_user_exit(enum ctx_state state)
>  }
>  EXPORT_SYMBOL_GPL(__ct_user_exit);
>  
> -void context_tracking_exit(enum ctx_state state)
> +void ct_user_exit(enum ctx_state state)
>  {
>  	unsigned long flags;
>  
> @@ -184,8 +184,8 @@ void context_tracking_exit(enum ctx_state state)
>  	__ct_user_exit(state);
>  	local_irq_restore(flags);
>  }
> -NOKPROBE_SYMBOL(context_tracking_exit);
> -EXPORT_SYMBOL_GPL(context_tracking_exit);
> +NOKPROBE_SYMBOL(ct_user_exit);
> +EXPORT_SYMBOL_GPL(ct_user_exit);
>  
>  /**
>   * user_exit_callable() - Unfortunate ASM callable version of user_exit() for

Why is it NOKPROBE but not notrace? Also, local_irq_*() include explicit
tracepoints.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/19] context_tracking: Rename context_tracking_cpu_set() to context_tracking_cpu_track_user()
  2022-03-02 15:47 ` [PATCH 04/19] context_tracking: Rename context_tracking_cpu_set() to context_tracking_cpu_track_user() Frederic Weisbecker
@ 2022-03-05 14:03   ` Peter Zijlstra
  2022-03-09 21:11     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-05 14:03 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:55PM +0100, Frederic Weisbecker wrote:
> context_tracking_cpu_set() is called in order to tell a CPU to track
> user/kernel transitions. Since context tracking is going to expand to
> also track transitions from/to idle/IRQ/NMIs, the scope
> of this function name becomes too broad and needs to be made more
> specific.

The previous patches did: s/context_tracking_/ct_/ on these names, and
this one makes it longer still?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/19] context_tracking: Take idle eqs entrypoints over RCU
  2022-03-02 15:47 ` [PATCH 06/19] context_tracking: Take idle eqs entrypoints over RCU Frederic Weisbecker
@ 2022-03-05 14:05   ` Peter Zijlstra
  2022-03-09 21:12     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-05 14:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:57PM +0100, Frederic Weisbecker wrote:
> The RCU dynticks counter is going to be merged into the context tracking
> subsystem. Start with moving the idle extended quiescent states
> entrypoints to RCU. For now those are dumb redirection to existing RCU
> calls.

s/to/from/ right? You're taking them away from RCU and moving them to
ct.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 09/19] rcu/context-tracking: Remove rcu_irq_enter/exit()
  2022-03-02 15:48 ` [PATCH 09/19] rcu/context-tracking: Remove rcu_irq_enter/exit() Frederic Weisbecker
@ 2022-03-05 14:16   ` Peter Zijlstra
  2022-03-09 22:25     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-05 14:16 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:00PM +0100, Frederic Weisbecker wrote:
>  void ct_irq_enter_irqson(void)
>  {
> -	rcu_irq_enter_irqson();
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	ct_irq_enter();
> +	local_irq_restore(flags);
>  }
>  
>  void ct_irq_exit_irqson(void)
>  {
> -	rcu_irq_exit_irqson();
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	ct_irq_exit();
> +	local_irq_restore(flags);
>  }

I know you're just copying code around, but this is broken by
construction :/

On the irq_enter site, local_irq_save() will hit a tracepoint, which
requires RCU, which will only be made available by ct_irq_enter().
Same in reverse for the exit case.
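
(One possible shape of a fix, sketched as an assumption: it relies on
the raw_local_irq_*() variants, which skip the irqflags tracepoints,
being acceptable in this path:)

	noinstr void ct_irq_enter_irqson(void)
	{
		unsigned long flags;

		/* Raw variant: no tracepoint before RCU is watching */
		raw_local_irq_save(flags);
		ct_irq_enter();
		raw_local_irq_restore(flags);
	}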

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 19/19] context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking
  2022-03-02 15:48 ` [PATCH 19/19] context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking Frederic Weisbecker
@ 2022-03-08 16:15   ` nicolas saenz julienne
  2022-03-11 15:16     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: nicolas saenz julienne @ 2022-03-08 16:15 UTC (permalink / raw)
  To: Frederic Weisbecker, LKML
  Cc: Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

Hi Frederic,

On Wed, 2022-03-02 at 16:48 +0100, Frederic Weisbecker wrote:
> Since a CPU may save the state of the context tracking using
> exception_enter() before calling into schedule(), we need all CPUs in
> the system to track user <-> kernel transitions and not just those that
> really need it (nohz_full CPUs).
> 
> The following illustrates the issue that could otherwise happen:
> 
>      CPU 0 (not tracking)                       CPU 1 (tracking)
>      -------------------                       --------------------
>      // we are past user_enter()
>      // but this CPU is always in
>      // CONTEXT_KERNEL
>      // because it doesn't track user <-> kernel
> 
>      ctx = exception_enter(); //ctx == CONTEXT_KERNEL
>      schedule();
>      ===========================================>
>                                                return from schedule();
>                                                exception_exit(ctx);
>                                                //go to user in CONTEXT_KERNEL
> 
> However CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK doesn't play those
> games because schedule() can't be called between user_enter() and
> user_exit() under such a config. In this situation we can spare context
> tracking on the CPUs that don't need it.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  kernel/context_tracking.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 87e7b748791c..b1934264f77f 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -374,7 +374,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
>  			 * when the CPU runs in userspace.
>  			 */
>  			ct_kernel_exit(true, RCU_DYNTICKS_IDX + state);
> -		} else {
> +		} else if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)) {

User entry code assumes that state will be kept on all CPUs as long as
context tracking is enabled. See kernel/entry/common.c:

   static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
   {
           arch_check_user_regs(regs);
           lockdep_hardirqs_off(CALLER_ADDR0);
           
           CT_WARN_ON(ct_state() != CONTEXT_USER); <-- NOT HAPPY ABOUT THIS CHANGE
           user_exit_irqoff();
           
           instrumentation_begin();
           trace_hardirqs_off_finish();
           instrumentation_end();
   }

Regards,
Nicolas

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 15/19] rcu/context-tracking: Remove unused and/or unecessary middle functions
  2022-03-02 15:48 ` [PATCH 15/19] rcu/context-tracking: Remove unused and/or unecessary middle functions Frederic Weisbecker
@ 2022-03-09 16:40   ` nicolas saenz julienne
  2022-03-11 15:19     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: nicolas saenz julienne @ 2022-03-09 16:40 UTC (permalink / raw)
  To: Frederic Weisbecker, LKML
  Cc: Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, 2022-03-02 at 16:48 +0100, Frederic Weisbecker wrote:
> Some eqs functions are now only used internally by context tracking, so
> their public declarations can be removed.
> 
> Also middle functions such as rcu_user_*() and rcu_idle_*()
> which now directly call to rcu_eqs_enter() and rcu_eqs_exit() can be
> wiped out as well.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao<liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---

You missed the rcu_user_{enter,exit} declarations in rcupdate.h.

There are also comments referring to them in kernel/context_tracking.c and
Documentation/RCU/stallwarn.rst.

Regards,
Nicolas

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/19] context_tracking: Convert state to atomic_t
  2022-03-02 15:48 ` [PATCH 16/19] context_tracking: Convert state to atomic_t Frederic Weisbecker
@ 2022-03-09 17:17   ` nicolas saenz julienne
  2022-03-11 15:24     ` Frederic Weisbecker
  2022-03-12 22:54   ` Peter Zijlstra
  1 sibling, 1 reply; 57+ messages in thread
From: nicolas saenz julienne @ 2022-03-09 17:17 UTC (permalink / raw)
  To: Frederic Weisbecker, LKML
  Cc: Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, 2022-03-02 at 16:48 +0100, Frederic Weisbecker wrote:
> Context tracking's state and dynticks counter are going to be merged
> in a single field so that both updates can happen atomically and at the
> same time. Prepare for that with converting the state into an atomic_t.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao<liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  static __always_inline bool context_tracking_in_user(void)
>  {
> -	return __this_cpu_read(context_tracking.state) == CONTEXT_USER;
> +	return __ct_state() == CONTEXT_USER;
>  }

I was wondering whether it'd make more sense to use ct_state() here for extra
safety against preemption, but it turns out the function isn't used at all.

I figure it'd be better to remove it altogether and leave ct_state() as the
go-to function for this sort of check.

>  #else
>  static inline bool context_tracking_in_user(void) { return false; }
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index de247e758767..69db43548768 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -337,6 +337,7 @@ static __always_inline void context_tracking_recursion_exit(void)
>   */
>  void noinstr __ct_user_enter(enum ctx_state state)
>  {
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);

I wonder if there is any value to having __ct_state() take 'struct
context_tracking *ct' as an argument to avoid a redundant this_cpu_ptr()...

>  	lockdep_assert_irqs_disabled();
>  
>  	/* Kernel threads aren't supposed to go to userspace */
> @@ -345,8 +346,8 @@ void noinstr __ct_user_enter(enum ctx_state state)
>  	if (!context_tracking_recursion_enter())
>  		return;
>  
> -	if ( __this_cpu_read(context_tracking.state) != state) {
> -		if (__this_cpu_read(context_tracking.active)) {
> +	if (__ct_state() != state) {

...here (and in __ct_user_exit()).
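
Something like the below is what I had in mind, as an untested sketch; it
assumes __ct_state() simply reads the atomic_t that this patch introduces:

   /* Hypothetical variant taking the already-computed per-CPU pointer */
   static __always_inline int __ct_state(struct context_tracking *ct)
   {
   	return atomic_read(&ct->state);
   }

so that __ct_user_enter()/__ct_user_exit() can reuse the pointer they
already have:

   	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
   	...
   	if (__ct_state(ct) != state) {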

Regards,
Nicolas

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/19] context_tracking: Rename context_tracking_user_enter/exit() to user_enter/exit_callable()
  2022-03-05 13:59   ` Peter Zijlstra
@ 2022-03-09 20:53     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-09 20:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Sat, Mar 05, 2022 at 02:59:40PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 02, 2022 at 04:47:53PM +0100, Frederic Weisbecker wrote:
> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index ad2a973393a6..83e050675b23 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -125,11 +125,16 @@ void context_tracking_enter(enum ctx_state state)
> >  NOKPROBE_SYMBOL(context_tracking_enter);
> >  EXPORT_SYMBOL_GPL(context_tracking_enter);
> >  
> > -void context_tracking_user_enter(void)
> > +/**
> > + * user_enter_callable() - Unfortunate ASM callable version of user_enter() for
> > + * 			   archs that didn't manage to check the context tracking
> > + * 			   static key from low level code.
> > + */
> > +void user_enter_callable(void)
> >  {
> >  	user_enter();
> >  }
> > -NOKPROBE_SYMBOL(context_tracking_user_enter);
> > +NOKPROBE_SYMBOL(user_enter_callable);
> >  
> >  /**
> >   * __ct_user_exit - Inform the context tracking that the CPU is
> > @@ -182,11 +187,16 @@ void context_tracking_exit(enum ctx_state state)
> >  NOKPROBE_SYMBOL(context_tracking_exit);
> >  EXPORT_SYMBOL_GPL(context_tracking_exit);
> >  
> > -void context_tracking_user_exit(void)
> > +/**
> > + * user_exit_callable() - Unfortunate ASM callable version of user_exit() for
> > + * 			  archs that didn't manage to check the context tracking
> > + * 			  static key from low level code.
> > + */
> > +void user_exit_callable(void)
> >  {
> >  	user_exit();
> >  }
> > -NOKPROBE_SYMBOL(context_tracking_user_exit);
> > +NOKPROBE_SYMBOL(user_exit_callable);
> 
> I'm thinking all this wants to be noinstr instead of NOKPROBE_SYMBOL..

Good point, I'll fix that ahead.
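
Something along these lines then, as a sketch only; it assumes that
everything user_enter()/user_exit() call down to is noinstr-safe, which I
still need to check:

   void noinstr user_enter_callable(void)
   {
   	user_enter();
   }

   void noinstr user_exit_callable(void)
   {
   	user_exit();
   }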

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/19] context_tracking: Rename context_tracking_cpu_set() to context_tracking_cpu_track_user()
  2022-03-05 14:03   ` Peter Zijlstra
@ 2022-03-09 21:11     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-09 21:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Sat, Mar 05, 2022 at 03:03:21PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 02, 2022 at 04:47:55PM +0100, Frederic Weisbecker wrote:
> > context_tracking_cpu_set() is called in order to tell a CPU to track
> > user/kernel transitions. Since context tracking is going to expand in
> > to also track transitions from/to idle/IRQ/NMIs, the scope
> > of this function name becomes too broad and needs to be made more
> > specific.
> 
> The previous patches did: s/context_tracking_/ct_/ on these names, and
> this one makes it longer still?

Right, in fact I added the renames afterward and forgot to update that patch.
Will do.

Thanks.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/19] context_tracking: Take idle eqs entrypoints over RCU
  2022-03-05 14:05   ` Peter Zijlstra
@ 2022-03-09 21:12     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-09 21:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Sat, Mar 05, 2022 at 03:05:26PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 02, 2022 at 04:47:57PM +0100, Frederic Weisbecker wrote:
> > The RCU dynticks counter is going to be merged into the context tracking
> > subsystem. Start with moving the idle extended quiescent states
> > entrypoints to RCU. For now those are dumb redirection to existing RCU
> > calls.
> 
> s/to/from/ right? You're taking them away from RCU and moving them to
> ct.

Right!

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/19] context_tracking: Rename context_tracking_enter/exit() to ct_user_enter/exit()
  2022-03-05 14:02   ` Peter Zijlstra
@ 2022-03-09 21:21     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-09 21:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Sat, Mar 05, 2022 at 03:02:33PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 02, 2022 at 04:47:54PM +0100, Frederic Weisbecker wrote:
> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index 83e050675b23..e8e58c10f135 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -103,7 +103,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
> >  }
> >  EXPORT_SYMBOL_GPL(__ct_user_enter);
> >  
> > -void context_tracking_enter(enum ctx_state state)
> > +void ct_user_enter(enum ctx_state state)
> >  {
> >  	unsigned long flags;
> >  
> > @@ -122,8 +122,8 @@ void context_tracking_enter(enum ctx_state state)
> >  	__ct_user_enter(state);
> >  	local_irq_restore(flags);
> >  }
> > -NOKPROBE_SYMBOL(context_tracking_enter);
> > -EXPORT_SYMBOL_GPL(context_tracking_enter);
> > +NOKPROBE_SYMBOL(ct_user_enter);
> > +EXPORT_SYMBOL_GPL(ct_user_enter);
> >  
> >  /**
> >   * user_enter_callable() - Unfortunate ASM callable version of user_enter() for
> > @@ -173,7 +173,7 @@ void noinstr __ct_user_exit(enum ctx_state state)
> >  }
> >  EXPORT_SYMBOL_GPL(__ct_user_exit);
> >  
> > -void context_tracking_exit(enum ctx_state state)
> > +void ct_user_exit(enum ctx_state state)
> >  {
> >  	unsigned long flags;
> >  
> > @@ -184,8 +184,8 @@ void context_tracking_exit(enum ctx_state state)
> >  	__ct_user_exit(state);
> >  	local_irq_restore(flags);
> >  }
> > -NOKPROBE_SYMBOL(context_tracking_exit);
> > -EXPORT_SYMBOL_GPL(context_tracking_exit);
> > +NOKPROBE_SYMBOL(ct_user_exit);
> > +EXPORT_SYMBOL_GPL(ct_user_exit);
> >  
> >  /**
> >   * user_exit_callable() - Unfortunate ASM callable version of user_exit() for
> 
> Why is it NOKPROBE but not notrace, also local_irq_*() include explicit
> tracepoints.

Again, stuff I need to fix ahead. Thanks for catching that!

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 09/19] rcu/context-tracking: Remove rcu_irq_enter/exit()
  2022-03-05 14:16   ` Peter Zijlstra
@ 2022-03-09 22:25     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-09 22:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Sat, Mar 05, 2022 at 03:16:03PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 02, 2022 at 04:48:00PM +0100, Frederic Weisbecker wrote:
> >  void ct_irq_enter_irqson(void)
> >  {
> > -	rcu_irq_enter_irqson();
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> > +	ct_irq_enter();
> > +	local_irq_restore(flags);
> >  }
> >  
> >  void ct_irq_exit_irqson(void)
> >  {
> > -	rcu_irq_exit_irqson();
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> > +	ct_irq_exit();
> > +	local_irq_restore(flags);
> >  }
> 
> I know you're just copying code around, but this is broken per
> construction :/
> 
> On the irq_enter site, local_irq_save() will hit a tracepoint, which
> requires RCU, which will only be made available by the ct_irq_enter().
> Same in reverse for the exit case.

Ouch. And playing a game similar to the one in default_idle_call() is going to
be trickier because rcu_irq_enter() may or may not be exiting an RCU-off mode.
But it's feasible. Again, I'll fix it ahead.
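
One possible way out, as a sketch only: use the raw irqflags accessors,
which skip the irqflags tracepoints, so nothing traces while RCU isn't
watching. That loses the lockdep/irqflags tracking though, which would
need more thought:

   void ct_irq_enter_irqson(void)
   {
   	unsigned long flags;

   	/* Raw flavor: no tracepoint before RCU starts watching */
   	raw_local_irq_save(flags);
   	ct_irq_enter();
   	raw_local_irq_restore(flags);
   }

and the mirror image for ct_irq_exit_irqson().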

Thanks!

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 01/19] context_tracking: Rename __context_tracking_enter/exit() to __ct_user_enter/exit()
  2022-03-02 15:47 ` [PATCH 01/19] context_tracking: Rename __context_tracking_enter/exit() to __ct_user_enter/exit() Frederic Weisbecker
@ 2022-03-10 19:27   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 19:27 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:52PM +0100, Frederic Weisbecker wrote:
> The context tracking namespace is going to expand and some new functions
> will require even longer names. Start shrinking the context_tracking
> prefix to "ct" as is already the case for some existing macros, this
> will make the introduction of new functions easier.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao<liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking.h | 12 ++++++------
>  kernel/context_tracking.c        | 20 ++++++++++----------
>  2 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 7a14807c9d1a..773035124bad 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -14,8 +14,8 @@
>  extern void context_tracking_cpu_set(int cpu);
>  
>  /* Called with interrupts disabled.  */
> -extern void __context_tracking_enter(enum ctx_state state);
> -extern void __context_tracking_exit(enum ctx_state state);
> +extern void __ct_user_enter(enum ctx_state state);
> +extern void __ct_user_exit(enum ctx_state state);
>  
>  extern void context_tracking_enter(enum ctx_state state);
>  extern void context_tracking_exit(enum ctx_state state);
> @@ -38,13 +38,13 @@ static inline void user_exit(void)
>  static __always_inline void user_enter_irqoff(void)
>  {
>  	if (context_tracking_enabled())
> -		__context_tracking_enter(CONTEXT_USER);
> +		__ct_user_enter(CONTEXT_USER);
>  
>  }
>  static __always_inline void user_exit_irqoff(void)
>  {
>  	if (context_tracking_enabled())
> -		__context_tracking_exit(CONTEXT_USER);
> +		__ct_user_exit(CONTEXT_USER);
>  }
>  
>  static inline enum ctx_state exception_enter(void)
> @@ -74,7 +74,7 @@ static inline void exception_exit(enum ctx_state prev_ctx)
>  static __always_inline bool context_tracking_guest_enter(void)
>  {
>  	if (context_tracking_enabled())
> -		__context_tracking_enter(CONTEXT_GUEST);
> +		__ct_user_enter(CONTEXT_GUEST);
>  
>  	return context_tracking_enabled_this_cpu();
>  }
> @@ -82,7 +82,7 @@ static __always_inline bool context_tracking_guest_enter(void)
>  static __always_inline void context_tracking_guest_exit(void)
>  {
>  	if (context_tracking_enabled())
> -		__context_tracking_exit(CONTEXT_GUEST);
> +		__ct_user_exit(CONTEXT_GUEST);
>  }
>  
>  /**
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 36a98c48aedc..ad2a973393a6 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -51,15 +51,15 @@ static __always_inline void context_tracking_recursion_exit(void)
>  }
>  
>  /**
> - * context_tracking_enter - Inform the context tracking that the CPU is going
> - *                          enter user or guest space mode.
> + * __ct_user_enter - Inform the context tracking that the CPU is going
> + *		     to enter user or guest space mode.
>   *
>   * This function must be called right before we switch from the kernel
>   * to user or guest space, when it's guaranteed the remaining kernel
>   * instructions to execute won't use any RCU read side critical section
>   * because this function sets RCU in extended quiescent state.
>   */
> -void noinstr __context_tracking_enter(enum ctx_state state)
> +void noinstr __ct_user_enter(enum ctx_state state)
>  {
>  	/* Kernel threads aren't supposed to go to userspace */
>  	WARN_ON_ONCE(!current->mm);
> @@ -101,7 +101,7 @@ void noinstr __context_tracking_enter(enum ctx_state state)
>  	}
>  	context_tracking_recursion_exit();
>  }
> -EXPORT_SYMBOL_GPL(__context_tracking_enter);
> +EXPORT_SYMBOL_GPL(__ct_user_enter);
>  
>  void context_tracking_enter(enum ctx_state state)
>  {
> @@ -119,7 +119,7 @@ void context_tracking_enter(enum ctx_state state)
>  		return;
>  
>  	local_irq_save(flags);
> -	__context_tracking_enter(state);
> +	__ct_user_enter(state);
>  	local_irq_restore(flags);
>  }
>  NOKPROBE_SYMBOL(context_tracking_enter);
> @@ -132,8 +132,8 @@ void context_tracking_user_enter(void)
>  NOKPROBE_SYMBOL(context_tracking_user_enter);
>  
>  /**
> - * context_tracking_exit - Inform the context tracking that the CPU is
> - *                         exiting user or guest mode and entering the kernel.
> + * __ct_user_exit - Inform the context tracking that the CPU is
> + * 		    exiting user or guest mode and entering the kernel.
>   *
>   * This function must be called after we entered the kernel from user or
>   * guest space before any use of RCU read side critical section. This
> @@ -143,7 +143,7 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
>   * This call supports re-entrancy. This way it can be called from any exception
>   * handler without needing to know if we came from userspace or not.
>   */
> -void noinstr __context_tracking_exit(enum ctx_state state)
> +void noinstr __ct_user_exit(enum ctx_state state)
>  {
>  	if (!context_tracking_recursion_enter())
>  		return;
> @@ -166,7 +166,7 @@ void noinstr __context_tracking_exit(enum ctx_state state)
>  	}
>  	context_tracking_recursion_exit();
>  }
> -EXPORT_SYMBOL_GPL(__context_tracking_exit);
> +EXPORT_SYMBOL_GPL(__ct_user_exit);
>  
>  void context_tracking_exit(enum ctx_state state)
>  {
> @@ -176,7 +176,7 @@ void context_tracking_exit(enum ctx_state state)
>  		return;
>  
>  	local_irq_save(flags);
> -	__context_tracking_exit(state);
> +	__ct_user_exit(state);
>  	local_irq_restore(flags);
>  }
>  NOKPROBE_SYMBOL(context_tracking_exit);
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 05/19] context_tracking: Split user tracking Kconfig
  2022-03-02 15:47 ` [PATCH 05/19] context_tracking: Split user tracking Kconfig Frederic Weisbecker
@ 2022-03-10 19:43   ` Paul E. McKenney
  2022-03-11 15:49     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 19:43 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:56PM +0100, Frederic Weisbecker wrote:
> Context tracking is going to be used not only to track user transitions
> but also idle/IRQs/NMIs. The user tracking part will then become a
> seperate feature. Prepare Kconfig for that.

s/seperate/separate/ # nit

> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao<liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  .../time/context-tracking/arch-support.txt    |  6 ++---
>  arch/Kconfig                                  |  4 ++--
>  arch/arm/Kconfig                              |  2 +-
>  arch/arm/kernel/entry-common.S                |  4 ++--
>  arch/arm/kernel/entry-header.S                |  4 ++--
>  arch/arm64/Kconfig                            |  2 +-
>  arch/csky/Kconfig                             |  2 +-
>  arch/csky/kernel/entry.S                      |  4 ++--
>  arch/mips/Kconfig                             |  2 +-
>  arch/powerpc/Kconfig                          |  2 +-
>  arch/powerpc/include/asm/context_tracking.h   |  2 +-
>  arch/riscv/Kconfig                            |  2 +-
>  arch/riscv/kernel/entry.S                     |  6 ++---
>  arch/sparc/Kconfig                            |  2 +-
>  arch/sparc/kernel/rtrap_64.S                  |  2 +-
>  arch/x86/Kconfig                              |  4 ++--
>  include/linux/context_tracking.h              | 12 +++++-----
>  include/linux/context_tracking_state.h        |  4 ++--
>  init/Kconfig                                  |  4 ++--
>  kernel/context_tracking.c                     |  6 ++++-
>  kernel/sched/core.c                           |  2 +-
>  kernel/time/Kconfig                           | 22 +++++++++++--------
>  22 files changed, 54 insertions(+), 46 deletions(-)
> 
> diff --git a/Documentation/features/time/context-tracking/arch-support.txt b/Documentation/features/time/context-tracking/arch-support.txt
> index 4ed116c2ec39..0696fd08429e 100644
> --- a/Documentation/features/time/context-tracking/arch-support.txt
> +++ b/Documentation/features/time/context-tracking/arch-support.txt
> @@ -1,7 +1,7 @@
>  #
> -# Feature name:          context-tracking
> -#         Kconfig:       HAVE_CONTEXT_TRACKING
> -#         description:   arch supports context tracking for NO_HZ_FULL
> +# Feature name:          user-context-tracking
> +#         Kconfig:       HAVE_CONTEXT_TRACKING_USER
> +#         description:   arch supports user context tracking for NO_HZ_FULL
>  #
>      -----------------------
>      |         arch |status|
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 678a80713b21..1a3b79cfc9e3 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -762,7 +762,7 @@ config HAVE_ARCH_WITHIN_STACK_FRAMES
>  	  and similar) by implementing an inline arch_within_stack_frames(),
>  	  which is used by CONFIG_HARDENED_USERCOPY.
>  
> -config HAVE_CONTEXT_TRACKING
> +config HAVE_CONTEXT_TRACKING_USER

Just checking...  This means that only some configs will see userland
execution as being different than kernel execution, correct?  (Which
is the case today, to be fair.)

							Thanx, Paul

>  	bool
>  	help
>  	  Provide kernel/user boundaries probes necessary for subsystems
> @@ -773,7 +773,7 @@ config HAVE_CONTEXT_TRACKING
>  	  protected inside rcu_irq_enter/rcu_irq_exit() but preemption or signal
>  	  handling on irq exit still need to be protected.
>  
> -config HAVE_CONTEXT_TRACKING_OFFSTACK
> +config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
>  	bool
>  	help
>  	  Architecture neither relies on exception_enter()/exception_exit()
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index fabe39169b12..2c5688f20421 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -81,7 +81,7 @@ config ARM
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE if ARM_LPAE
>  	select HAVE_ARM_SMCCC if CPU_V7
>  	select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32
> -	select HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_C_RECORDMCOUNT
>  	select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL
>  	select HAVE_DMA_CONTIGUOUS if MMU
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index ac86c34682bb..5be34b7fe41e 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -26,7 +26,7 @@
>  #include "entry-header.S"
>  
>  saved_psr	.req	r8
> -#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING)
> +#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING_USER)
>  saved_pc	.req	r9
>  #define TRACE(x...) x
>  #else
> @@ -36,7 +36,7 @@ saved_pc	.req	lr
>  
>  	.section .entry.text,"ax",%progbits
>  	.align	5
> -#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING) || \
> +#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING_USER) || \
>  	IS_ENABLED(CONFIG_DEBUG_RSEQ))
>  /*
>   * This is the fast syscall return path.  We do as little as possible here,
> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> index 3af2a521e1d6..cd1ce0a9c652 100644
> --- a/arch/arm/kernel/entry-header.S
> +++ b/arch/arm/kernel/entry-header.S
> @@ -361,7 +361,7 @@
>   * between user and kernel mode.
>   */
>  	.macro ct_user_exit, save = 1
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	.if	\save
>  	stmdb   sp!, {r0-r3, ip, lr}
>  	bl	user_exit_callable
> @@ -373,7 +373,7 @@
>  	.endm
>  
>  	.macro ct_user_enter, save = 1
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	.if	\save
>  	stmdb   sp!, {r0-r3, ip, lr}
>  	bl	user_enter_callable
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6978140edfa4..96e75d7fa0a3 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -169,7 +169,7 @@ config ARM64
>  	select HAVE_C_RECORDMCOUNT
>  	select HAVE_CMPXCHG_DOUBLE
>  	select HAVE_CMPXCHG_LOCAL
> -	select HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_DEBUG_KMEMLEAK
>  	select HAVE_DMA_CONTIGUOUS
>  	select HAVE_DYNAMIC_FTRACE
> diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
> index 132f43f12dd8..c94cc907b828 100644
> --- a/arch/csky/Kconfig
> +++ b/arch/csky/Kconfig
> @@ -42,7 +42,7 @@ config CSKY
>  	select HAVE_ARCH_AUDITSYSCALL
>  	select HAVE_ARCH_MMAP_RND_BITS
>  	select HAVE_ARCH_SECCOMP_FILTER
> -	select HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_VIRT_CPU_ACCOUNTING_GEN
>  	select HAVE_DEBUG_BUGVERBOSE
>  	select HAVE_DEBUG_KMEMLEAK
> diff --git a/arch/csky/kernel/entry.S b/arch/csky/kernel/entry.S
> index bc734d17c16f..547b4cd1b24b 100644
> --- a/arch/csky/kernel/entry.S
> +++ b/arch/csky/kernel/entry.S
> @@ -19,7 +19,7 @@
>  .endm
>  
>  .macro	context_tracking
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	mfcr	a0, epsr
>  	btsti	a0, 31
>  	bt	1f
> @@ -159,7 +159,7 @@ ret_from_exception:
>  	and	r10, r9
>  	cmpnei	r10, 0
>  	bt	exit_work
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	jbsr	user_enter_callable
>  #endif
>  1:
> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
> index 058446f01487..efcab39667ea 100644
> --- a/arch/mips/Kconfig
> +++ b/arch/mips/Kconfig
> @@ -55,7 +55,7 @@ config MIPS
>  	select HAVE_ARCH_TRACEHOOK
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE if CPU_SUPPORTS_HUGEPAGES
>  	select HAVE_ASM_MODVERSIONS
> -	select HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_TIF_NOHZ
>  	select HAVE_C_RECORDMCOUNT
>  	select HAVE_DEBUG_KMEMLEAK
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b779603978e1..9a889f919fed 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -192,7 +192,7 @@ config PPC
>  	select HAVE_ARCH_SECCOMP_FILTER
>  	select HAVE_ARCH_TRACEHOOK
>  	select HAVE_ASM_MODVERSIONS
> -	select HAVE_CONTEXT_TRACKING		if PPC64
> +	select HAVE_CONTEXT_TRACKING_USER		if PPC64
>  	select HAVE_C_RECORDMCOUNT
>  	select HAVE_DEBUG_KMEMLEAK
>  	select HAVE_DEBUG_STACKOVERFLOW
> diff --git a/arch/powerpc/include/asm/context_tracking.h b/arch/powerpc/include/asm/context_tracking.h
> index f2682b28b050..4b63931c49e0 100644
> --- a/arch/powerpc/include/asm/context_tracking.h
> +++ b/arch/powerpc/include/asm/context_tracking.h
> @@ -2,7 +2,7 @@
>  #ifndef _ASM_POWERPC_CONTEXT_TRACKING_H
>  #define _ASM_POWERPC_CONTEXT_TRACKING_H
>  
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  #define SCHEDULE_USER bl	schedule_user
>  #else
>  #define SCHEDULE_USER bl	schedule
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 5adcbd9b5e88..36953ec26294 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -80,7 +80,7 @@ config RISCV
>  	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
>  	select HAVE_ARCH_VMAP_STACK if MMU && 64BIT
>  	select HAVE_ASM_MODVERSIONS
> -	select HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_DEBUG_KMEMLEAK
>  	select HAVE_DMA_CONTIGUOUS if MMU
>  	select HAVE_EBPF_JIT if MMU
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index 5fbaa7be18a2..a773526fb3cc 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -111,7 +111,7 @@ _save_context:
>  	call trace_hardirqs_off
>  #endif
>  
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	/* If previous state is in user mode, call user_exit_callable(). */
>  	li   a0, SR_PP
>  	and a0, s1, a0
> @@ -176,7 +176,7 @@ handle_syscall:
>  	 */
>  	csrs CSR_STATUS, SR_IE
>  #endif
> -#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING)
> +#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING_USER)
>  	/* Recover a0 - a7 for system calls */
>  	REG_L a0, PT_A0(sp)
>  	REG_L a1, PT_A1(sp)
> @@ -251,7 +251,7 @@ resume_userspace:
>  	andi s1, s0, _TIF_WORK_MASK
>  	bnez s1, work_pending
>  
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	call user_enter_callable
>  #endif
>  
> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
> index 1cab1b284f1a..e736120f4333 100644
> --- a/arch/sparc/Kconfig
> +++ b/arch/sparc/Kconfig
> @@ -71,7 +71,7 @@ config SPARC64
>  	select HAVE_DYNAMIC_FTRACE
>  	select HAVE_FTRACE_MCOUNT_RECORD
>  	select HAVE_SYSCALL_TRACEPOINTS
> -	select HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_TIF_NOHZ
>  	select HAVE_DEBUG_KMEMLEAK
>  	select IOMMU_HELPER
> diff --git a/arch/sparc/kernel/rtrap_64.S b/arch/sparc/kernel/rtrap_64.S
> index c5fd4b450d9b..eef102765a7e 100644
> --- a/arch/sparc/kernel/rtrap_64.S
> +++ b/arch/sparc/kernel/rtrap_64.S
> @@ -15,7 +15,7 @@
>  #include <asm/visasm.h>
>  #include <asm/processor.h>
>  
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  # define SCHEDULE_USER schedule_user
>  #else
>  # define SCHEDULE_USER schedule
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index ebe8fc76949a..fbda20f6cf08 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -182,8 +182,8 @@ config X86
>  	select HAVE_ASM_MODVERSIONS
>  	select HAVE_CMPXCHG_DOUBLE
>  	select HAVE_CMPXCHG_LOCAL
> -	select HAVE_CONTEXT_TRACKING		if X86_64
> -	select HAVE_CONTEXT_TRACKING_OFFSTACK	if HAVE_CONTEXT_TRACKING
> +	select HAVE_CONTEXT_TRACKING_USER		if X86_64
> +	select HAVE_CONTEXT_TRACKING_USER_OFFSTACK	if HAVE_CONTEXT_TRACKING_USER
>  	select HAVE_C_RECORDMCOUNT
>  	select HAVE_OBJTOOL_MCOUNT		if STACK_VALIDATION
>  	select HAVE_DEBUG_KMEMLEAK
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 40badd62ad56..75738f20e111 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -10,7 +10,7 @@
>  #include <asm/ptrace.h>
>  
>  
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  extern void context_tracking_cpu_track_user(int cpu);
>  
>  /* Called with interrupts disabled.  */
> @@ -52,7 +52,7 @@ static inline enum ctx_state exception_enter(void)
>  {
>  	enum ctx_state prev_ctx;
>  
> -	if (IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK) ||
> +	if (IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK) ||
>  	    !context_tracking_enabled())
>  		return 0;
>  
> @@ -65,7 +65,7 @@ static inline enum ctx_state exception_enter(void)
>  
>  static inline void exception_exit(enum ctx_state prev_ctx)
>  {
> -	if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK) &&
> +	if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK) &&
>  	    context_tracking_enabled()) {
>  		if (prev_ctx != CONTEXT_KERNEL)
>  			ct_user_enter(prev_ctx);
> @@ -109,14 +109,14 @@ static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
>  static __always_inline bool context_tracking_guest_enter(void) { return false; }
>  static inline void context_tracking_guest_exit(void) { }
>  
> -#endif /* !CONFIG_CONTEXT_TRACKING */
> +#endif /* !CONFIG_CONTEXT_TRACKING_USER */
>  
>  #define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled() && (cond))
>  
> -#ifdef CONFIG_CONTEXT_TRACKING_FORCE
> +#ifdef CONFIG_CONTEXT_TRACKING_USER_FORCE
>  extern void context_tracking_init(void);
>  #else
>  static inline void context_tracking_init(void) { }
> -#endif /* CONFIG_CONTEXT_TRACKING_FORCE */
> +#endif /* CONFIG_CONTEXT_TRACKING_USER_FORCE */
>  
>  #endif
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index 65a60d3313b0..64dbbb880378 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -22,7 +22,7 @@ struct context_tracking {
>  	} state;
>  };
>  
> -#ifdef CONFIG_CONTEXT_TRACKING
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  extern struct static_key_false context_tracking_key;
>  DECLARE_PER_CPU(struct context_tracking, context_tracking);
>  
> @@ -50,6 +50,6 @@ static inline bool context_tracking_in_user(void) { return false; }
>  static inline bool context_tracking_enabled(void) { return false; }
>  static inline bool context_tracking_enabled_cpu(int cpu) { return false; }
>  static inline bool context_tracking_enabled_this_cpu(void) { return false; }
> -#endif /* CONFIG_CONTEXT_TRACKING */
> +#endif /* CONFIG_CONTEXT_TRACKING_USER */
>  
>  #endif
> diff --git a/init/Kconfig b/init/Kconfig
> index e9119bf54b1f..22525443de90 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -498,11 +498,11 @@ config VIRT_CPU_ACCOUNTING_NATIVE
>  
>  config VIRT_CPU_ACCOUNTING_GEN
>  	bool "Full dynticks CPU time accounting"
> -	depends on HAVE_CONTEXT_TRACKING
> +	depends on HAVE_CONTEXT_TRACKING_USER
>  	depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
>  	depends on GENERIC_CLOCKEVENTS
>  	select VIRT_CPU_ACCOUNTING
> -	select CONTEXT_TRACKING
> +	select CONTEXT_TRACKING_USER
>  	help
>  	  Select this option to enable task and CPU time accounting on full
>  	  dynticks systems. This accounting is implemented by watching every
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 7b6643d2075d..42054841af3f 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -22,6 +22,8 @@
>  #include <linux/export.h>
>  #include <linux/kprobes.h>
>  
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
> +
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/context_tracking.h>
>  
> @@ -222,7 +224,7 @@ void __init context_tracking_cpu_track_user(int cpu)
>  	initialized = true;
>  }
>  
> -#ifdef CONFIG_CONTEXT_TRACKING_FORCE
> +#ifdef CONFIG_CONTEXT_TRACKING_USER_FORCE
>  void __init context_tracking_init(void)
>  {
>  	int cpu;
> @@ -231,3 +233,5 @@ void __init context_tracking_init(void)
>  		context_tracking_cpu_track_user(cpu);
>  }
>  #endif
> +
> +#endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 2e4ae00e52d1..e79485afb58c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6398,7 +6398,7 @@ void __sched schedule_idle(void)
>  	} while (need_resched());
>  }
>  
> -#if defined(CONFIG_CONTEXT_TRACKING) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK)
> +#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
>  asmlinkage __visible void __sched schedule_user(void)
>  {
>  	/*
> diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
> index 27b7868b5c30..aad89cc96787 100644
> --- a/kernel/time/Kconfig
> +++ b/kernel/time/Kconfig
> @@ -111,7 +111,7 @@ config NO_HZ_FULL
>  	# NO_HZ_COMMON dependency
>  	# We need at least one periodic CPU for timekeeping
>  	depends on SMP
> -	depends on HAVE_CONTEXT_TRACKING
> +	depends on HAVE_CONTEXT_TRACKING_USER
>  	# VIRT_CPU_ACCOUNTING_GEN dependency
>  	depends on HAVE_VIRT_CPU_ACCOUNTING_GEN
>  	select NO_HZ_COMMON
> @@ -140,28 +140,32 @@ endchoice
>  config CONTEXT_TRACKING
>         bool
>  
> -config CONTEXT_TRACKING_FORCE
> -	bool "Force context tracking"
> -	depends on CONTEXT_TRACKING
> +config CONTEXT_TRACKING_USER
> +       select CONTEXT_TRACKING
> +       bool
> +
> +config CONTEXT_TRACKING_USER_FORCE
> +	bool "Force user context tracking"
> +	depends on CONTEXT_TRACKING_USER
>  	default y if !NO_HZ_FULL
>  	help
>  	  The major pre-requirement for full dynticks to work is to
> -	  support the context tracking subsystem. But there are also
> +	  support the user context tracking subsystem. But there are also
>  	  other dependencies to provide in order to make the full
>  	  dynticks working.
>  
>  	  This option stands for testing when an arch implements the
> -	  context tracking backend but doesn't yet fulfill all the
> +	  user context tracking backend but doesn't yet fulfill all the
>  	  requirements to make the full dynticks feature working.
>  	  Without the full dynticks, there is no way to test the support
> -	  for context tracking and the subsystems that rely on it: RCU
> +	  for user context tracking and the subsystems that rely on it: RCU
>  	  userspace extended quiescent state and tickless cputime
>  	  accounting. This option copes with the absence of the full
> -	  dynticks subsystem by forcing the context tracking on all
> +	  dynticks subsystem by forcing the user context tracking on all
>  	  CPUs in the system.
>  
>  	  Say Y only if you're working on the development of an
> -	  architecture backend for the context tracking.
> +	  architecture backend for the user context tracking.
>  
>  	  Say N otherwise, this option brings an overhead that you
>  	  don't want in production.
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 07/19] context_tracking: Take IRQ eqs entrypoints over RCU
  2022-03-02 15:47 ` [PATCH 07/19] context_tracking: Take IRQ " Frederic Weisbecker
@ 2022-03-10 19:46   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 19:46 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:58PM +0100, Frederic Weisbecker wrote:
> The RCU dynticks counter is going to be merged into the context tracking
> subsystem. Prepare with moving the IRQ extended quiescent states
> entrypoints to context tracking. For now those are dumb redirection to
> existing RCU calls.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao<liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  .../RCU/Design/Requirements/Requirements.rst  | 10 ++++----
>  Documentation/RCU/stallwarn.rst               |  4 ++--
>  arch/Kconfig                                  |  2 +-
>  arch/arm64/kernel/entry-common.c              |  6 ++---
>  arch/x86/mm/fault.c                           |  2 +-
>  drivers/cpuidle/cpuidle-psci.c                |  8 +++----
>  include/linux/context_tracking_irq.h          | 17 +++++++++++++
>  include/linux/context_tracking_state.h        |  1 +
>  include/linux/entry-common.h                  | 10 ++++----
>  include/linux/rcupdate.h                      |  5 ++--
>  include/linux/tracepoint.h                    |  4 ++--
>  kernel/context_tracking.c                     | 24 +++++++++++++++++--
>  kernel/cpu_pm.c                               |  8 +++----
>  kernel/entry/common.c                         | 12 +++++-----
>  kernel/softirq.c                              |  4 ++--
>  kernel/trace/trace.c                          |  6 ++---
>  16 files changed, 81 insertions(+), 42 deletions(-)
>  create mode 100644 include/linux/context_tracking_irq.h
> 
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> index ff2be1ac54c4..e3dd5d71c798 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> @@ -1844,10 +1844,10 @@ that meets this requirement.
>  
>  Furthermore, NMI handlers can be interrupted by what appear to RCU to be
>  normal interrupts. One way that this can happen is for code that
> -directly invokes rcu_irq_enter() and rcu_irq_exit() to be called
> +directly invokes ct_irq_enter() and ct_irq_exit() to be called
>  from an NMI handler. This astonishing fact of life prompted the current
> -code structure, which has rcu_irq_enter() invoking
> -rcu_nmi_enter() and rcu_irq_exit() invoking rcu_nmi_exit().
> +code structure, which has ct_irq_enter() invoking
> +rcu_nmi_enter() and ct_irq_exit() invoking rcu_nmi_exit().
>  And yes, I also learned of this requirement the hard way.
>  
>  Loadable Modules
> @@ -2195,7 +2195,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
>     sections, and RCU believes this CPU to be idle, no problem. This
>     sort of thing is used by some architectures for light-weight
>     exception handlers, which can then avoid the overhead of
> -   rcu_irq_enter() and rcu_irq_exit() at exception entry and
> +   ct_irq_enter() and ct_irq_exit() at exception entry and
>     exit, respectively. Some go further and avoid the entireties of
>     irq_enter() and irq_exit().
>     Just make very sure you are running some of your tests with
> @@ -2226,7 +2226,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
>  +-----------------------------------------------------------------------+
>  | **Answer**:                                                           |
>  +-----------------------------------------------------------------------+
> -| One approach is to do ``rcu_irq_exit();rcu_irq_enter();`` every so    |
> +| One approach is to do ``ct_irq_exit();ct_irq_enter();`` every so    |
>  | often. But given that long-running interrupt handlers can cause other |
>  | problems, not least for response time, shouldn't you work to keep     |
>  | your interrupt handler's runtime within reasonable bounds?            |
> diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
> index bdd52b40f307..7858c3afa1f4 100644
> --- a/Documentation/RCU/stallwarn.rst
> +++ b/Documentation/RCU/stallwarn.rst
> @@ -98,11 +98,11 @@ warnings:
>  
>  -	A low-level kernel issue that either fails to invoke one of the
>  	variants of rcu_user_enter(), rcu_user_exit(), ct_idle_enter(),
> -	ct_idle_exit(), rcu_irq_enter(), or rcu_irq_exit() on the one
> +	ct_idle_exit(), ct_irq_enter(), or ct_irq_exit() on the one
>  	hand, or that invokes one of them too many times on the other.
>  	Historically, the most frequent issue has been an omission
>  	of either irq_enter() or irq_exit(), which in turn invoke
> -	rcu_irq_enter() or rcu_irq_exit(), respectively.  Building your
> +	ct_irq_enter() or ct_irq_exit(), respectively.  Building your
>  	kernel with CONFIG_RCU_EQS_DEBUG=y can help track down these types
>  	of issues, which sometimes arise in architecture-specific code.
>  
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 1a3b79cfc9e3..66b2b6d4717b 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -770,7 +770,7 @@ config HAVE_CONTEXT_TRACKING_USER
>  	  Syscalls need to be wrapped inside user_exit()-user_enter(), either
>  	  optimized behind static key or through the slow path using TIF_NOHZ
>  	  flag. Exceptions handlers must be wrapped as well. Irqs are already
> -	  protected inside rcu_irq_enter/rcu_irq_exit() but preemption or signal
> +	  protected inside ct_irq_enter/ct_irq_exit() but preemption or signal
>  	  handling on irq exit still need to be protected.
>  
>  config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index ef7fcefb96bd..43ca8cf4e1dd 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -40,7 +40,7 @@ static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
>  
>  	if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
>  		lockdep_hardirqs_off(CALLER_ADDR0);
> -		rcu_irq_enter();
> +		ct_irq_enter();
>  		trace_hardirqs_off_finish();
>  
>  		regs->exit_rcu = true;
> @@ -74,7 +74,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
>  		if (regs->exit_rcu) {
>  			trace_hardirqs_on_prepare();
>  			lockdep_hardirqs_on_prepare(CALLER_ADDR0);
> -			rcu_irq_exit();
> +			ct_irq_exit();
>  			lockdep_hardirqs_on(CALLER_ADDR0);
>  			return;
>  		}
> @@ -82,7 +82,7 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
>  		trace_hardirqs_on();
>  	} else {
>  		if (regs->exit_rcu)
> -			rcu_irq_exit();
> +			ct_irq_exit();
>  	}
>  }
>  
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index d0074c6ed31a..b781785b1ff3 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1526,7 +1526,7 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
>  
>  	/*
>  	 * Entry handling for valid #PF from kernel mode is slightly
> -	 * different: RCU is already watching and rcu_irq_enter() must not
> +	 * different: RCU is already watching and ct_irq_enter() must not
>  	 * be invoked because a kernel fault on a user space address might
>  	 * sleep.
>  	 *
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index b51b5df08450..fe31b2d522b3 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -68,12 +68,12 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
>  		return -1;
>  
>  	/* Do runtime PM to manage a hierarchical CPU toplogy. */
> -	rcu_irq_enter_irqson();
> +	ct_irq_enter_irqson();
>  	if (s2idle)
>  		dev_pm_genpd_suspend(pd_dev);
>  	else
>  		pm_runtime_put_sync_suspend(pd_dev);
> -	rcu_irq_exit_irqson();
> +	ct_irq_exit_irqson();
>  
>  	state = psci_get_domain_state();
>  	if (!state)
> @@ -81,12 +81,12 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
>  
>  	ret = psci_cpu_suspend_enter(state) ? -1 : idx;
>  
> -	rcu_irq_enter_irqson();
> +	ct_irq_enter_irqson();
>  	if (s2idle)
>  		dev_pm_genpd_resume(pd_dev);
>  	else
>  		pm_runtime_get_sync(pd_dev);
> -	rcu_irq_exit_irqson();
> +	ct_irq_exit_irqson();
>  
>  	cpu_pm_exit();
>  
> diff --git a/include/linux/context_tracking_irq.h b/include/linux/context_tracking_irq.h
> new file mode 100644
> index 000000000000..60e3ed15a04e
> --- /dev/null
> +++ b/include/linux/context_tracking_irq.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_CONTEXT_TRACKING_IRQ_H
> +#define _LINUX_CONTEXT_TRACKING_IRQ_H
> +
> +#ifdef CONFIG_CONTEXT_TRACKING
> +void ct_irq_enter(void);
> +void ct_irq_exit(void);
> +void ct_irq_enter_irqson(void);
> +void ct_irq_exit_irqson(void);
> +#else
> +static inline void ct_irq_enter(void) { }
> +static inline void ct_irq_exit(void) { }
> +static inline void ct_irq_enter_irqson(void) { }
> +static inline void ct_irq_exit_irqson(void) { }
> +#endif
> +
> +#endif
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index 64dbbb880378..cdc692caa01d 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -4,6 +4,7 @@
>  
>  #include <linux/percpu.h>
>  #include <linux/static_key.h>
> +#include <linux/context_tracking_irq.h>
>  
>  struct context_tracking {
>  	/*
> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> index 2e2b8d6140ed..7c6b1d864448 100644
> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -396,7 +396,7 @@ void irqentry_exit_to_user_mode(struct pt_regs *regs);
>  /**
>   * struct irqentry_state - Opaque object for exception state storage
>   * @exit_rcu: Used exclusively in the irqentry_*() calls; signals whether the
> - *            exit path has to invoke rcu_irq_exit().
> + *            exit path has to invoke ct_irq_exit().
>   * @lockdep: Used exclusively in the irqentry_nmi_*() calls; ensures that
>   *           lockdep state is restored correctly on exit from nmi.
>   *
> @@ -434,12 +434,12 @@ typedef struct irqentry_state {
>   *
>   * For kernel mode entries RCU handling is done conditional. If RCU is
>   * watching then the only RCU requirement is to check whether the tick has
> - * to be restarted. If RCU is not watching then rcu_irq_enter() has to be
> - * invoked on entry and rcu_irq_exit() on exit.
> + * to be restarted. If RCU is not watching then ct_irq_enter() has to be
> + * invoked on entry and ct_irq_exit() on exit.
>   *
> - * Avoiding the rcu_irq_enter/exit() calls is an optimization but also
> + * Avoiding the ct_irq_enter/exit() calls is an optimization but also
>   * solves the problem of kernel mode pagefaults which can schedule, which
> - * is not possible after invoking rcu_irq_enter() without undoing it.
> + * is not possible after invoking ct_irq_enter() without undoing it.
>   *
>   * For user mode entries irqentry_enter_from_user_mode() is invoked to
>   * establish the proper context for NOHZ_FULL. Otherwise scheduling on exit
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 38258542a6c3..5efba2bfa689 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -29,6 +29,7 @@
>  #include <linux/lockdep.h>
>  #include <asm/processor.h>
>  #include <linux/cpumask.h>
> +#include <linux/context_tracking_irq.h>
>  
>  #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
>  #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
> @@ -143,9 +144,9 @@ static inline void rcu_nocb_flush_deferred_wakeup(void) { }
>   */
>  #define RCU_NONIDLE(a) \
>  	do { \
> -		rcu_irq_enter_irqson(); \
> +		ct_irq_enter_irqson(); \
>  		do { a; } while (0); \
> -		rcu_irq_exit_irqson(); \
> +		ct_irq_exit_irqson(); \
>  	} while (0)
>  
>  /*
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 28031b15f878..55717a2eda08 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -200,13 +200,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
>  		 */							\
>  		if (rcuidle) {						\
>  			__idx = srcu_read_lock_notrace(&tracepoint_srcu);\
> -			rcu_irq_enter_irqson();				\
> +			ct_irq_enter_irqson();				\
>  		}							\
>  									\
>  		__DO_TRACE_CALL(name, TP_ARGS(args));			\
>  									\
>  		if (rcuidle) {						\
> -			rcu_irq_exit_irqson();				\
> +			ct_irq_exit_irqson();				\
>  			srcu_read_unlock_notrace(&tracepoint_srcu, __idx);\
>  		}							\
>  									\
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 3d479f363275..b63ff851472e 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -75,7 +75,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
>  			 * At this stage, only low level arch entry code remains and
>  			 * then we'll run in userspace. We can assume there won't be
>  			 * any RCU read-side critical section until the next call to
> -			 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
> +			 * user_exit() or ct_irq_enter(). Let's remove RCU's dependency
>  			 * on the tick.
>  			 */
>  			if (state == CONTEXT_USER) {
> @@ -112,7 +112,7 @@ void ct_user_enter(enum ctx_state state)
>  	/*
>  	 * Some contexts may involve an exception occuring in an irq,
>  	 * leading to that nesting:
> -	 * rcu_irq_enter() rcu_user_exit() rcu_user_exit() rcu_irq_exit()
> +	 * ct_irq_enter() rcu_user_exit() rcu_user_exit() ct_irq_exit()
>  	 * This would mess up the dyntick_nesting count though. And rcu_irq_*()
>  	 * helpers are enough to protect RCU uses inside the exception. So
>  	 * just return immediately if we detect we are in an IRQ.
> @@ -247,3 +247,23 @@ void ct_idle_exit(void)
>  	rcu_idle_exit();
>  }
>  EXPORT_SYMBOL_GPL(ct_idle_exit);
> +
> +noinstr void ct_irq_enter(void)
> +{
> +	rcu_irq_enter();
> +}
> +
> +noinstr void ct_irq_exit(void)
> +{
> +	rcu_irq_exit();
> +}
> +
> +void ct_irq_enter_irqson(void)
> +{
> +	rcu_irq_enter_irqson();
> +}
> +
> +void ct_irq_exit_irqson(void)
> +{
> +	rcu_irq_exit_irqson();
> +}
> diff --git a/kernel/cpu_pm.c b/kernel/cpu_pm.c
> index 246efc74e3f3..ba4ba71facf9 100644
> --- a/kernel/cpu_pm.c
> +++ b/kernel/cpu_pm.c
> @@ -35,11 +35,11 @@ static int cpu_pm_notify(enum cpu_pm_event event)
>  	 * disfunctional in cpu idle. Copy RCU_NONIDLE code to let RCU know
>  	 * this.
>  	 */
> -	rcu_irq_enter_irqson();
> +	ct_irq_enter_irqson();
>  	rcu_read_lock();
>  	ret = raw_notifier_call_chain(&cpu_pm_notifier.chain, event, NULL);
>  	rcu_read_unlock();
> -	rcu_irq_exit_irqson();
> +	ct_irq_exit_irqson();
>  
>  	return notifier_to_errno(ret);
>  }
> @@ -49,11 +49,11 @@ static int cpu_pm_notify_robust(enum cpu_pm_event event_up, enum cpu_pm_event ev
>  	unsigned long flags;
>  	int ret;
>  
> -	rcu_irq_enter_irqson();
> +	ct_irq_enter_irqson();
>  	raw_spin_lock_irqsave(&cpu_pm_notifier.lock, flags);
>  	ret = raw_notifier_call_chain_robust(&cpu_pm_notifier.chain, event_up, event_down, NULL);
>  	raw_spin_unlock_irqrestore(&cpu_pm_notifier.lock, flags);
> -	rcu_irq_exit_irqson();
> +	ct_irq_exit_irqson();
>  
>  	return notifier_to_errno(ret);
>  }
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index bad713684c2e..cebc98b8adc6 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -327,7 +327,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>  	}
>  
>  	/*
> -	 * If this entry hit the idle task invoke rcu_irq_enter() whether
> +	 * If this entry hit the idle task invoke ct_irq_enter() whether
>  	 * RCU is watching or not.
>  	 *
>  	 * Interrupts can nest when the first interrupt invokes softirq
> @@ -338,12 +338,12 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>  	 * not nested into another interrupt.
>  	 *
>  	 * Checking for rcu_is_watching() here would prevent the nesting
> -	 * interrupt to invoke rcu_irq_enter(). If that nested interrupt is
> +	 * interrupt to invoke ct_irq_enter(). If that nested interrupt is
>  	 * the tick then rcu_flavor_sched_clock_irq() would wrongfully
>  	 * assume that it is the first interrupt and eventually claim
>  	 * quiescent state and end grace periods prematurely.
>  	 *
> -	 * Unconditionally invoke rcu_irq_enter() so RCU state stays
> +	 * Unconditionally invoke ct_irq_enter() so RCU state stays
>  	 * consistent.
>  	 *
>  	 * TINY_RCU does not support EQS, so let the compiler eliminate
> @@ -356,7 +356,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>  		 * as in irqentry_enter_from_user_mode().
>  		 */
>  		lockdep_hardirqs_off(CALLER_ADDR0);
> -		rcu_irq_enter();
> +		ct_irq_enter();
>  		instrumentation_begin();
>  		trace_hardirqs_off_finish();
>  		instrumentation_end();
> @@ -414,7 +414,7 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
>  			trace_hardirqs_on_prepare();
>  			lockdep_hardirqs_on_prepare(CALLER_ADDR0);
>  			instrumentation_end();
> -			rcu_irq_exit();
> +			ct_irq_exit();
>  			lockdep_hardirqs_on(CALLER_ADDR0);
>  			return;
>  		}
> @@ -436,7 +436,7 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
>  		 * was not watching on entry.
>  		 */
>  		if (state.exit_rcu)
> -			rcu_irq_exit();
> +			ct_irq_exit();
>  	}
>  }
>  
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 41f470929e99..7b6761c1a0f3 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -607,7 +607,7 @@ void irq_enter_rcu(void)
>   */
>  void irq_enter(void)
>  {
> -	rcu_irq_enter();
> +	ct_irq_enter();
>  	irq_enter_rcu();
>  }
>  
> @@ -659,7 +659,7 @@ void irq_exit_rcu(void)
>  void irq_exit(void)
>  {
>  	__irq_exit_rcu();
> -	rcu_irq_exit();
> +	ct_irq_exit();
>  	 /* must be last! */
>  	lockdep_hardirq_exit();
>  }
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index a569a0cb81ee..7c500c708180 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -3088,15 +3088,15 @@ void __trace_stack(struct trace_array *tr, unsigned int trace_ctx,
>  	/*
>  	 * When an NMI triggers, RCU is enabled via rcu_nmi_enter(),
>  	 * but if the above rcu_is_watching() failed, then the NMI
> -	 * triggered someplace critical, and rcu_irq_enter() should
> +	 * triggered someplace critical, and ct_irq_enter() should
>  	 * not be called from NMI.
>  	 */
>  	if (unlikely(in_nmi()))
>  		return;
>  
> -	rcu_irq_enter_irqson();
> +	ct_irq_enter_irqson();
>  	__ftrace_trace_stack(buffer, trace_ctx, skip, NULL);
> -	rcu_irq_exit_irqson();
> +	ct_irq_exit_irqson();
>  }
>  
>  /**
> -- 
> 2.25.1
> 
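
For illustration, the RCU_NONIDLE() bracketing at the top of this patch can
be modeled in plain C. This is a compilable userspace sketch, not kernel
code: the two helpers are stubs, and only the enter/exit pairing that the
macro guarantees is shown.

#include <stdio.h>

static int watching;	/* models "RCU is watching this CPU" */

/* Stubs standing in for the renamed context tracking helpers. */
static void ct_irq_enter_irqson(void) { watching = 1; }
static void ct_irq_exit_irqson(void)  { watching = 0; }

/* Same shape as the kernel macro above: bracket a statement so that
 * RCU watches for its whole duration, even from an RCU-idle region. */
#define RCU_NONIDLE(a)				\
	do {					\
		ct_irq_enter_irqson();		\
		do { a; } while (0);		\
		ct_irq_exit_irqson();		\
	} while (0)

int main(void)
{
	RCU_NONIDLE(printf("watching inside the bracket: %d\n", watching));
	printf("watching after the bracket:  %d\n", watching);
	return 0;
}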

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 08/19] context_tracking: Take NMI eqs entrypoints over RCU
  2022-03-02 15:47 ` [PATCH 08/19] context_tracking: Take NMI " Frederic Weisbecker
@ 2022-03-10 19:47   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 19:47 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:47:59PM +0100, Frederic Weisbecker wrote:
> The RCU dynticks counter is going to be merged into the context tracking
> subsystem. Prepare by moving the NMI extended quiescent state
> entrypoints to context tracking. For now those are dumb redirections to
> existing RCU calls.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  Documentation/RCU/Design/Requirements/Requirements.rst |  2 +-
>  arch/Kconfig                                           |  2 +-
>  arch/arm64/kernel/entry-common.c                       |  8 ++++----
>  include/linux/context_tracking_irq.h                   |  4 ++++
>  include/linux/hardirq.h                                |  4 ++--
>  kernel/context_tracking.c                              | 10 ++++++++++
>  kernel/entry/common.c                                  |  4 ++--
>  kernel/extable.c                                       |  4 ++--
>  kernel/trace/trace.c                                   |  2 +-
>  9 files changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> index e3dd5d71c798..256cf260e864 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> @@ -1847,7 +1847,7 @@ normal interrupts. One way that this can happen is for code that
>  directly invokes ct_irq_enter() and ct_irq_exit() to be called
>  from an NMI handler. This astonishing fact of life prompted the current
>  code structure, which has ct_irq_enter() invoking
> -rcu_nmi_enter() and ct_irq_exit() invoking rcu_nmi_exit().
> +ct_nmi_enter() and ct_irq_exit() invoking ct_nmi_exit().
>  And yes, I also learned of this requirement the hard way.
>  
>  Loadable Modules
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 66b2b6d4717b..c22b8ca0eb01 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -785,7 +785,7 @@ config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
>  
>  	  - Critical entry code isn't preemptible (or better yet:
>  	    not interruptible).
> -	  - No use of RCU read side critical sections, unless rcu_nmi_enter()
> +	  - No use of RCU read side critical sections, unless ct_nmi_enter()
>  	    got called.
>  	  - No use of instrumentation, unless instrumentation_begin() got
>  	    called.
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index 43ca8cf4e1dd..6a1ea28731c8 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -158,7 +158,7 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
>  	__nmi_enter();
>  	lockdep_hardirqs_off(CALLER_ADDR0);
>  	lockdep_hardirq_enter();
> -	rcu_nmi_enter();
> +	ct_nmi_enter();
>  
>  	trace_hardirqs_off_finish();
>  	ftrace_nmi_enter();
> @@ -179,7 +179,7 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs)
>  		lockdep_hardirqs_on_prepare(CALLER_ADDR0);
>  	}
>  
> -	rcu_nmi_exit();
> +	ct_nmi_exit();
>  	lockdep_hardirq_exit();
>  	if (restore)
>  		lockdep_hardirqs_on(CALLER_ADDR0);
> @@ -196,7 +196,7 @@ static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
>  	regs->lockdep_hardirqs = lockdep_hardirqs_enabled();
>  
>  	lockdep_hardirqs_off(CALLER_ADDR0);
> -	rcu_nmi_enter();
> +	ct_nmi_enter();
>  
>  	trace_hardirqs_off_finish();
>  }
> @@ -215,7 +215,7 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
>  		lockdep_hardirqs_on_prepare(CALLER_ADDR0);
>  	}
>  
> -	rcu_nmi_exit();
> +	ct_nmi_exit();
>  	if (restore)
>  		lockdep_hardirqs_on(CALLER_ADDR0);
>  }
> diff --git a/include/linux/context_tracking_irq.h b/include/linux/context_tracking_irq.h
> index 60e3ed15a04e..11043bf724b7 100644
> --- a/include/linux/context_tracking_irq.h
> +++ b/include/linux/context_tracking_irq.h
> @@ -7,11 +7,15 @@ void ct_irq_enter(void);
>  void ct_irq_exit(void);
>  void ct_irq_enter_irqson(void);
>  void ct_irq_exit_irqson(void);
> +void ct_nmi_enter(void);
> +void ct_nmi_exit(void);
>  #else
>  static inline void ct_irq_enter(void) { }
>  static inline void ct_irq_exit(void) { }
>  static inline void ct_irq_enter_irqson(void) { }
>  static inline void ct_irq_exit_irqson(void) { }
> +static inline void ct_nmi_enter(void) { }
> +static inline void ct_nmi_exit(void) { }
>  #endif
>  
>  #endif
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index 76878b357ffa..345cdbe9c1b7 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -124,7 +124,7 @@ extern void rcu_nmi_exit(void);
>  	do {							\
>  		__nmi_enter();					\
>  		lockdep_hardirq_enter();			\
> -		rcu_nmi_enter();				\
> +		ct_nmi_enter();				\
>  		instrumentation_begin();			\
>  		ftrace_nmi_enter();				\
>  		instrumentation_end();				\
> @@ -143,7 +143,7 @@ extern void rcu_nmi_exit(void);
>  		instrumentation_begin();			\
>  		ftrace_nmi_exit();				\
>  		instrumentation_end();				\
> -		rcu_nmi_exit();					\
> +		ct_nmi_exit();					\
>  		lockdep_hardirq_exit();				\
>  		__nmi_exit();					\
>  	} while (0)
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index b63ff851472e..1686cd528966 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -267,3 +267,13 @@ void ct_irq_exit_irqson(void)
>  {
>  	rcu_irq_exit_irqson();
>  }
> +
> +noinstr void ct_nmi_enter(void)
> +{
> +	rcu_nmi_enter();
> +}
> +
> +noinstr void ct_nmi_exit(void)
> +{
> +	rcu_nmi_exit();
> +}
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index cebc98b8adc6..08230507793f 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -449,7 +449,7 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
>  	__nmi_enter();
>  	lockdep_hardirqs_off(CALLER_ADDR0);
>  	lockdep_hardirq_enter();
> -	rcu_nmi_enter();
> +	ct_nmi_enter();
>  
>  	instrumentation_begin();
>  	trace_hardirqs_off_finish();
> @@ -469,7 +469,7 @@ void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state)
>  	}
>  	instrumentation_end();
>  
> -	rcu_nmi_exit();
> +	ct_nmi_exit();
>  	lockdep_hardirq_exit();
>  	if (irq_state.lockdep)
>  		lockdep_hardirqs_on(CALLER_ADDR0);
> diff --git a/kernel/extable.c b/kernel/extable.c
> index b6f330f0fe74..88d4d739c5a1 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -113,7 +113,7 @@ int kernel_text_address(unsigned long addr)
>  
>  	/* Treat this like an NMI as it can happen anywhere */
>  	if (no_rcu)
> -		rcu_nmi_enter();
> +		ct_nmi_enter();
>  
>  	if (is_module_text_address(addr))
>  		goto out;
> @@ -126,7 +126,7 @@ int kernel_text_address(unsigned long addr)
>  	ret = 0;
>  out:
>  	if (no_rcu)
> -		rcu_nmi_exit();
> +		ct_nmi_exit();
>  
>  	return ret;
>  }
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 7c500c708180..9434da82af8a 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -3086,7 +3086,7 @@ void __trace_stack(struct trace_array *tr, unsigned int trace_ctx,
>  	}
>  
>  	/*
> -	 * When an NMI triggers, RCU is enabled via rcu_nmi_enter(),
> +	 * When an NMI triggers, RCU is enabled via ct_nmi_enter(),
>  	 * but if the above rcu_is_watching() failed, then the NMI
>  	 * triggered someplace critical, and ct_irq_enter() should
>  	 * not be called from NMI.
> -- 
> 2.25.1
> 
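
As with the IRQ entrypoints, this patch is a rename plus thin redirections.
Below is a compilable userspace sketch of the pairing the entry code relies
on, using a recursive fake handler since NMIs can nest; every name here is
a stand-in, not a kernel API.

#include <assert.h>
#include <stdio.h>

static int nmi_nesting;	/* models the per-CPU NMI nesting depth */

/* Stand-ins for the new entrypoints, which for now just redirect to
 * rcu_nmi_enter()/rcu_nmi_exit() in the kernel. */
static void ct_nmi_enter(void) { nmi_nesting++; }
static void ct_nmi_exit(void)  { assert(nmi_nesting > 0); nmi_nesting--; }

static void fake_nmi_handler(int nested)
{
	ct_nmi_enter();
	if (nested)
		fake_nmi_handler(nested - 1);	/* a nested NMI arrives */
	ct_nmi_exit();
}

int main(void)
{
	fake_nmi_handler(2);
	printf("nesting balanced back to %d\n", nmi_nesting);	/* prints 0 */
	return 0;
}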

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/19] rcu/context_tracking: Move dynticks counter to context tracking
  2022-03-02 15:48 ` [PATCH 10/19] rcu/context_tracking: Move dynticks counter to context tracking Frederic Weisbecker
@ 2022-03-10 20:00   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:00 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:01PM +0100, Frederic Weisbecker wrote:
> In order to prepare for merging the RCU dynticks counter into the
> context tracking state, move the rcu_data's dynticks field to the
> context tracking structure. It will later be merged into the context
> tracking state itself.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Assuming that the context_tracking_state.h definitions get everywhere
they need to go in all configurations:

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking_state.h | 10 ++++-
>  kernel/context_tracking.c              |  9 ++--
>  kernel/rcu/tree.c                      | 59 ++++++++++++++------------
>  kernel/rcu/tree.h                      |  1 -
>  kernel/rcu/tree_exp.h                  |  2 +-
>  kernel/rcu/tree_stall.h                |  4 +-
>  6 files changed, 48 insertions(+), 37 deletions(-)
> 
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index cdc692caa01d..5ad0e481c5a3 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -7,6 +7,7 @@
>  #include <linux/context_tracking_irq.h>
>  
>  struct context_tracking {
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
>  	/*
>  	 * When active is false, probes are unset in order
>  	 * to minimize overhead: TIF flags are cleared
> @@ -21,11 +22,16 @@ struct context_tracking {
>  		CONTEXT_USER,
>  		CONTEXT_GUEST,
>  	} state;
> +#endif
> +	atomic_t dynticks;		/* Even value for idle, else odd. */
>  };
>  
> -#ifdef CONFIG_CONTEXT_TRACKING_USER
> -extern struct static_key_false context_tracking_key;
> +#ifdef CONFIG_CONTEXT_TRACKING
>  DECLARE_PER_CPU(struct context_tracking, context_tracking);
> +#endif
> +
> +#ifdef CONFIG_CONTEXT_TRACKING_USER
> +extern struct static_key_false context_tracking_key;
>  
>  static __always_inline bool context_tracking_enabled(void)
>  {
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index ea22eb04750f..77b61a7c9890 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -30,9 +30,6 @@
>  DEFINE_STATIC_KEY_FALSE(context_tracking_key);
>  EXPORT_SYMBOL_GPL(context_tracking_key);
>  
> -DEFINE_PER_CPU(struct context_tracking, context_tracking);
> -EXPORT_SYMBOL_GPL(context_tracking);
> -
>  static noinstr bool context_tracking_recursion_enter(void)
>  {
>  	int recursion;
> @@ -236,6 +233,12 @@ void __init context_tracking_init(void)
>  
>  #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
>  
> +DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
> +		.dynticks = ATOMIC_INIT(1),
> +};
> +EXPORT_SYMBOL_GPL(context_tracking);
> +
> +
>  void ct_idle_enter(void)
>  {
>  	rcu_idle_enter();
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index cadf5f5a4700..96eb8503f28e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -77,7 +77,6 @@
>  static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
>  	.dynticks_nesting = 1,
>  	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
> -	.dynticks = ATOMIC_INIT(1),
>  #ifdef CONFIG_RCU_NOCB_CPU
>  	.cblist.flags = SEGCBLIST_RCU_CORE,
>  #endif
> @@ -268,7 +267,7 @@ void rcu_softirq_qs(void)
>   */
>  static noinline noinstr unsigned long rcu_dynticks_inc(int incby)
>  {
> -	return arch_atomic_add_return(incby, this_cpu_ptr(&rcu_data.dynticks));
> +	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
>  }
>  
>  /*
> @@ -324,9 +323,9 @@ static noinstr void rcu_dynticks_eqs_exit(void)
>   */
>  static void rcu_dynticks_eqs_online(void)
>  {
> -	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  
> -	if (atomic_read(&rdp->dynticks) & 0x1)
> +	if (atomic_read(&ct->dynticks) & 0x1)
>  		return;
>  	rcu_dynticks_inc(1);
>  }
> @@ -338,17 +337,19 @@ static void rcu_dynticks_eqs_online(void)
>   */
>  static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
>  {
> -	return !(arch_atomic_read(this_cpu_ptr(&rcu_data.dynticks)) & 0x1);
> +	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
>  }
>  
>  /*
>   * Snapshot the ->dynticks counter with full ordering so as to allow
>   * stable comparison of this counter with past and future snapshots.
>   */
> -static int rcu_dynticks_snap(struct rcu_data *rdp)
> +static int rcu_dynticks_snap(int cpu)
>  {
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> +
>  	smp_mb();  // Fundamental RCU ordering guarantee.
> -	return atomic_read_acquire(&rdp->dynticks);
> +	return atomic_read_acquire(&ct->dynticks);
>  }
>  
>  /*
> @@ -363,9 +364,7 @@ static bool rcu_dynticks_in_eqs(int snap)
>  /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
>  bool rcu_is_idle_cpu(int cpu)
>  {
> -	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> -
> -	return rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp));
> +	return rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
>  }
>  
>  /*
> @@ -375,7 +374,7 @@ bool rcu_is_idle_cpu(int cpu)
>   */
>  static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
>  {
> -	return snap != rcu_dynticks_snap(rdp);
> +	return snap != rcu_dynticks_snap(rdp->cpu);
>  }
>  
>  /*
> @@ -384,11 +383,11 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
>   */
>  bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
>  {
> -	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>  	int snap;
>  
>  	// If not quiescent, force back to earlier extended quiescent state.
> -	snap = atomic_read(&rdp->dynticks) & ~0x1;
> +	snap = atomic_read(&ct->dynticks) & ~0x1;
>  
>  	smp_rmb(); // Order ->dynticks and *vp reads.
>  	if (READ_ONCE(*vp))
> @@ -396,7 +395,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
>  	smp_rmb(); // Order *vp read and ->dynticks re-read.
>  
>  	// If still in the same extended quiescent state, we are good!
> -	return snap == atomic_read(&rdp->dynticks);
> +	return snap == atomic_read(&ct->dynticks);
>  }
>  
>  /*
> @@ -620,6 +619,7 @@ EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
>  static noinstr void rcu_eqs_enter(bool user)
>  {
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  
>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
> @@ -633,12 +633,12 @@ static noinstr void rcu_eqs_enter(bool user)
>  
>  	lockdep_assert_irqs_disabled();
>  	instrumentation_begin();
> -	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
> +	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&ct->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	rcu_preempt_deferred_qs(current);
>  
>  	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> -	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> +	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  
>  	instrumentation_end();
>  	WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
> @@ -740,7 +740,7 @@ noinstr void rcu_user_enter(void)
>   * rcu_nmi_exit - inform RCU of exit from NMI context
>   *
>   * If we are returning from the outermost NMI handler that interrupted an
> - * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
> + * RCU-idle period, update ct->dynticks and rdp->dynticks_nmi_nesting
>   * to let the RCU grace-period handling know that the CPU is back to
>   * being RCU-idle.
>   *
> @@ -749,6 +749,7 @@ noinstr void rcu_user_enter(void)
>   */
>  noinstr void rcu_nmi_exit(void)
>  {
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  
>  	instrumentation_begin();
> @@ -766,7 +767,7 @@ noinstr void rcu_nmi_exit(void)
>  	 */
>  	if (rdp->dynticks_nmi_nesting != 1) {
>  		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2,
> -				  atomic_read(&rdp->dynticks));
> +				  atomic_read(&ct->dynticks));
>  		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
>  			   rdp->dynticks_nmi_nesting - 2);
>  		instrumentation_end();
> @@ -774,11 +775,11 @@ noinstr void rcu_nmi_exit(void)
>  	}
>  
>  	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> -	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> +	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>  
>  	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> -	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> +	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  	instrumentation_end();
>  
>  	// RCU is watching here ...
> @@ -817,6 +818,7 @@ void rcu_irq_exit_check_preempt(void)
>   */
>  static void noinstr rcu_eqs_exit(bool user)
>  {
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	struct rcu_data *rdp;
>  	long oldval;
>  
> @@ -836,9 +838,9 @@ static void noinstr rcu_eqs_exit(bool user)
>  	instrumentation_begin();
>  
>  	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> -	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> +	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  
> -	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&rdp->dynticks));
> +	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&ct->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	WRITE_ONCE(rdp->dynticks_nesting, 1);
>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
> @@ -944,7 +946,7 @@ void __rcu_irq_enter_check_tick(void)
>  /**
>   * rcu_nmi_enter - inform RCU of entry to NMI context
>   *
> - * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
> + * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
>   * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know
>   * that the CPU is active.  This implementation permits nested NMIs, as
>   * long as the nesting level does not overflow an int.  (You will probably
> @@ -957,6 +959,7 @@ noinstr void rcu_nmi_enter(void)
>  {
>  	long incby = 2;
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  
>  	/* Complain about underflow. */
>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0);
> @@ -980,9 +983,9 @@ noinstr void rcu_nmi_enter(void)
>  
>  		instrumentation_begin();
>  		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> -		instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));
> +		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
>  		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> -		instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> +		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  
>  		incby = 1;
>  	} else if (!in_nmi()) {
> @@ -994,7 +997,7 @@ noinstr void rcu_nmi_enter(void)
>  
>  	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
>  			  rdp->dynticks_nmi_nesting,
> -			  rdp->dynticks_nmi_nesting + incby, atomic_read(&rdp->dynticks));
> +			  rdp->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
>  	instrumentation_end();
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */
>  		   rdp->dynticks_nmi_nesting + incby);
> @@ -1138,7 +1141,7 @@ static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp)
>   */
>  static int dyntick_save_progress_counter(struct rcu_data *rdp)
>  {
> -	rdp->dynticks_snap = rcu_dynticks_snap(rdp);
> +	rdp->dynticks_snap = rcu_dynticks_snap(rdp->cpu);
>  	if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
>  		trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
>  		rcu_gpnum_ovf(rdp->mynode, rdp);
> @@ -4125,7 +4128,7 @@ rcu_boot_init_percpu_data(int cpu)
>  	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
>  	INIT_WORK(&rdp->strict_work, strict_work_handler);
>  	WARN_ON_ONCE(rdp->dynticks_nesting != 1);
> -	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp)));
> +	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu)));
>  	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
>  	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
>  	rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED;
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index b8d07bf92d29..15246a3f0734 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -188,7 +188,6 @@ struct rcu_data {
>  	int dynticks_snap;		/* Per-GP tracking for dynticks. */
>  	long dynticks_nesting;		/* Track process nesting level. */
>  	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
> -	atomic_t dynticks;		/* Even value for idle, else odd. */
>  	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
>  	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
>  	bool rcu_forced_tick;		/* Forced tick to provide QS. */
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index d5f30085b0cf..2210110990f4 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -358,7 +358,7 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
>  		    !(rnp->qsmaskinitnext & mask)) {
>  			mask_ofl_test |= mask;
>  		} else {
> -			snap = rcu_dynticks_snap(rdp);
> +			snap = rcu_dynticks_snap(cpu);
>  			if (rcu_dynticks_in_eqs(snap))
>  				mask_ofl_test |= mask;
>  			else
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 84b812a3ab44..202129b1c7e4 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -448,7 +448,7 @@ static void print_cpu_stall_info(int cpu)
>  	}
>  	delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
>  	falsepositive = rcu_is_gp_kthread_starving(NULL) &&
> -			rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp));
> +			rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
>  	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
>  	       cpu,
>  	       "O."[!!cpu_online(cpu)],
> @@ -458,7 +458,7 @@ static void print_cpu_stall_info(int cpu)
>  			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
>  				"!."[!delta],
>  	       ticks_value, ticks_title,
> -	       rcu_dynticks_snap(rdp) & 0xfff,
> +	       rcu_dynticks_snap(cpu) & 0xfff,
>  	       rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
>  	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
>  	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
> -- 
> 2.25.1
> 
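
The moved field keeps its protocol: it starts at 1, every EQS transition
increments it, so an even value means the CPU is in an extended quiescent
state, and a changed snapshot means the CPU passed through one. A
compilable userspace model of that protocol, with invented helper names:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int dynticks = 1;	/* Even value for idle, else odd. */

static bool in_eqs(int snap)    { return !(snap & 0x1); }
static int  snap_dynticks(void) { return atomic_load(&dynticks); }

static void eqs_enter(void) { atomic_fetch_add(&dynticks, 1); }	/* odd -> even */
static void eqs_exit(void)  { atomic_fetch_add(&dynticks, 1); }	/* even -> odd */

int main(void)
{
	int snap = snap_dynticks();	/* like dyntick_save_progress_counter() */

	eqs_enter();
	printf("in EQS now:    %d\n", in_eqs(snap_dynticks()));	/* 1 */
	eqs_exit();
	printf("passed an EQS: %d\n", snap != snap_dynticks());	/* 1 */
	return 0;
}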

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting to context tracking
  2022-03-02 15:48 ` [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting " Frederic Weisbecker
@ 2022-03-10 20:01   ` Paul E. McKenney
  2022-03-12 23:23   ` Peter Zijlstra
  1 sibling, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:01 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:02PM +0100, Frederic Weisbecker wrote:
> The RCU eqs tracking is going to be performed by the context tracking
> subsystem. The related nesting counters thus need to be moved to the
> context tracking structure.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking_state.h |  1 +
>  kernel/context_tracking.c              |  1 +
>  kernel/rcu/tree.c                      | 31 +++++++++++++-------------
>  kernel/rcu/tree.h                      |  1 -
>  kernel/rcu/tree_stall.h                |  3 ++-
>  5 files changed, 20 insertions(+), 17 deletions(-)
> 
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index 5ad0e481c5a3..bcb942945265 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -24,6 +24,7 @@ struct context_tracking {
>  	} state;
>  #endif
>  	atomic_t dynticks;		/* Even value for idle, else odd. */
> +	long dynticks_nesting;		/* Track process nesting level. */
>  };
>  
>  #ifdef CONFIG_CONTEXT_TRACKING
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 77b61a7c9890..09a77884a4e3 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -234,6 +234,7 @@ void __init context_tracking_init(void)
>  #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
>  
>  DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
> +		.dynticks_nesting = 1,
>  		.dynticks = ATOMIC_INIT(1),
>  };
>  EXPORT_SYMBOL_GPL(context_tracking);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 96eb8503f28e..8708d1a99565 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -75,7 +75,6 @@
>  /* Data structures. */
>  
>  static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
> -	.dynticks_nesting = 1,
>  	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
>  #ifdef CONFIG_RCU_NOCB_CPU
>  	.cblist.flags = SEGCBLIST_RCU_CORE,
> @@ -441,7 +440,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  	lockdep_assert_irqs_disabled();
>  
>  	/* Check for counter underflows */
> -	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) < 0,
> +	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) < 0,
>  			 "RCU dynticks_nesting counter underflow!");
>  	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 0,
>  			 "RCU dynticks_nmi_nesting counter underflow/zero!");
> @@ -457,7 +456,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  	WARN_ON_ONCE(!nesting && !is_idle_task(current));
>  
>  	/* Does CPU appear to be idle from an RCU standpoint? */
> -	return __this_cpu_read(rcu_data.dynticks_nesting) == 0;
> +	return __this_cpu_read(context_tracking.dynticks_nesting) == 0;
>  }
>  
>  #define DEFAULT_RCU_BLIMIT (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) ? 1000 : 10)
> @@ -624,16 +623,16 @@ static noinstr void rcu_eqs_enter(bool user)
>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> -		     rdp->dynticks_nesting == 0);
> -	if (rdp->dynticks_nesting != 1) {
> +		     ct->dynticks_nesting == 0);
> +	if (ct->dynticks_nesting != 1) {
>  		// RCU will still be watching, so just do accounting and leave.
> -		rdp->dynticks_nesting--;
> +		ct->dynticks_nesting--;
>  		return;
>  	}
>  
>  	lockdep_assert_irqs_disabled();
>  	instrumentation_begin();
> -	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&ct->dynticks));
> +	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	rcu_preempt_deferred_qs(current);
>  
> @@ -641,7 +640,7 @@ static noinstr void rcu_eqs_enter(bool user)
>  	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  
>  	instrumentation_end();
> -	WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
> +	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
>  	// RCU is watching here ...
>  	rcu_dynticks_eqs_enter();
>  	// ... but is no longer watching here.
> @@ -798,7 +797,7 @@ void rcu_irq_exit_check_preempt(void)
>  {
>  	lockdep_assert_irqs_disabled();
>  
> -	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) <= 0,
> +	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) <= 0,
>  			 "RCU dynticks_nesting counter underflow/zero!");
>  	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
>  			 DYNTICK_IRQ_NONIDLE,
> @@ -824,11 +823,11 @@ static void noinstr rcu_eqs_exit(bool user)
>  
>  	lockdep_assert_irqs_disabled();
>  	rdp = this_cpu_ptr(&rcu_data);
> -	oldval = rdp->dynticks_nesting;
> +	oldval = ct->dynticks_nesting;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
>  	if (oldval) {
>  		// RCU was already watching, so just do accounting and leave.
> -		rdp->dynticks_nesting++;
> +		ct->dynticks_nesting++;
>  		return;
>  	}
>  	rcu_dynticks_task_exit();
> @@ -840,9 +839,9 @@ static void noinstr rcu_eqs_exit(bool user)
>  	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
>  	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  
> -	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&ct->dynticks));
> +	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> -	WRITE_ONCE(rdp->dynticks_nesting, 1);
> +	WRITE_ONCE(ct->dynticks_nesting, 1);
>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
>  	instrumentation_end();
> @@ -4122,12 +4121,13 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf)
>  static void __init
>  rcu_boot_init_percpu_data(int cpu)
>  {
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  
>  	/* Set up local state, ensuring consistent view of global state. */
>  	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
>  	INIT_WORK(&rdp->strict_work, strict_work_handler);
> -	WARN_ON_ONCE(rdp->dynticks_nesting != 1);
> +	WARN_ON_ONCE(ct->dynticks_nesting != 1);
>  	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu)));
>  	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
>  	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
> @@ -4152,6 +4152,7 @@ rcu_boot_init_percpu_data(int cpu)
>  int rcutree_prepare_cpu(unsigned int cpu)
>  {
>  	unsigned long flags;
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  	struct rcu_node *rnp = rcu_get_root();
>  
> @@ -4160,7 +4161,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
>  	rdp->qlen_last_fqs_check = 0;
>  	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
>  	rdp->blimit = blimit;
> -	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
> +	ct->dynticks_nesting = 1;	/* CPU not up, no tearing. */
>  	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
>  
>  	/*
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 15246a3f0734..8050bab08f39 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -186,7 +186,6 @@ struct rcu_data {
>  
>  	/* 3) dynticks interface. */
>  	int dynticks_snap;		/* Per-GP tracking for dynticks. */
> -	long dynticks_nesting;		/* Track process nesting level. */
>  	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
>  	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
>  	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 202129b1c7e4..30a5e0a8ddb3 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -429,6 +429,7 @@ static void print_cpu_stall_info(int cpu)
>  {
>  	unsigned long delta;
>  	bool falsepositive;
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  	char *ticks_title;
>  	unsigned long ticks_value;
> @@ -459,7 +460,7 @@ static void print_cpu_stall_info(int cpu)
>  				"!."[!delta],
>  	       ticks_value, ticks_title,
>  	       rcu_dynticks_snap(cpu) & 0xfff,
> -	       rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
> +	       ct->dynticks_nesting, rdp->dynticks_nmi_nesting,
>  	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
>  	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
>  	       falsepositive ? " (false positive?)" : "");
> -- 
> 2.25.1
> 
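
The hunks in rcu_eqs_enter()/rcu_eqs_exit() above preserve the counter's
rule: process-level EQS calls nest, and only the outermost transition
(1 -> 0 on entry, 0 -> 1 on exit) really changes whether RCU watches. A
compilable userspace sketch of just that rule, with made-up names:

static long dynticks_nesting = 1;
static int rcu_watching = 1;

static void eqs_enter(void)
{
	if (dynticks_nesting != 1) {	/* still nested: accounting only */
		dynticks_nesting--;
		return;
	}
	dynticks_nesting = 0;
	rcu_watching = 0;		/* real EQS entry */
}

static void eqs_exit(void)
{
	if (dynticks_nesting) {		/* already watching: accounting only */
		dynticks_nesting++;
		return;
	}
	rcu_watching = 1;		/* real EQS exit */
	dynticks_nesting = 1;
}

int main(void)
{
	eqs_enter();			/* idle entry: RCU stops watching */
	eqs_exit();			/* idle exit: RCU watches again */
	return rcu_watching ? 0 : 1;
}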

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 12/19] rcu/context_tracking: Move dynticks_nmi_nesting to context tracking
  2022-03-02 15:48 ` [PATCH 12/19] rcu/context_tracking: Move dynticks_nmi_nesting " Frederic Weisbecker
@ 2022-03-10 20:02   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:03PM +0100, Frederic Weisbecker wrote:
> The RCU eqs tracking is going to be performed by the context tracking
> subsystem. The related nesting counters thus need to be moved to the
> context tracking structure.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao<liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking_state.h |  4 +++
>  kernel/context_tracking.c              |  1 +
>  kernel/rcu/rcu.h                       |  4 ---
>  kernel/rcu/tree.c                      | 48 +++++++++++---------------
>  kernel/rcu/tree.h                      |  1 -
>  kernel/rcu/tree_stall.h                |  2 +-
>  6 files changed, 27 insertions(+), 33 deletions(-)
> 
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index bcb942945265..4efb97fe6518 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -6,6 +6,9 @@
>  #include <linux/static_key.h>
>  #include <linux/context_tracking_irq.h>
>  
> +/* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
> +#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
> +
>  struct context_tracking {
>  #ifdef CONFIG_CONTEXT_TRACKING_USER
>  	/*
> @@ -25,6 +28,7 @@ struct context_tracking {
>  #endif
>  	atomic_t dynticks;		/* Even value for idle, else odd. */
>  	long dynticks_nesting;		/* Track process nesting level. */
> +	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
>  };
>  
>  #ifdef CONFIG_CONTEXT_TRACKING
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 09a77884a4e3..155534c409fc 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -234,6 +234,7 @@ void __init context_tracking_init(void)
>  #endif /* #ifdef CONFIG_CONTEXT_TRACKING_USER */
>  
>  DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
> +		.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
>  		.dynticks_nesting = 1,
>  		.dynticks = ATOMIC_INIT(1),
>  };
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index eccbdbdaa02e..d3cd9e7d11fa 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -12,10 +12,6 @@
>  
>  #include <trace/events/rcu.h>
>  
> -/* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
> -#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
> -
> -
>  /*
>   * Grace-period counter management.
>   */
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8708d1a99565..c2528e65de0c 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -75,7 +75,6 @@
>  /* Data structures. */
>  
>  static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
> -	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
>  #ifdef CONFIG_RCU_NOCB_CPU
>  	.cblist.flags = SEGCBLIST_RCU_CORE,
>  #endif
> @@ -442,11 +441,11 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  	/* Check for counter underflows */
>  	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) < 0,
>  			 "RCU dynticks_nesting counter underflow!");
> -	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 0,
> +	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nmi_nesting) <= 0,
>  			 "RCU dynticks_nmi_nesting counter underflow/zero!");
>  
>  	/* Are we at first interrupt nesting level? */
> -	nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
> +	nesting = __this_cpu_read(context_tracking.dynticks_nmi_nesting);
>  	if (nesting > 1)
>  		return false;
>  
> @@ -617,11 +616,10 @@ EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
>   */
>  static noinstr void rcu_eqs_enter(bool user)
>  {
> -	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  
> -	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
>  		     ct->dynticks_nesting == 0);
>  	if (ct->dynticks_nesting != 1) {
> @@ -739,7 +737,7 @@ noinstr void rcu_user_enter(void)
>   * rcu_nmi_exit - inform RCU of exit from NMI context
>   *
>   * If we are returning from the outermost NMI handler that interrupted an
> - * RCU-idle period, update ct->dynticks and rdp->dynticks_nmi_nesting
> + * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
>   * to let the RCU grace-period handling know that the CPU is back to
>   * being RCU-idle.
>   *
> @@ -749,7 +747,6 @@ noinstr void rcu_user_enter(void)
>  noinstr void rcu_nmi_exit(void)
>  {
>  	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  
>  	instrumentation_begin();
>  	/*
> @@ -757,25 +754,25 @@ noinstr void rcu_nmi_exit(void)
>  	 * (We are exiting an NMI handler, so RCU better be paying attention
>  	 * to us!)
>  	 */
> -	WARN_ON_ONCE(rdp->dynticks_nmi_nesting <= 0);
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting <= 0);
>  	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
>  
>  	/*
>  	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
>  	 * leave it in non-RCU-idle state.
>  	 */
> -	if (rdp->dynticks_nmi_nesting != 1) {
> -		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2,
> +	if (ct->dynticks_nmi_nesting != 1) {
> +		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
>  				  atomic_read(&ct->dynticks));
> -		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
> -			   rdp->dynticks_nmi_nesting - 2);
> +		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
> +			   ct->dynticks_nmi_nesting - 2);
>  		instrumentation_end();
>  		return;
>  	}
>  
>  	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> -	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> +	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>  
>  	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
>  	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> @@ -799,7 +796,7 @@ void rcu_irq_exit_check_preempt(void)
>  
>  	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) <= 0,
>  			 "RCU dynticks_nesting counter underflow/zero!");
> -	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
> +	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nmi_nesting) !=
>  			 DYNTICK_IRQ_NONIDLE,
>  			 "Bad RCU  dynticks_nmi_nesting counter\n");
>  	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
> @@ -818,11 +815,9 @@ void rcu_irq_exit_check_preempt(void)
>  static void noinstr rcu_eqs_exit(bool user)
>  {
>  	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -	struct rcu_data *rdp;
>  	long oldval;
>  
>  	lockdep_assert_irqs_disabled();
> -	rdp = this_cpu_ptr(&rcu_data);
>  	oldval = ct->dynticks_nesting;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
>  	if (oldval) {
> @@ -842,8 +837,8 @@ static void noinstr rcu_eqs_exit(bool user)
>  	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	WRITE_ONCE(ct->dynticks_nesting, 1);
> -	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
>  	instrumentation_end();
>  }
>  
> @@ -946,7 +941,7 @@ void __rcu_irq_enter_check_tick(void)
>   * rcu_nmi_enter - inform RCU of entry to NMI context
>   *
>   * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
> - * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know
> + * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
>   * that the CPU is active.  This implementation permits nested NMIs, as
>   * long as the nesting level does not overflow an int.  (You will probably
>   * run out of stack space first.)
> @@ -957,11 +952,10 @@ void __rcu_irq_enter_check_tick(void)
>  noinstr void rcu_nmi_enter(void)
>  {
>  	long incby = 2;
> -	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  
>  	/* Complain about underflow. */
> -	WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0);
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting < 0);
>  
>  	/*
>  	 * If idle from RCU viewpoint, atomically increment ->dynticks
> @@ -995,11 +989,11 @@ noinstr void rcu_nmi_enter(void)
>  	}
>  
>  	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
> -			  rdp->dynticks_nmi_nesting,
> -			  rdp->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
> +			  ct->dynticks_nmi_nesting,
> +			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
>  	instrumentation_end();
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */
> -		   rdp->dynticks_nmi_nesting + incby);
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
> +		   ct->dynticks_nmi_nesting + incby);
>  	barrier();
>  }
>  
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 8050bab08f39..56d38568292b 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -186,7 +186,6 @@ struct rcu_data {
>  
>  	/* 3) dynticks interface. */
>  	int dynticks_snap;		/* Per-GP tracking for dynticks. */
> -	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
>  	bool rcu_need_heavy_qs;		/* GP old, so heavy quiescent state! */
>  	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
>  	bool rcu_forced_tick;		/* Forced tick to provide QS. */
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 30a5e0a8ddb3..9bf5cc79d5eb 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -460,7 +460,7 @@ static void print_cpu_stall_info(int cpu)
>  				"!."[!delta],
>  	       ticks_value, ticks_title,
>  	       rcu_dynticks_snap(cpu) & 0xfff,
> -	       ct->dynticks_nesting, rdp->dynticks_nmi_nesting,
> +	       ct->dynticks_nesting, ct->dynticks_nmi_nesting,
>  	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
>  	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
>  	       falsepositive ? " (false positive?)" : "");
> -- 
> 2.25.1
> 
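
The arithmetic being moved here is subtle: an interrupt that finds the CPU
in an extended quiescent state adds 1 to dynticks_nmi_nesting, a nested one
adds 2, so only the outermost handler sees the value 1 on exit and restores
RCU-idleness; while a task runs, the counter is parked at
DYNTICK_IRQ_NONIDLE = (LONG_MAX / 2) + 1 so irq exits never reach that
path. A compilable userspace model of the enter/exit arithmetic (not
kernel code):

static long nmi_nesting;	/* 0 while the CPU is RCU-idle */

static void nmi_enter_model(int cpu_was_in_eqs)
{
	/* The kernel also exits the EQS itself in the first case. */
	nmi_nesting += cpu_was_in_eqs ? 1 : 2;
}

static int nmi_exit_model(void)	/* returns 1 if RCU-idleness is restored */
{
	if (nmi_nesting != 1) {
		nmi_nesting -= 2;	/* still nested: accounting only */
		return 0;
	}
	nmi_nesting = 0;		/* outermost handler: back to EQS */
	return 1;
}

int main(void)
{
	nmi_enter_model(1);	/* outermost NMI interrupted an EQS: 0 -> 1 */
	nmi_enter_model(0);	/* nested NMI: 1 -> 3 */
	if (nmi_exit_model())	/* inner exit: 3 -> 1, not idle yet */
		return 1;
	return !nmi_exit_model();	/* outer exit: 1 -> 0, idle again */
}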

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 13/19] rcu/context-tracking: Move deferred nocb resched to context tracking
  2022-03-02 15:48 ` [PATCH 13/19] rcu/context-tracking: Move deferred nocb resched " Frederic Weisbecker
@ 2022-03-10 20:04   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:04 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:04PM +0100, Frederic Weisbecker wrote:
> To prepare for migrating the RCU eqs accounting code to context tracking,
> split the last-resort deferred nocb resched from rcu_user_enter() and
> move it into a separate call invoked from context tracking.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/rcutree.h   |  6 ++++++
>  kernel/context_tracking.c |  8 ++++++++
>  kernel/rcu/tree.c         | 15 ++-------------
>  3 files changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index e05334c4c3d1..6d111a3c0cc0 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -78,4 +78,10 @@ int rcutree_dead_cpu(unsigned int cpu);
>  int rcutree_dying_cpu(unsigned int cpu);
>  void rcu_cpu_starting(unsigned int cpu);
>  
> +#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK))
> +void rcu_irq_work_resched(void);
> +#else
> +static inline void rcu_irq_work_resched(void) { }
> +#endif
> +
>  #endif /* __LINUX_RCUTREE_H */
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 155534c409fc..7be7a2044d3a 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -60,6 +60,8 @@ static __always_inline void context_tracking_recursion_exit(void)
>   */
>  void noinstr __ct_user_enter(enum ctx_state state)
>  {
> +	lockdep_assert_irqs_disabled();
> +
>  	/* Kernel threads aren't supposed to go to userspace */
>  	WARN_ON_ONCE(!current->mm);
>  
> @@ -81,6 +83,12 @@ void noinstr __ct_user_enter(enum ctx_state state)
>  				vtime_user_enter(current);
>  				instrumentation_end();
>  			}
> +			/*
> +			 * Unlike the generic entry implementation, we may be past the last
> +			 * rescheduling opportunity in the entry code. Trigger a self IPI
> +			 * that will fire and reschedule once we resume in user/guest mode.
> +			 */
> +			rcu_irq_work_resched();
>  			rcu_user_enter();
>  		}
>  		/*
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index c2528e65de0c..938537958c27 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -686,7 +686,7 @@ static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
>   * last resort is to fire a local irq_work that will trigger a reschedule once IRQs
>   * get re-enabled again.
>   */
> -noinstr static void rcu_irq_work_resched(void)
> +noinstr void rcu_irq_work_resched(void)
>  {
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  
> @@ -702,10 +702,7 @@ noinstr static void rcu_irq_work_resched(void)
>  	}
>  	instrumentation_end();
>  }
> -
> -#else
> -static inline void rcu_irq_work_resched(void) { }
> -#endif
> +#endif /* #if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK) */
>  
>  /**
>   * rcu_user_enter - inform RCU that we are resuming userspace.
> @@ -720,14 +717,6 @@ static inline void rcu_irq_work_resched(void) { }
>   */
>  noinstr void rcu_user_enter(void)
>  {
> -	lockdep_assert_irqs_disabled();
> -
> -	/*
> -	 * Unlike the generic entry implementation, we may be past the last
> -	 * rescheduling opportunity in the entry code. Trigger a self IPI
> -	 * that will fire and reschedule once we resume in user/guest mode.
> -	 */
> -	rcu_irq_work_resched();
>  	rcu_eqs_enter(true);
>  }
>  
> -- 
> 2.25.1
> 
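
The idea behind rcu_irq_work_resched(), now triggered from context
tracking, is a last-resort self-IPI: queue a local piece of work so that a
rescheduling opportunity fires as soon as IRQs are re-enabled on the way
back to user or guest mode. A toy userspace model of that deferral, with
invented names:

#include <stdbool.h>
#include <stdio.h>

static bool work_pending;	/* models a queued local irq_work */

static void queue_self_ipi(void)	/* stand-in for irq_work_queue() */
{
	work_pending = true;
}

static void irqs_reenable(void)	/* returning to user re-enables IRQs */
{
	if (work_pending) {	/* the deferred "IPI" fires now */
		work_pending = false;
		printf("late reschedule opportunity taken\n");
	}
}

int main(void)
{
	queue_self_ipi();	/* from the __ct_user_enter() path */
	irqs_reenable();	/* scheduler gets one more chance */
	return 0;
}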

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
  2022-03-02 15:48 ` [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking Frederic Weisbecker
@ 2022-03-10 20:07   ` Paul E. McKenney
  2022-03-11 16:02     ` Frederic Weisbecker
  2022-03-12 23:10   ` Peter Zijlstra
  1 sibling, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:05PM +0100, Frederic Weisbecker wrote:
> Move the core RCU eqs/dynticks functions to context tracking so that
> we can later merge all that code into context tracking.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

I am not sure that you want rcu_dynticks_task_enter() and friends in
context tracking, but I have no objection to them living there.  ;-)

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking.h |  12 ++
>  include/linux/rcutree.h          |   3 +
>  kernel/context_tracking.c        | 347 +++++++++++++++++++++++++++++++
>  kernel/rcu/tree.c                | 326 +----------------------------
>  kernel/rcu/tree.h                |   5 -
>  kernel/rcu/tree_plugin.h         |  36 +---
>  6 files changed, 366 insertions(+), 363 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 52a2e23d5107..086546569d14 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -122,6 +122,18 @@ static inline void context_tracking_init(void) { }
>  #ifdef CONFIG_CONTEXT_TRACKING
>  extern void ct_idle_enter(void);
>  extern void ct_idle_exit(void);
> +extern unsigned long rcu_dynticks_inc(int incby);
> +
> +/*
> + * Is the current CPU in an extended quiescent state?
> + *
> + * No ordering, as we are sampling CPU-local information.
> + */
> +static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
> +{
> +	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
> +}
> +
>  #else
>  static inline void ct_idle_enter(void) { }
>  static inline void ct_idle_exit(void) { }
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 6d111a3c0cc0..408435ff7a06 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -59,6 +59,9 @@ void rcu_irq_exit_check_preempt(void);
>  static inline void rcu_irq_exit_check_preempt(void) { }
>  #endif
>  
> +struct task_struct;
> +void rcu_preempt_deferred_qs(struct task_struct *t);
> +
>  void exit_rcu(void);
>  
>  void rcu_scheduler_starting(void);
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 7be7a2044d3a..dc24a9782bbd 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -21,6 +21,353 @@
>  #include <linux/hardirq.h>
>  #include <linux/export.h>
>  #include <linux/kprobes.h>
> +#include <trace/events/rcu.h>
> +
> +#define TPS(x)  tracepoint_string(x)
> +
> +/* Record the current task on dyntick-idle entry. */
> +static __always_inline void rcu_dynticks_task_enter(void)
> +{
> +#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> +	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
> +#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> +}
> +
> +/* Record no current task on dyntick-idle exit. */
> +static __always_inline void rcu_dynticks_task_exit(void)
> +{
> +#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> +	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
> +#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> +}
> +
> +/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> +static __always_inline void rcu_dynticks_task_trace_enter(void)
> +{
> +#ifdef CONFIG_TASKS_TRACE_RCU
> +	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> +		current->trc_reader_special.b.need_mb = true;
> +#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
> +}
> +
> +/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
> +static __always_inline void rcu_dynticks_task_trace_exit(void)
> +{
> +#ifdef CONFIG_TASKS_TRACE_RCU
> +	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> +		current->trc_reader_special.b.need_mb = false;
> +#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
> +}
> +
> +/*
> + * Increment the current CPU's context_tracking structure's ->dynticks field
> + * with ordering.  Return the new value.
> + */
> +noinstr unsigned long rcu_dynticks_inc(int incby)
> +{
> +	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
> +}
> +
> +/*
> + * Record entry into an extended quiescent state.  This is only to be
> + * called when not already in an extended quiescent state, that is,
> + * RCU is watching prior to the call to this function and is no longer
> + * watching upon return.
> + */
> +static noinstr void rcu_dynticks_eqs_enter(void)
> +{
> +	int seq;
> +
> +	/*
> +	 * CPUs seeing atomic_add_return() must see prior RCU read-side
> +	 * critical sections, and we also must force ordering with the
> +	 * next idle sojourn.
> +	 */
> +	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
> +	seq = rcu_dynticks_inc(1);
> +	// RCU is no longer watching.  Better be in extended quiescent state!
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
> +}
> +
> +/*
> + * Record exit from an extended quiescent state.  This is only to be
> + * called from an extended quiescent state, that is, RCU is not watching
> + * prior to the call to this function and is watching upon return.
> + */
> +static noinstr void rcu_dynticks_eqs_exit(void)
> +{
> +	int seq;
> +
> +	/*
> +	 * CPUs seeing atomic_add_return() must see prior idle sojourns,
> +	 * and we also must force ordering with the next RCU read-side
> +	 * critical section.
> +	 */
> +	seq = rcu_dynticks_inc(1);
> +	// RCU is now watching.  Better not be in an extended quiescent state!
> +	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
> +}
> +
> +/*
> + * Enter an RCU extended quiescent state, which can be either the
> + * idle loop or adaptive-tickless usermode execution.
> + *
> + * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
> + * the possibility of usermode upcalls having messed up our count
> + * of interrupt nesting level during the prior busy period.
> + */
> +static noinstr void rcu_eqs_enter(bool user)
> +{
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> +
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> +		     ct->dynticks_nesting == 0);
> +	if (ct->dynticks_nesting != 1) {
> +		// RCU will still be watching, so just do accounting and leave.
> +		ct->dynticks_nesting--;
> +		return;
> +	}
> +
> +	lockdep_assert_irqs_disabled();
> +	instrumentation_begin();
> +	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> +	rcu_preempt_deferred_qs(current);
> +
> +	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> +	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +
> +	instrumentation_end();
> +	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
> +	// RCU is watching here ...
> +	rcu_dynticks_eqs_enter();
> +	// ... but is no longer watching here.
> +	rcu_dynticks_task_enter();
> +}
> +
> +/**
> + * rcu_idle_enter - inform RCU that current CPU is entering idle
> + *
> + * Enter idle mode, in other words, -leave- the mode in which RCU
> + * read-side critical sections can occur.  (Though RCU read-side
> + * critical sections can occur in irq handlers in idle, a possibility
> + * handled by irq_enter() and irq_exit().)
> + *
> + * If you add or remove a call to rcu_idle_enter(), be sure to test with
> + * CONFIG_RCU_EQS_DEBUG=y.
> + */
> +void rcu_idle_enter(void)
> +{
> +	lockdep_assert_irqs_disabled();
> +	rcu_eqs_enter(false);
> +}
> +
> +#ifdef CONFIG_NO_HZ_FULL
> +/**
> + * rcu_user_enter - inform RCU that we are resuming userspace.
> + *
> + * Enter RCU idle mode right before resuming userspace.  No use of RCU
> + * is permitted between this call and rcu_user_exit(). This way the
> + * CPU doesn't need to maintain the tick for RCU maintenance purposes
> + * when the CPU runs in userspace.
> + *
> + * If you add or remove a call to rcu_user_enter(), be sure to test with
> + * CONFIG_RCU_EQS_DEBUG=y.
> + */
> +noinstr void rcu_user_enter(void)
> +{
> +	rcu_eqs_enter(true);
> +}
> +#endif /* CONFIG_NO_HZ_FULL */
> +
> +/**
> + * rcu_nmi_exit - inform RCU of exit from NMI context
> + *
> + * If we are returning from the outermost NMI handler that interrupted an
> + * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
> + * to let the RCU grace-period handling know that the CPU is back to
> + * being RCU-idle.
> + *
> + * If you add or remove a call to rcu_nmi_exit(), be sure to test
> + * with CONFIG_RCU_EQS_DEBUG=y.
> + */
> +noinstr void rcu_nmi_exit(void)
> +{
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> +
> +	instrumentation_begin();
> +	/*
> +	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
> +	 * (We are exiting an NMI handler, so RCU better be paying attention
> +	 * to us!)
> +	 */
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting <= 0);
> +	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
> +
> +	/*
> +	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
> +	 * leave it in non-RCU-idle state.
> +	 */
> +	if (ct->dynticks_nmi_nesting != 1) {
> +		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
> +				  atomic_read(&ct->dynticks));
> +		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
> +			   ct->dynticks_nmi_nesting - 2);
> +		instrumentation_end();
> +		return;
> +	}
> +
> +	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> +	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> +
> +	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> +	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +	instrumentation_end();
> +
> +	// RCU is watching here ...
> +	rcu_dynticks_eqs_enter();
> +	// ... but is no longer watching here.
> +
> +	if (!in_nmi())
> +		rcu_dynticks_task_enter();
> +}
> +
> +/*
> + * Exit an RCU extended quiescent state, which can be either the
> + * idle loop or adaptive-tickless usermode execution.
> + *
> + * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
> + * allow for the possibility of usermode upcalls messing up our count of
> + * interrupt nesting level during the busy period that is just now starting.
> + */
> +static void noinstr rcu_eqs_exit(bool user)
> +{
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> +	long oldval;
> +
> +	lockdep_assert_irqs_disabled();
> +	oldval = ct->dynticks_nesting;
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
> +	if (oldval) {
> +		// RCU was already watching, so just do accounting and leave.
> +		ct->dynticks_nesting++;
> +		return;
> +	}
> +	rcu_dynticks_task_exit();
> +	// RCU is not watching here ...
> +	rcu_dynticks_eqs_exit();
> +	// ... but is watching here.
> +	instrumentation_begin();
> +
> +	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> +	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +
> +	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> +	WRITE_ONCE(ct->dynticks_nesting, 1);
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
> +	instrumentation_end();
> +}
> +
> +/**
> + * rcu_idle_exit - inform RCU that current CPU is leaving idle
> + *
> + * Exit idle mode, in other words, -enter- the mode in which RCU
> + * read-side critical sections can occur.
> + *
> + * If you add or remove a call to rcu_idle_exit(), be sure to test with
> + * CONFIG_RCU_EQS_DEBUG=y.
> + */
> +void rcu_idle_exit(void)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	rcu_eqs_exit(false);
> +	local_irq_restore(flags);
> +}
> +EXPORT_SYMBOL_GPL(rcu_idle_exit);
> +
> +#ifdef CONFIG_NO_HZ_FULL
> +/**
> + * rcu_user_exit - inform RCU that we are exiting userspace.
> + *
> + * Exit RCU idle mode while entering the kernel because it can
> + * run an RCU read-side critical section at any time.
> + *
> + * If you add or remove a call to rcu_user_exit(), be sure to test with
> + * CONFIG_RCU_EQS_DEBUG=y.
> + */
> +void noinstr rcu_user_exit(void)
> +{
> +	rcu_eqs_exit(true);
> +}
> +#endif /* ifdef CONFIG_NO_HZ_FULL */
> +
> +/**
> + * rcu_nmi_enter - inform RCU of entry to NMI context
> + *
> + * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
> + * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
> + * that the CPU is active.  This implementation permits nested NMIs, as
> + * long as the nesting level does not overflow an int.  (You will probably
> + * run out of stack space first.)
> + *
> + * If you add or remove a call to rcu_nmi_enter(), be sure to test
> + * with CONFIG_RCU_EQS_DEBUG=y.
> + */
> +noinstr void rcu_nmi_enter(void)
> +{
> +	long incby = 2;
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> +
> +	/* Complain about underflow. */
> +	WARN_ON_ONCE(ct->dynticks_nmi_nesting < 0);
> +
> +	/*
> +	 * If idle from RCU viewpoint, atomically increment ->dynticks
> +	 * to mark non-idle and increment ->dynticks_nmi_nesting by one.
> +	 * Otherwise, increment ->dynticks_nmi_nesting by two.  This means
> +	 * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
> +	 * to be in the outermost NMI handler that interrupted an RCU-idle
> +	 * period (observation due to Andy Lutomirski).
> +	 */
> +	if (rcu_dynticks_curr_cpu_in_eqs()) {
> +
> +		if (!in_nmi())
> +			rcu_dynticks_task_exit();
> +
> +		// RCU is not watching here ...
> +		rcu_dynticks_eqs_exit();
> +		// ... but is watching here.
> +
> +		instrumentation_begin();
> +		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> +		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
> +		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> +		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +
> +		incby = 1;
> +	} else if (!in_nmi()) {
> +		instrumentation_begin();
> +		rcu_irq_enter_check_tick();
> +	} else  {
> +		instrumentation_begin();
> +	}
> +
> +	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
> +			  ct->dynticks_nmi_nesting,
> +			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
> +	instrumentation_end();
> +	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
> +		   ct->dynticks_nmi_nesting + incby);
> +	barrier();
> +}
>  
>  #ifdef CONFIG_CONTEXT_TRACKING_USER
>  
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 938537958c27..e55a44ed19b6 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -62,6 +62,7 @@
>  #include <linux/vmalloc.h>
>  #include <linux/mm.h>
>  #include <linux/kasan.h>
> +#include <linux/context_tracking.h>
>  #include "../time/tick-internal.h"
>  
>  #include "tree.h"
> @@ -259,56 +260,6 @@ void rcu_softirq_qs(void)
>  	rcu_tasks_qs(current, false);
>  }
>  
> -/*
> - * Increment the current CPU's rcu_data structure's ->dynticks field
> - * with ordering.  Return the new value.
> - */
> -static noinline noinstr unsigned long rcu_dynticks_inc(int incby)
> -{
> -	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
> -}
> -
> -/*
> - * Record entry into an extended quiescent state.  This is only to be
> - * called when not already in an extended quiescent state, that is,
> - * RCU is watching prior to the call to this function and is no longer
> - * watching upon return.
> - */
> -static noinstr void rcu_dynticks_eqs_enter(void)
> -{
> -	int seq;
> -
> -	/*
> -	 * CPUs seeing atomic_add_return() must see prior RCU read-side
> -	 * critical sections, and we also must force ordering with the
> -	 * next idle sojourn.
> -	 */
> -	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
> -	seq = rcu_dynticks_inc(1);
> -	// RCU is no longer watching.  Better be in extended quiescent state!
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
> -}
> -
> -/*
> - * Record exit from an extended quiescent state.  This is only to be
> - * called from an extended quiescent state, that is, RCU is not watching
> - * prior to the call to this function and is watching upon return.
> - */
> -static noinstr void rcu_dynticks_eqs_exit(void)
> -{
> -	int seq;
> -
> -	/*
> -	 * CPUs seeing atomic_add_return() must see prior idle sojourns,
> -	 * and we also must force ordering with the next RCU read-side
> -	 * critical section.
> -	 */
> -	seq = rcu_dynticks_inc(1);
> -	// RCU is now watching.  Better not be in an extended quiescent state!
> -	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
> -}
> -
>  /*
>   * Reset the current CPU's ->dynticks counter to indicate that the
>   * newly onlined CPU is no longer in an extended quiescent state.
> @@ -328,16 +279,6 @@ static void rcu_dynticks_eqs_online(void)
>  	rcu_dynticks_inc(1);
>  }
>  
> -/*
> - * Is the current CPU in an extended quiescent state?
> - *
> - * No ordering, as we are sampling CPU-local information.
> - */
> -static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
> -{
> -	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
> -}
> -
>  /*
>   * Snapshot the ->dynticks counter with full ordering so as to allow
>   * stable comparison of this counter with past and future snapshots.
> @@ -606,65 +547,7 @@ void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
>  }
>  EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
>  
> -/*
> - * Enter an RCU extended quiescent state, which can be either the
> - * idle loop or adaptive-tickless usermode execution.
> - *
> - * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
> - * the possibility of usermode upcalls having messed up our count
> - * of interrupt nesting level during the prior busy period.
> - */
> -static noinstr void rcu_eqs_enter(bool user)
> -{
> -	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -
> -	WARN_ON_ONCE(ct->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
> -	WRITE_ONCE(ct->dynticks_nmi_nesting, 0);
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> -		     ct->dynticks_nesting == 0);
> -	if (ct->dynticks_nesting != 1) {
> -		// RCU will still be watching, so just do accounting and leave.
> -		ct->dynticks_nesting--;
> -		return;
> -	}
> -
> -	lockdep_assert_irqs_disabled();
> -	instrumentation_begin();
> -	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> -	rcu_preempt_deferred_qs(current);
> -
> -	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> -	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> -
> -	instrumentation_end();
> -	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
> -	// RCU is watching here ...
> -	rcu_dynticks_eqs_enter();
> -	// ... but is no longer watching here.
> -	rcu_dynticks_task_enter();
> -}
> -
> -/**
> - * rcu_idle_enter - inform RCU that current CPU is entering idle
> - *
> - * Enter idle mode, in other words, -leave- the mode in which RCU
> - * read-side critical sections can occur.  (Though RCU read-side
> - * critical sections can occur in irq handlers in idle, a possibility
> - * handled by irq_enter() and irq_exit().)
> - *
> - * If you add or remove a call to rcu_idle_enter(), be sure to test with
> - * CONFIG_RCU_EQS_DEBUG=y.
> - */
> -void rcu_idle_enter(void)
> -{
> -	lockdep_assert_irqs_disabled();
> -	rcu_eqs_enter(false);
> -}
> -
> -#ifdef CONFIG_NO_HZ_FULL
> -
> -#if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)
> +#if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK))
>  /*
>   * An empty function that will trigger a reschedule on
>   * IRQ tail once IRQs get re-enabled on userspace/guest resume.
> @@ -702,78 +585,7 @@ noinstr void rcu_irq_work_resched(void)
>  	}
>  	instrumentation_end();
>  }
> -#endif /* #if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK) */
> -
> -/**
> - * rcu_user_enter - inform RCU that we are resuming userspace.
> - *
> - * Enter RCU idle mode right before resuming userspace.  No use of RCU
> - * is permitted between this call and rcu_user_exit(). This way the
> - * CPU doesn't need to maintain the tick for RCU maintenance purposes
> - * when the CPU runs in userspace.
> - *
> - * If you add or remove a call to rcu_user_enter(), be sure to test with
> - * CONFIG_RCU_EQS_DEBUG=y.
> - */
> -noinstr void rcu_user_enter(void)
> -{
> -	rcu_eqs_enter(true);
> -}
> -
> -#endif /* CONFIG_NO_HZ_FULL */
> -
> -/**
> - * rcu_nmi_exit - inform RCU of exit from NMI context
> - *
> - * If we are returning from the outermost NMI handler that interrupted an
> - * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
> - * to let the RCU grace-period handling know that the CPU is back to
> - * being RCU-idle.
> - *
> - * If you add or remove a call to rcu_nmi_exit(), be sure to test
> - * with CONFIG_RCU_EQS_DEBUG=y.
> - */
> -noinstr void rcu_nmi_exit(void)
> -{
> -	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -
> -	instrumentation_begin();
> -	/*
> -	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
> -	 * (We are exiting an NMI handler, so RCU better be paying attention
> -	 * to us!)
> -	 */
> -	WARN_ON_ONCE(ct->dynticks_nmi_nesting <= 0);
> -	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
> -
> -	/*
> -	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
> -	 * leave it in non-RCU-idle state.
> -	 */
> -	if (ct->dynticks_nmi_nesting != 1) {
> -		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
> -				  atomic_read(&ct->dynticks));
> -		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
> -			   ct->dynticks_nmi_nesting - 2);
> -		instrumentation_end();
> -		return;
> -	}
> -
> -	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> -	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
> -	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> -
> -	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> -	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> -	instrumentation_end();
> -
> -	// RCU is watching here ...
> -	rcu_dynticks_eqs_enter();
> -	// ... but is no longer watching here.
> -
> -	if (!in_nmi())
> -		rcu_dynticks_task_enter();
> -}
> +#endif /* #if defined(CONFIG_NO_HZ_FULL) && (!defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)) */
>  
>  #ifdef CONFIG_PROVE_RCU
>  /**
> @@ -793,77 +605,6 @@ void rcu_irq_exit_check_preempt(void)
>  }
>  #endif /* #ifdef CONFIG_PROVE_RCU */
>  
> -/*
> - * Exit an RCU extended quiescent state, which can be either the
> - * idle loop or adaptive-tickless usermode execution.
> - *
> - * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
> - * allow for the possibility of usermode upcalls messing up our count of
> - * interrupt nesting level during the busy period that is just now starting.
> - */
> -static void noinstr rcu_eqs_exit(bool user)
> -{
> -	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -	long oldval;
> -
> -	lockdep_assert_irqs_disabled();
> -	oldval = ct->dynticks_nesting;
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
> -	if (oldval) {
> -		// RCU was already watching, so just do accounting and leave.
> -		ct->dynticks_nesting++;
> -		return;
> -	}
> -	rcu_dynticks_task_exit();
> -	// RCU is not watching here ...
> -	rcu_dynticks_eqs_exit();
> -	// ... but is watching here.
> -	instrumentation_begin();
> -
> -	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> -	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> -
> -	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> -	WRITE_ONCE(ct->dynticks_nesting, 1);
> -	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
> -	WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
> -	instrumentation_end();
> -}
> -
> -/**
> - * rcu_idle_exit - inform RCU that current CPU is leaving idle
> - *
> - * Exit idle mode, in other words, -enter- the mode in which RCU
> - * read-side critical sections can occur.
> - *
> - * If you add or remove a call to rcu_idle_exit(), be sure to test with
> - * CONFIG_RCU_EQS_DEBUG=y.
> - */
> -void rcu_idle_exit(void)
> -{
> -	unsigned long flags;
> -
> -	local_irq_save(flags);
> -	rcu_eqs_exit(false);
> -	local_irq_restore(flags);
> -}
> -
> -#ifdef CONFIG_NO_HZ_FULL
> -/**
> - * rcu_user_exit - inform RCU that we are exiting userspace.
> - *
> - * Exit RCU idle mode while entering the kernel because it can
> - * run a RCU read side critical section anytime.
> - *
> - * If you add or remove a call to rcu_user_exit(), be sure to test with
> - * CONFIG_RCU_EQS_DEBUG=y.
> - */
> -void noinstr rcu_user_exit(void)
> -{
> -	rcu_eqs_exit(true);
> -}
> -
>  /**
>   * __rcu_irq_enter_check_tick - Enable scheduler tick on CPU if RCU needs it.
>   *
> @@ -924,67 +665,6 @@ void __rcu_irq_enter_check_tick(void)
>  	}
>  	raw_spin_unlock_rcu_node(rdp->mynode);
>  }
> -#endif /* CONFIG_NO_HZ_FULL */
> -
> -/**
> - * rcu_nmi_enter - inform RCU of entry to NMI context
> - *
> - * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
> - * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
> - * that the CPU is active.  This implementation permits nested NMIs, as
> - * long as the nesting level does not overflow an int.  (You will probably
> - * run out of stack space first.)
> - *
> - * If you add or remove a call to rcu_nmi_enter(), be sure to test
> - * with CONFIG_RCU_EQS_DEBUG=y.
> - */
> -noinstr void rcu_nmi_enter(void)
> -{
> -	long incby = 2;
> -	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -
> -	/* Complain about underflow. */
> -	WARN_ON_ONCE(ct->dynticks_nmi_nesting < 0);
> -
> -	/*
> -	 * If idle from RCU viewpoint, atomically increment ->dynticks
> -	 * to mark non-idle and increment ->dynticks_nmi_nesting by one.
> -	 * Otherwise, increment ->dynticks_nmi_nesting by two.  This means
> -	 * if ->dynticks_nmi_nesting is equal to one, we are guaranteed
> -	 * to be in the outermost NMI handler that interrupted an RCU-idle
> -	 * period (observation due to Andy Lutomirski).
> -	 */
> -	if (rcu_dynticks_curr_cpu_in_eqs()) {
> -
> -		if (!in_nmi())
> -			rcu_dynticks_task_exit();
> -
> -		// RCU is not watching here ...
> -		rcu_dynticks_eqs_exit();
> -		// ... but is watching here.
> -
> -		instrumentation_begin();
> -		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> -		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
> -		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> -		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> -
> -		incby = 1;
> -	} else if (!in_nmi()) {
> -		instrumentation_begin();
> -		rcu_irq_enter_check_tick();
> -	} else  {
> -		instrumentation_begin();
> -	}
> -
> -	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
> -			  ct->dynticks_nmi_nesting,
> -			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
> -	instrumentation_end();
> -	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
> -		   ct->dynticks_nmi_nesting + incby);
> -	barrier();
> -}
>  
>  /*
>   * Check to see if any future non-offloaded RCU-related work will need
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 56d38568292b..a42c2a737e24 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -426,7 +426,6 @@ static void rcu_cpu_kthread_setup(unsigned int cpu);
>  static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp);
>  static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
>  static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
> -static void rcu_preempt_deferred_qs(struct task_struct *t);
>  static void zero_cpu_stall_ticks(struct rcu_data *rdp);
>  static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
>  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
> @@ -466,10 +465,6 @@ do {								\
>  
>  static void rcu_bind_gp_kthread(void);
>  static bool rcu_nohz_full_cpu(void);
> -static void rcu_dynticks_task_enter(void);
> -static void rcu_dynticks_task_exit(void);
> -static void rcu_dynticks_task_trace_enter(void);
> -static void rcu_dynticks_task_trace_exit(void);
>  
>  /* Forward declarations for tree_stall.h */
>  static void record_gp_stall_check_time(void);
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 6b9bcd45c7b2..be4b74b46109 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -595,7 +595,7 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
>   * evaluate safety in terms of interrupt, softirq, and preemption
>   * disabling.
>   */
> -static void rcu_preempt_deferred_qs(struct task_struct *t)
> +void rcu_preempt_deferred_qs(struct task_struct *t)
>  {
>  	unsigned long flags;
>  
> @@ -1283,37 +1283,3 @@ static void rcu_bind_gp_kthread(void)
>  		return;
>  	housekeeping_affine(current, HK_FLAG_RCU);
>  }
> -
> -/* Record the current task on dyntick-idle entry. */
> -static __always_inline void rcu_dynticks_task_enter(void)
> -{
> -#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> -	WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
> -#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> -}
> -
> -/* Record no current task on dyntick-idle exit. */
> -static __always_inline void rcu_dynticks_task_exit(void)
> -{
> -#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> -	WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
> -#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> -}
> -
> -/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
> -static __always_inline void rcu_dynticks_task_trace_enter(void)
> -{
> -#ifdef CONFIG_TASKS_TRACE_RCU
> -	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> -		current->trc_reader_special.b.need_mb = true;
> -#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
> -}
> -
> -/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
> -static __always_inline void rcu_dynticks_task_trace_exit(void)
> -{
> -#ifdef CONFIG_TASKS_TRACE_RCU
> -	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
> -		current->trc_reader_special.b.need_mb = false;
> -#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
> -}
> -- 
> 2.25.1
> 

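The ->dynticks_nmi_nesting rule moved above (add 1 when interrupting an
RCU-idle period, add 2 otherwise, so that a value of exactly 1 marks the
outermost handler on exit) deserves a worked example. A userspace toy
model, plain C rather than kernel code:

	#include <stdio.h>

	int main(void)
	{
		long nesting = 0;	/* CPU starts RCU-idle */
		int i;

		for (i = 0; i < 3; i++) {	/* three nested entries */
			nesting += (nesting == 0) ? 1 : 2;
			printf("enter -> nesting=%ld\n", nesting);	/* 1, 3, 5 */
		}
		while (nesting > 0) {
			if (nesting == 1) {	/* outermost exit */
				printf("outermost exit -> RCU-idle again\n");
				nesting = 0;
			} else {
				nesting -= 2;
				printf("exit  -> nesting=%ld\n", nesting);	/* 3, 1 */
			}
		}
		return 0;
	}

Running it prints nesting values 1, 3, 5 on entry and 3, 1 on exit before
returning to RCU-idle, matching the observation attributed to Andy
Lutomirski in the comment above.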

* Re: [PATCH 17/19] rcu/context-tracking: Use accessor for dynticks counter value
  2022-03-02 15:48 ` [PATCH 17/19] rcu/context-tracking: Use accessor for dynticks counter value Frederic Weisbecker
@ 2022-03-10 20:08   ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:08 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:08PM +0100, Frederic Weisbecker wrote:
> The dynticks counter value is going to join the context tracking state
> in a single field. Use an accessor for this value to make the transition
> easier for all readers.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking_state.h | 17 +++++++++++++++++
>  kernel/context_tracking.c              | 10 +++++-----
>  kernel/rcu/tree.c                      | 13 ++++---------
>  3 files changed, 26 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index 3da44987dc71..bca0d3e0bd3d 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -40,6 +40,23 @@ static __always_inline int __ct_state(void)
>  {
>  	return atomic_read(this_cpu_ptr(&context_tracking.state));
>  }
> +
> +static __always_inline int ct_dynticks(void)
> +{
> +	return atomic_read(this_cpu_ptr(&context_tracking.dynticks));
> +}
> +
> +static __always_inline int ct_dynticks_cpu(int cpu)
> +{
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> +	return atomic_read(&ct->dynticks);
> +}
> +
> +static __always_inline int ct_dynticks_cpu_acquire(int cpu)
> +{
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> +	return atomic_read_acquire(&ct->dynticks);
> +}
>  #endif
>  
>  #ifdef CONFIG_CONTEXT_TRACKING_USER
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 69db43548768..fe9066fdfaab 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -133,7 +133,7 @@ static noinstr void rcu_eqs_enter(bool user)
>  
>  	lockdep_assert_irqs_disabled();
>  	instrumentation_begin();
> -	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, atomic_read(&ct->dynticks));
> +	trace_rcu_dyntick(TPS("Start"), ct->dynticks_nesting, 0, ct_dynticks());
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	rcu_preempt_deferred_qs(current);
>  
> @@ -178,7 +178,7 @@ noinstr void ct_nmi_exit(void)
>  	 */
>  	if (ct->dynticks_nmi_nesting != 1) {
>  		trace_rcu_dyntick(TPS("--="), ct->dynticks_nmi_nesting, ct->dynticks_nmi_nesting - 2,
> -				  atomic_read(&ct->dynticks));
> +				  ct_dynticks());
>  		WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */
>  			   ct->dynticks_nmi_nesting - 2);
>  		instrumentation_end();
> @@ -186,7 +186,7 @@ noinstr void ct_nmi_exit(void)
>  	}
>  
>  	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> -	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, atomic_read(&ct->dynticks));
> +	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, ct_dynticks());
>  	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>  
>  	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> @@ -231,7 +231,7 @@ static void noinstr rcu_eqs_exit(bool user)
>  	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
>  	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
>  
> -	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, atomic_read(&ct->dynticks));
> +	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, ct_dynticks());
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	WRITE_ONCE(ct->dynticks_nesting, 1);
>  	WARN_ON_ONCE(ct->dynticks_nmi_nesting);
> @@ -292,7 +292,7 @@ noinstr void ct_nmi_enter(void)
>  
>  	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
>  			  ct->dynticks_nmi_nesting,
> -			  ct->dynticks_nmi_nesting + incby, atomic_read(&ct->dynticks));
> +			  ct->dynticks_nmi_nesting + incby, ct_dynticks());
>  	instrumentation_end();
>  	WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */
>  		   ct->dynticks_nmi_nesting + incby);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index e55a44ed19b6..90a22dd2189d 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -272,9 +272,7 @@ void rcu_softirq_qs(void)
>   */
>  static void rcu_dynticks_eqs_online(void)
>  {
> -	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> -
> -	if (atomic_read(&ct->dynticks) & 0x1)
> +	if (ct_dynticks() & 0x1)
>  		return;
>  	rcu_dynticks_inc(1);
>  }
> @@ -285,10 +283,8 @@ static void rcu_dynticks_eqs_online(void)
>   */
>  static int rcu_dynticks_snap(int cpu)
>  {
> -	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> -
>  	smp_mb();  // Fundamental RCU ordering guarantee.
> -	return atomic_read_acquire(&ct->dynticks);
> +	return ct_dynticks_cpu_acquire(cpu);
>  }
>  
>  /*
> @@ -322,11 +318,10 @@ static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
>   */
>  bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
>  {
> -	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>  	int snap;
>  
>  	// If not quiescent, force back to earlier extended quiescent state.
> -	snap = atomic_read(&ct->dynticks) & ~0x1;
> +	snap = ct_dynticks_cpu(cpu) & ~0x1;
>  
>  	smp_rmb(); // Order ->dynticks and *vp reads.
>  	if (READ_ONCE(*vp))
> @@ -334,7 +329,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
>  	smp_rmb(); // Order *vp read and ->dynticks re-read.
>  
>  	// If still in the same extended quiescent state, we are good!
> -	return snap == atomic_read(&ct->dynticks);
> +	return snap == ct_dynticks_cpu(cpu);
>  }
>  
>  /*
> -- 
> 2.25.1
> 

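As a reading aid for the accessors above, here is the shape of the check
that the grace-period code builds out of them; cpu_in_eqs_since() is a
made-up name, not a drop-in for rcu_dynticks_in_eqs_since():

	/* Was @cpu in an extended quiescent state when @snap was taken,
	 * or has it passed through one since?  At this point in the
	 * series the counter is still even inside an EQS. */
	static bool cpu_in_eqs_since(int cpu, int snap)
	{
		return !(snap & 0x1) || ct_dynticks_cpu(cpu) != snap;
	}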

* Re: [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states
  2022-03-02 15:48 ` [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states Frederic Weisbecker
@ 2022-03-10 20:32   ` Paul E. McKenney
  2022-03-11 16:35     ` Frederic Weisbecker
  0 siblings, 1 reply; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-10 20:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:09PM +0100, Frederic Weisbecker wrote:
> Updating the context tracking state and the RCU dynticks counter
> atomically in a single operation is a first step towards improving CPU
> isolation. This makes the context tracking state updates fully ordered
> and therefore allows for later enhancements such as postponing some work
> while a task is running isolated in userspace, until it comes back
> to the kernel.
> 
> The state field is now divided into two parts:
> 
> 1) Lower bits for context tracking state:
> 
>    	CONTEXT_IDLE = 1,
> 	CONTEXT_USER = 2,
> 	CONTEXT_GUEST = 4,

And the CONTEXT_DISABLED value of -1 works because you can have only
one of the above three bits set at a time?

Except that RCU unconditionally needs this to at least distinguish
between kernel and idle, given the prevalence of CONFIG_NO_HZ_IDLE=y.
So does CONTEXT_DISABLED really happen anymore?

A few more questions interspersed below.

							Thanx, Paul

>    A value of 0 means we are in CONTEXT_KERNEL.
> 
> 2) Higher bits for RCU eqs dynticks counting:
> 
>     RCU_DYNTICKS_IDX = 8
> 
>    The dynticks counting is always incremented by this value.
>    (state & RCU_DYNTICKS_IDX) means we are NOT in an extended quiescent
>    state. This makes a collision between two RCU dynticks snapshots
>    more likely, but wrapping 24 bits of eqs dynticks
>    increments still takes some bad luck (also rdp.dynticks_snap could be
>    converted from int to long?)
> 
> Some RCU eqs functions have been renamed to better reflect their broader
> scope that now includes the context tracking state.
> 
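A userspace toy walk-through of the packed arithmetic described above,
assuming only the values from this patch (RCU_DYNTICKS_IDX == 8,
CONTEXT_USER == 2, dynticks count in the bits above the context-state
mask):

	#include <stdio.h>

	#define RCU_DYNTICKS_IDX	8
	#define CONTEXT_USER		2

	int main(void)
	{
		int state = RCU_DYNTICKS_IDX;	/* boot: kernel, RCU watching */

		/* user_enter() on a tracked CPU: one atomic add flips the
		 * dynticks parity *and* sets the USER bit. */
		state += RCU_DYNTICKS_IDX + CONTEXT_USER;
		printf("user:   dynticks=%d ctx=%d\n", state >> 3, state & 7);

		/* user_exit(): adding (RCU_DYNTICKS_IDX - CONTEXT_USER)
		 * bumps dynticks again while the carry clears the USER
		 * bit back to CONTEXT_KERNEL. */
		state += RCU_DYNTICKS_IDX - CONTEXT_USER;
		printf("kernel: dynticks=%d ctx=%d\n", state >> 3, state & 7);
		return 0;
	}

	/* prints:
	 *   user:   dynticks=2 ctx=2
	 *   kernel: dynticks=3 ctx=0
	 */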
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Cc: Yu Liao <liaoyu15@huawei.com>
> Cc: Phil Auld <pauld@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
>  include/linux/context_tracking.h       |  4 +-
>  include/linux/context_tracking_state.h | 33 ++++++---
>  kernel/context_tracking.c              | 92 +++++++++++++-------------
>  kernel/rcu/tree.c                      | 13 ++--
>  kernel/rcu/tree_stall.h                |  2 +-
>  5 files changed, 81 insertions(+), 63 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 63343c34ab4e..aa0e6fbf6946 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -110,7 +110,7 @@ static inline void context_tracking_init(void) { }
>  #ifdef CONFIG_CONTEXT_TRACKING
>  extern void ct_idle_enter(void);
>  extern void ct_idle_exit(void);
> -extern unsigned long rcu_dynticks_inc(int incby);
> +extern unsigned long ct_state_inc(int incby);
>  
>  /*
>   * Is the current CPU in an extended quiescent state?
> @@ -119,7 +119,7 @@ extern unsigned long rcu_dynticks_inc(int incby);
>   */
>  static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
>  {
> -	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.dynticks)) & 0x1);
> +	return !(arch_atomic_read(this_cpu_ptr(&context_tracking.state)) & RCU_DYNTICKS_IDX);
>  }
>  
>  #else
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index bca0d3e0bd3d..b8a309532c18 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -9,13 +9,29 @@
>  /* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
>  #define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
>  
> +enum {
> +	CONTEXT_IDLE_OFFSET = 0,
> +	CONTEXT_USER_OFFSET,
> +	CONTEXT_GUEST_OFFSET,
> +	CONTEXT_MAX_OFFSET,
> +};
> +
>  enum ctx_state {
> 	CONTEXT_DISABLED = -1,	/* returned by ct_state() if unknown */
>  	CONTEXT_KERNEL = 0,
> -	CONTEXT_USER,
> -	CONTEXT_GUEST,
> +	CONTEXT_IDLE = BIT(CONTEXT_IDLE_OFFSET),
> +	CONTEXT_USER = BIT(CONTEXT_USER_OFFSET),
> +	CONTEXT_GUEST = BIT(CONTEXT_GUEST_OFFSET),
> +	CONTEXT_MAX = BIT(CONTEXT_MAX_OFFSET),
>  };
>  
> +/* Even value for idle, else odd. */
> +#define RCU_DYNTICKS_SHIFT CONTEXT_MAX_OFFSET
> +#define RCU_DYNTICKS_IDX CONTEXT_MAX
> +
> +#define CT_STATE_MASK (CONTEXT_MAX - 1)
> +#define CT_DYNTICKS_MASK (~CT_STATE_MASK)
> +
>  struct context_tracking {
>  #ifdef CONFIG_CONTEXT_TRACKING_USER
>  	/*
> @@ -26,9 +42,8 @@ struct context_tracking {
>  	 */
>  	bool active;
>  	int recursion;
> +#endif
>  	atomic_t state;
> -#endif
> -	atomic_t dynticks;		/* Even value for idle, else odd. */
>  	long dynticks_nesting;		/* Track process nesting level. */
>  	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
>  };
> @@ -38,24 +53,26 @@ DECLARE_PER_CPU(struct context_tracking, context_tracking);
>  
>  static __always_inline int __ct_state(void)
>  {
> -	return atomic_read(this_cpu_ptr(&context_tracking.state));
> +	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
>  }
>  
>  static __always_inline int ct_dynticks(void)
>  {
> -	return atomic_read(this_cpu_ptr(&context_tracking.dynticks));
> +	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_DYNTICKS_MASK;
>  }
>  
>  static __always_inline int ct_dynticks_cpu(int cpu)
>  {
>  	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> -	return atomic_read(&ct->dynticks);
> +
> +	return atomic_read(&ct->state) & CT_DYNTICKS_MASK;
>  }
>  
>  static __always_inline int ct_dynticks_cpu_acquire(int cpu)
>  {
>  	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> -	return atomic_read_acquire(&ct->dynticks);
> +
> +	return atomic_read_acquire(&ct->state) & CT_DYNTICKS_MASK;
>  }
>  #endif
>  
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index fe9066fdfaab..87e7b748791c 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -63,9 +63,9 @@ static __always_inline void rcu_dynticks_task_trace_exit(void)
>   * Increment the current CPU's context_tracking structure's ->dynticks field
>   * with ordering.  Return the new value.
>   */
> -noinstr unsigned long rcu_dynticks_inc(int incby)
> +noinstr unsigned long ct_state_inc(int incby)
>  {
> -	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
> +	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.state));
>  }
>  
>  /*
> @@ -74,7 +74,7 @@ noinstr unsigned long rcu_dynticks_inc(int incby)
>   * RCU is watching prior to the call to this function and is no longer
>   * watching upon return.
>   */
> -static noinstr void rcu_dynticks_eqs_enter(void)
> +static noinstr void ct_kernel_exit_state(int offset)
>  {
>  	int seq;
>  
> @@ -84,9 +84,9 @@ static noinstr void rcu_dynticks_eqs_enter(void)
>  	 * next idle sojourn.
>  	 */
>  	rcu_dynticks_task_trace_enter();  // Before ->dynticks update!
> -	seq = rcu_dynticks_inc(1);
> +	seq = ct_state_inc(offset);
>  	// RCU is no longer watching.  Better be in extended quiescent state!
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & RCU_DYNTICKS_IDX));
>  }
>  
>  /*
> @@ -94,7 +94,7 @@ static noinstr void rcu_dynticks_eqs_enter(void)
>   * called from an extended quiescent state, that is, RCU is not watching
>   * prior to the call to this function and is watching upon return.
>   */
> -static noinstr void rcu_dynticks_eqs_exit(void)
> +static noinstr void ct_kernel_enter_state(int offset)
>  {
>  	int seq;
>  
> @@ -103,10 +103,10 @@ static noinstr void rcu_dynticks_eqs_exit(void)
>  	 * and we also must force ordering with the next RCU read-side
>  	 * critical section.
>  	 */
> -	seq = rcu_dynticks_inc(1);
> +	seq = ct_state_inc(offset);
>  	// RCU is now watching.  Better not be in an extended quiescent state!
>  	rcu_dynticks_task_trace_exit();  // After ->dynticks update!
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & RCU_DYNTICKS_IDX));
>  }
>  
>  /*
> @@ -117,7 +117,7 @@ static noinstr void rcu_dynticks_eqs_exit(void)
>   * the possibility of usermode upcalls having messed up our count
>   * of interrupt nesting level during the prior busy period.
>   */
> -static noinstr void rcu_eqs_enter(bool user)
> +static noinstr void ct_kernel_exit(bool user, int offset)
>  {
>  	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  
> @@ -137,13 +137,13 @@ static noinstr void rcu_eqs_enter(bool user)
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	rcu_preempt_deferred_qs(current);
>  
> -	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> -	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +	// instrumentation for the noinstr ct_kernel_exit_state()
> +	instrument_atomic_write(&ct->state, sizeof(ct->state));
>  
>  	instrumentation_end();
>  	WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */
>  	// RCU is watching here ...
> -	rcu_dynticks_eqs_enter();
> +	ct_kernel_exit_state(offset);
>  	// ... but is no longer watching here.
>  	rcu_dynticks_task_enter();
>  }
> @@ -152,7 +152,7 @@ static noinstr void rcu_eqs_enter(bool user)
>   * ct_nmi_exit - inform RCU of exit from NMI context
>   *
>   * If we are returning from the outermost NMI handler that interrupted an
> - * RCU-idle period, update ct->dynticks and ct->dynticks_nmi_nesting
> + * RCU-idle period, update ct->state and ct->dynticks_nmi_nesting
>   * to let the RCU grace-period handling know that the CPU is back to
>   * being RCU-idle.
>   *
> @@ -189,12 +189,12 @@ noinstr void ct_nmi_exit(void)
>  	trace_rcu_dyntick(TPS("Startirq"), ct->dynticks_nmi_nesting, 0, ct_dynticks());
>  	WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>  
> -	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
> -	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +	// instrumentation for the noinstr ct_kernel_exit_state()
> +	instrument_atomic_write(&ct->state, sizeof(ct->state));
>  	instrumentation_end();
>  
>  	// RCU is watching here ...
> -	rcu_dynticks_eqs_enter();
> +	ct_kernel_exit_state(RCU_DYNTICKS_IDX);
>  	// ... but is no longer watching here.
>  
>  	if (!in_nmi())
> @@ -209,7 +209,7 @@ noinstr void ct_nmi_exit(void)
>   * allow for the possibility of usermode upcalls messing up our count of
>   * interrupt nesting level during the busy period that is just now starting.
>   */
> -static void noinstr rcu_eqs_exit(bool user)
> +static void noinstr ct_kernel_enter(bool user, int offset)
>  {
>  	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	long oldval;
> @@ -224,12 +224,12 @@ static void noinstr rcu_eqs_exit(bool user)
>  	}
>  	rcu_dynticks_task_exit();
>  	// RCU is not watching here ...
> -	rcu_dynticks_eqs_exit();
> +	ct_kernel_enter_state(offset);
>  	// ... but is watching here.
>  	instrumentation_begin();
>  
> -	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> -	instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +	// instrumentation for the noinstr ct_kernel_enter_state()
> +	instrument_atomic_write(&ct->state, sizeof(ct->state));
>  
>  	trace_rcu_dyntick(TPS("End"), ct->dynticks_nesting, 1, ct_dynticks());
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
> @@ -242,7 +242,7 @@ static void noinstr rcu_eqs_exit(bool user)
>  /**
>   * ct_nmi_enter - inform RCU of entry to NMI context
>   *
> - * If the CPU was idle from RCU's viewpoint, update ct->dynticks and
> + * If the CPU was idle from RCU's viewpoint, update ct->state and
>   * ct->dynticks_nmi_nesting to let the RCU grace-period handling know
>   * that the CPU is active.  This implementation permits nested NMIs, as
>   * long as the nesting level does not overflow an int.  (You will probably
> @@ -273,14 +273,14 @@ noinstr void ct_nmi_enter(void)
>  			rcu_dynticks_task_exit();
>  
>  		// RCU is not watching here ...
> -		rcu_dynticks_eqs_exit();
> +		ct_kernel_enter_state(RCU_DYNTICKS_IDX);
>  		// ... but is watching here.
>  
>  		instrumentation_begin();
>  		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> -		instrument_atomic_read(&ct->dynticks, sizeof(ct->dynticks));
> -		// instrumentation for the noinstr rcu_dynticks_eqs_exit()
> -		instrument_atomic_write(&ct->dynticks, sizeof(ct->dynticks));
> +		instrument_atomic_read(&ct->state, sizeof(ct->state));
> +		// instrumentation for the noinstr ct_kernel_enter_state()
> +		instrument_atomic_write(&ct->state, sizeof(ct->state));
>  
>  		incby = 1;
>  	} else if (!in_nmi()) {
> @@ -373,22 +373,23 @@ void noinstr __ct_user_enter(enum ctx_state state)
>  			 * CPU doesn't need to maintain the tick for RCU maintenance purposes
>  			 * when the CPU runs in userspace.
>  			 */
> -			rcu_eqs_enter(true);
> +			ct_kernel_exit(true, RCU_DYNTICKS_IDX + state);
> +		} else {
> +			/*
> +			 * Even if context tracking is disabled on this CPU, because it's outside
> +			 * the full dynticks mask for example, we still have to keep track of the
> +			 * context transitions and states to prevent inconsistency on those of
> +			 * other CPUs.
> +			 * If a task triggers an exception in userspace, sleep on the exception
> +			 * handler and then migrate to another CPU, that new CPU must know where
> +			 * the exception returns by the time we call exception_exit().
> +			 * This information can only be provided by the previous CPU when it called
> +			 * exception_enter().
> +			 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
> +			 * is false because we know that CPU is not tickless.
> +			 */
> +			atomic_add(state, &ct->state);
>  		}
> -		/*
> -		 * Even if context tracking is disabled on this CPU, because it's outside
> -		 * the full dynticks mask for example, we still have to keep track of the
> -		 * context transitions and states to prevent inconsistency on those of
> -		 * other CPUs.
> -		 * If a task triggers an exception in userspace, sleep on the exception
> -		 * handler and then migrate to another CPU, that new CPU must know where
> -		 * the exception returns by the time we call exception_exit().
> -		 * This information can only be provided by the previous CPU when it called
> -		 * exception_enter().
> -		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
> -		 * is false because we know that CPU is not tickless.
> -		 */
> -		atomic_set(&ct->state, state);
>  	}
>  	context_tracking_recursion_exit();
>  }
> @@ -452,15 +453,16 @@ void noinstr __ct_user_exit(enum ctx_state state)
>  			 * Exit RCU idle mode while entering the kernel because it can
>  			 * run a RCU read side critical section anytime.
>  			 */
> -			rcu_eqs_exit(true);
> +			ct_kernel_enter(true, RCU_DYNTICKS_IDX - state);
>  			if (state == CONTEXT_USER) {
>  				instrumentation_begin();
>  				vtime_user_exit(current);
>  				trace_user_exit(0);
>  				instrumentation_end();
>  			}
> +		} else {
> +			atomic_sub(state, &ct->state);

OK, atomic_sub() got my attention.  What is going on here?  ;-)

>  		}
> -		atomic_set(&ct->state, CONTEXT_KERNEL);
>  	}
>  	context_tracking_recursion_exit();
>  }
> @@ -530,7 +532,7 @@ void __init context_tracking_init(void)
>  DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
>  		.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
>  		.dynticks_nesting = 1,
> -		.dynticks = ATOMIC_INIT(1),
> +		.state = ATOMIC_INIT(RCU_DYNTICKS_IDX),
>  };
>  EXPORT_SYMBOL_GPL(context_tracking);
>  
> @@ -548,7 +550,7 @@ EXPORT_SYMBOL_GPL(context_tracking);
>  void ct_idle_enter(void)
>  {
>  	lockdep_assert_irqs_disabled();
> -	rcu_eqs_enter(false);
> +	ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE);
>  }
>  EXPORT_SYMBOL_GPL(ct_idle_enter);
>  
> @@ -566,7 +568,7 @@ void ct_idle_exit(void)
>  	unsigned long flags;
>  
>  	local_irq_save(flags);
> -	rcu_eqs_exit(false);
> +	ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);

Nice!  This works because all transitions must be either from or
to kernel context, correct?

>  	local_irq_restore(flags);
>  
>  }
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 90a22dd2189d..98fac3d327c9 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -272,9 +272,9 @@ void rcu_softirq_qs(void)
>   */
>  static void rcu_dynticks_eqs_online(void)
>  {
> -	if (ct_dynticks() & 0x1)
> +	if (ct_dynticks() & RCU_DYNTICKS_IDX)
>  		return;
> -	rcu_dynticks_inc(1);
> +	ct_state_inc(RCU_DYNTICKS_IDX);
>  }
>  
>  /*
> @@ -293,7 +293,7 @@ static int rcu_dynticks_snap(int cpu)
>   */
>  static bool rcu_dynticks_in_eqs(int snap)
>  {
> -	return !(snap & 0x1);
> +	return !(snap & RCU_DYNTICKS_IDX);
>  }
>  
>  /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
> @@ -321,8 +321,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
>  	int snap;
>  
>  	// If not quiescent, force back to earlier extended quiescent state.
> -	snap = ct_dynticks_cpu(cpu) & ~0x1;
> -
> +	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;

Do we also need to get rid of the low-order bits?  Or is that happening
elsewhere?  Or is there some reason that they can stick around?

>  	smp_rmb(); // Order ->dynticks and *vp reads.
>  	if (READ_ONCE(*vp))
>  		return false;  // Non-zero, so report failure;
> @@ -348,9 +347,9 @@ notrace void rcu_momentary_dyntick_idle(void)
>  	int seq;
>  
>  	raw_cpu_write(rcu_data.rcu_need_heavy_qs, false);
> -	seq = rcu_dynticks_inc(2);
> +	seq = ct_state_inc(2 * RCU_DYNTICKS_IDX);
>  	/* It is illegal to call this from idle state. */
> -	WARN_ON_ONCE(!(seq & 0x1));
> +	WARN_ON_ONCE(!(seq & RCU_DYNTICKS_IDX));
>  	rcu_preempt_deferred_qs(current);
>  }
>  EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 9bf5cc79d5eb..1ac48c804006 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -459,7 +459,7 @@ static void print_cpu_stall_info(int cpu)
>  			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
>  				"!."[!delta],
>  	       ticks_value, ticks_title,
> -	       rcu_dynticks_snap(cpu) & 0xfff,
> +	       (rcu_dynticks_snap(cpu) >> RCU_DYNTICKS_SHIFT) & 0xfff,

Actually, the several low-order bits are useful when debugging, so
could you please not shift them away?  Maybe also go to 0xffff to allow
for more bits to be taken?

>  	       ct->dynticks_nesting, ct->dynticks_nmi_nesting,
>  	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
>  	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
> -- 
> 2.25.1
> 

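One more note on the stall-warning hunk discussed at the end of the patch:
with this series, rcu_dynticks_snap() goes through
ct_dynticks_cpu_acquire(), which already masks with CT_DYNTICKS_MASK, so
the context-state bits of the snapshot are zero either way. A sketch of a
print along the lines Paul suggests — illustrative only, not a proposed
fix:

	/* Unshifted snapshot, wider mask: the dynticks parity bit
	 * (RCU_DYNTICKS_IDX) stays visible in the output. */
	pr_cont("dynticks=%x ", rcu_dynticks_snap(cpu) & 0xffff);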

* Re: [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking
  2022-03-02 15:47 [PATCH 00/19] rcu/context-tracking: Merge RCU eqs-dynticks counter to context tracking Frederic Weisbecker
                   ` (18 preceding siblings ...)
  2022-03-02 15:48 ` [PATCH 19/19] context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking Frederic Weisbecker
@ 2022-03-11 11:37 ` nicolas saenz julienne
  19 siblings, 0 replies; 57+ messages in thread
From: nicolas saenz julienne @ 2022-03-11 11:37 UTC (permalink / raw)
  To: Frederic Weisbecker, LKML
  Cc: Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, 2022-03-02 at 16:47 +0100, Frederic Weisbecker wrote:
> This mixes up the RCU dynticks counter and the context tracking state
> updates into a single atomic instruction. This may serve several
> purposes:
> 
> 1) Improve CPU isolation with deferring some disturbances until sensitive
>    userspace workload completes and goes to the kernel. This can take
>    several forms, for example smp_call_function_housekeeping() or
>    on_each_housekeeping_cpu() to enqueue and execute work on all
>    housekeeping CPUs. Then an atomic operation on ct->state can defer
>    the work on nohz_full CPUs until they run in kernel (or IPI them
>    if they are in kernel mode), see this proposal by Peter:
>    https://lore.kernel.org/all/20210929151723.162004989@infradead.org/#r
> 
> 2) Unearth sysidle (https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=sysidle.2017.05.11a&id=fe5ac724d81a3c7803e60c2232718f212f3f38d4)
>    This feature allowed to shutdown the tick on the last housekeeping
>    CPU once the rest of the system is fully idle. We needed some proper
>    fully ordered context tracking for that.
> 
> Inspired by Peterz: https://lore.kernel.org/all/20210929151723.162004989@infradead.org
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	rcu/context-tracking
> 
> HEAD: e4eaff86ec91c1cbde9a113cf5232dac9f897337
> 
> Thanks,
> 	Frederic
> ---

I've been testing patches 1 to 18 on my nohz_full setup. Everything is looking
good so far.

Regards,
Nicolas


* Re: [PATCH 19/19] context_tracking: Exempt CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK from non-active tracking
  2022-03-08 16:15   ` nicolas saenz julienne
@ 2022-03-11 15:16     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-11 15:16 UTC (permalink / raw)
  To: nicolas saenz julienne
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Tue, Mar 08, 2022 at 05:15:14PM +0100, nicolas saenz julienne wrote:
> Hi Frederic,
> 
> On Wed, 2022-03-02 at 16:48 +0100, Frederic Weisbecker wrote:
> > Since a CPU may save the context tracking state using
> > exception_enter() before calling into schedule(), we need all CPUs in
> > the system to track user <-> kernel transitions and not just those that
> > really need it (nohz_full CPUs).
> > 
> > The following illustrates the issue that could otherwise happen:
> > 
> >      CPU 0 (not tracking)                       CPU 1 (tracking)
> >      -------------------                       --------------------
> >      // we are past user_enter()
> >      // but this CPU is always in
> >      // CONTEXT_KERNEL
> >      // because it doesn't track user <-> kernel
> > 
> >      ctx = exception_enter(); //ctx == CONTEXT_KERNEL
> >      schedule();
> >      ===========================================>
> >                                                return from schedule();
> >                                                exception_exit(ctx);
> >                                                //go to user in CONTEXT_KERNEL
> > 
> > However CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK doesn't play those
> > games because schedule() can't be called between user_enter() and
> > user_exit() under such a config. In this situation we can spare context
> > tracking on the CPUs that don't need it.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> > Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> > Cc: Yu Liao<liaoyu15@huawei.com>
> > Cc: Phil Auld <pauld@redhat.com>
> > Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> > Cc: Alex Belits <abelits@marvell.com>
> > ---
> >  kernel/context_tracking.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index 87e7b748791c..b1934264f77f 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -374,7 +374,7 @@ void noinstr __ct_user_enter(enum ctx_state state)
> >  			 * when the CPU runs in userspace.
> >  			 */
> >  			ct_kernel_exit(true, RCU_DYNTICKS_IDX + state);
> > -		} else {
> > +		} else if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)) {
> 
> user entry code assumes that state will be kept on all CPUs as long as context
> tracking is enabled. See kernel/entry/common.c:
> 
>    static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
>    {
>            arch_check_user_regs(regs);
>            lockdep_hardirqs_off(CALLER_ADDR0);
>            
>            CT_WARN_ON(ct_state() != CONTEXT_USER); <-- NOT HAPPY ABOUT THIS CHANGE

Good point!

So I need to do:

#ifdef CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK
#define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled_this_cpu() && (cond))
#else
#define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled() && (cond))
#endif

Thanks.

>            user_exit_irqoff();
>            
>            instrumentation_begin();
>            trace_hardirqs_off_finish();
>            instrumentation_end();
>    }
> 
> Regards,
> Nicolas


* Re: [PATCH 15/19] rcu/context-tracking: Remove unused and/or unecessary middle functions
  2022-03-09 16:40   ` nicolas saenz julienne
@ 2022-03-11 15:19     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-11 15:19 UTC (permalink / raw)
  To: nicolas saenz julienne
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 09, 2022 at 05:40:15PM +0100, nicolas saenz julienne wrote:
> On Wed, 2022-03-02 at 16:48 +0100, Frederic Weisbecker wrote:
> > Some eqs functions are now only used internally by context tracking, so
> > their public declarations can be removed.
> > 
> > Also middle functions such as rcu_user_*() and rcu_idle_*()
> > which now directly call rcu_eqs_enter() and rcu_eqs_exit() can be
> > wiped out as well.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> > Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> > Cc: Yu Liao<liaoyu15@huawei.com>
> > Cc: Phil Auld <pauld@redhat.com>
> > Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> > Cc: Alex Belits <abelits@marvell.com>
> > ---
> 
> You missed rcu_user_{enter,exit} declarations in rcupdate.h
> 
> There are also comments refering to them in kernel/context_tracking.c and
> Documentation/RCU/stallwarn.rst.

And I thought nobody would notice ;-)

Thanks!


* Re: [PATCH 16/19] context_tracking: Convert state to atomic_t
  2022-03-09 17:17   ` nicolas saenz julienne
@ 2022-03-11 15:24     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-11 15:24 UTC (permalink / raw)
  To: nicolas saenz julienne
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits, Xiongfeng Wang,
	Neeraj Upadhyay, Thomas Gleixner, Yu Liao, Boqun Feng,
	Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 09, 2022 at 06:17:02PM +0100, nicolas saenz julienne wrote:
> On Wed, 2022-03-02 at 16:48 +0100, Frederic Weisbecker wrote:
> > Context tracking's state and dynticks counter are going to be merged
> > in a single field so that both updates can happen atomically and at the
> > same time. Prepare for that with converting the state into an atomic_t.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> > Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
> > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> > Cc: Yu Liao<liaoyu15@huawei.com>
> > Cc: Phil Auld <pauld@redhat.com>
> > Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
> > Cc: Alex Belits <abelits@marvell.com>
> > ---
> >  static __always_inline bool context_tracking_in_user(void)
> >  {
> > -	return __this_cpu_read(context_tracking.state) == CONTEXT_USER;
> > +	return __ct_state() == CONTEXT_USER;
> >  }
> 
> I was wondering whether it'd make more sense to use ct_state() for extra safety
> vs preemption, but it turns out the function isn't being used at all.
> 
> I figure it'd be better to remove it altogether and leave ct_state() as the
> go-to function for this sort of check.

Ah even better!

> 
> >  #else
> >  static inline bool context_tracking_in_user(void) { return false; }
> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index de247e758767..69db43548768 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -337,6 +337,7 @@ static __always_inline void context_tracking_recursion_exit(void)
> >   */
> >  void noinstr __ct_user_enter(enum ctx_state state)
> >  {
> > +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
> 
> I wonder if there is any value to having __ct_state() take 'struct
> context_tracking *ct' as an argument to avoid a redundant this_cpu_ptr()...
> 
> >  	lockdep_assert_irqs_disabled();
> >  
> >  	/* Kernel threads aren't supposed to go to userspace */
> > @@ -345,8 +346,8 @@ void noinstr __ct_user_enter(enum ctx_state state)
> >  	if (!context_tracking_recursion_enter())
> >  		return;
> >  
> > -	if ( __this_cpu_read(context_tracking.state) != state) {
> > -		if (__this_cpu_read(context_tracking.active)) {
> > +	if (__ct_state() != state) {
> 
> ...here (and in __ct_user_exit()).

Hmm, I'll check that.

Thanks!


* Re: [PATCH 05/19] context_tracking: Split user tracking Kconfig
  2022-03-10 19:43   ` Paul E. McKenney
@ 2022-03-11 15:49     ` Frederic Weisbecker
  0 siblings, 0 replies; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-11 15:49 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Thu, Mar 10, 2022 at 11:43:46AM -0800, Paul E. McKenney wrote:
> On Wed, Mar 02, 2022 at 04:47:56PM +0100, Frederic Weisbecker wrote:
> > Context tracking is going to be used not only to track user transitions
> > but also idle/IRQs/NMIs. The user tracking part will then become a
> > seperate feature. Prepare Kconfig for that.
> 
> s/seperate/separate/ # nit

Thanks, of course I'm never sure about that one.

> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index 678a80713b21..1a3b79cfc9e3 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -762,7 +762,7 @@ config HAVE_ARCH_WITHIN_STACK_FRAMES
> >  	  and similar) by implementing an inline arch_within_stack_frames(),
> >  	  which is used by CONFIG_HARDENED_USERCOPY.
> >  
> > -config HAVE_CONTEXT_TRACKING
> > +config HAVE_CONTEXT_TRACKING_USER
> 
> Just checking...  This means that only some configs will see userland
> execution as being different than kernel execution, correct?  (Which
> is the case today, to be fair.)

Exactly!

Thanks!


* Re: [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
  2022-03-10 20:07   ` Paul E. McKenney
@ 2022-03-11 16:02     ` Frederic Weisbecker
  2022-03-11 16:14       ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-11 16:02 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Thu, Mar 10, 2022 at 12:07:05PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 02, 2022 at 04:48:05PM +0100, Frederic Weisbecker wrote:
> > Move the core RCU eqs/dynticks functions to context tracking so that
> > we can later merge all that code within context tracking.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> 
> I am not sure that you want rcu_dynticks_task_enter() and friends in
> context tracking, but I have no objection to them living there.  ;-)

I initially tried to keep them in RCU headers but their use of "current"
would imply a circular dependency with sched.h.

The not-so-appealing alternatives could be:

* macrofying them
* uninlining them and keeping them in RCU

...


* Re: [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
  2022-03-11 16:02     ` Frederic Weisbecker
@ 2022-03-11 16:14       ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-11 16:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Fri, Mar 11, 2022 at 05:02:39PM +0100, Frederic Weisbecker wrote:
> On Thu, Mar 10, 2022 at 12:07:05PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 02, 2022 at 04:48:05PM +0100, Frederic Weisbecker wrote:
> > > Move the core RCU eqs/dynticks functions to context tracking so that
> > > we can later merge all that code within context tracking.
> > > 
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > 
> > I am not sure that you want rcu_dynticks_task_enter() and friends in
> > context tracking, but I have no objection to them living there.  ;-)
> 
> I initially tried to keep them in RCU headers but their use of "current"
> would imply a circular dependency with sched.h.
> 
> The not-so-appealing alternatives could be:
> 
> * macrofying them
> * uninlining them and keeping them in RCU
> 
> ...

Sounds like good reasons for them to live outside of kernel/rcu.

							Thanx, Paul


* Re: [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states
  2022-03-10 20:32   ` Paul E. McKenney
@ 2022-03-11 16:35     ` Frederic Weisbecker
  2022-03-11 17:28       ` Paul E. McKenney
  0 siblings, 1 reply; 57+ messages in thread
From: Frederic Weisbecker @ 2022-03-11 16:35 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Thu, Mar 10, 2022 at 12:32:22PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 02, 2022 at 04:48:09PM +0100, Frederic Weisbecker wrote:
> > Updating the context tracking state and the RCU dynticks counter
> > atomically in a single operation is a first step towards improving CPU
> > isolation. This makes the context tracking state updates fully ordered
> > and therefore allows for later enhancements such as postponing some work
> > while a task is running isolated in userspace until it eventually comes back
> > to the kernel.
> > 
> > The state field becomes divided in two parts:
> > 
> > 1) Lower bits for context tracking state:
> > 
> >    	CONTEXT_IDLE = 1,
> > 	CONTEXT_USER = 2,
> > 	CONTEXT_GUEST = 4,
> 
> And the CONTEXT_DISABLED value of -1 works because you can have only
> one of the above three bits set at a time?
> 
> Except that RCU needs this to unconditionally at least distinguish
> between kernel and idle, given the prevalence of CONFIG_NO_HZ_IDLE=y.
> So does the CONTEXT_DISABLED really happen anymore?
> 
> A few more questions interspersed below.

The value of CONTEXT_DISABLED is never stored in ct->state. It is just
returned as-is when CONTEXT_TRACKING is disabled. So this shouldn't conflict
with RCU.
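
A sketch of the resulting accessor shape (illustrative only; it assumes
RCU_DYNTICKS_IDX is the lowest counter bit, so RCU_DYNTICKS_IDX - 1
masks out the CONTEXT_* bits):

    static __always_inline int ct_state(void)
    {
            if (!context_tracking_enabled())
                    return CONTEXT_DISABLED;  /* never written to ct->state */

            /* The low bits of the merged field hold CONTEXT_IDLE/USER/GUEST. */
            return atomic_read(this_cpu_ptr(&context_tracking.state)) &
                   (RCU_DYNTICKS_IDX - 1);
    }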

> > @@ -452,15 +453,16 @@ void noinstr __ct_user_exit(enum ctx_state state)
> >  			 * Exit RCU idle mode while entering the kernel because it can
> >  			 * run a RCU read side critical section anytime.
> >  			 */
> > -			rcu_eqs_exit(true);
> > +			ct_kernel_enter(true, RCU_DYNTICKS_IDX - state);
> >  			if (state == CONTEXT_USER) {
> >  				instrumentation_begin();
> >  				vtime_user_exit(current);
> >  				trace_user_exit(0);
> >  				instrumentation_end();
> >  			}
> > +		} else {
> > +			atomic_sub(state, &ct->state);
> 
> OK, atomic_sub() got my attention.  What is going on here?  ;-)

Right :-)

So that's when user context tracking is running but RCU doesn't track
user transitions. This is for example when NO_HZ_FULL=n but
VIRT_CPU_ACCOUNTING_GEN=y.

I might remove that standalone VIRT_CPU_ACCOUNTING_GEN=y case one day
but for now it's there.

Anyway, in this case we only want to track KERNEL <-> USER from the
context tracking POV, but we don't need the RCU_DYNTICKS_IDX part,
hence the spared ordering.

But it still needs to be atomic because NMIs may add RCU_DYNTICKS_IDX
to the same field.
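
Spelled out as a sketch, with ct_state_inc() being the atomic increment
on ct->state seen elsewhere in this series:

    /* User -> kernel, RCU not watching userspace on this config: */
    atomic_sub(state, &ct->state);      /* drop the CONTEXT_USER/GUEST bits */

    /* ...while an NMI on the same CPU may concurrently do: */
    ct_state_inc(RCU_DYNTICKS_IDX);     /* bump the dynticks counter part */

    /*
     * Both paths update the same atomic_t, so a non-atomic
     * read-modify-write of the low bits could lose the NMI's
     * counter increment.
     */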


> > @@ -548,7 +550,7 @@ EXPORT_SYMBOL_GPL(context_tracking);
> >  void ct_idle_enter(void)
> >  {
> >  	lockdep_assert_irqs_disabled();
> > -	rcu_eqs_enter(false);
> > +	ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE);
> >  }
> >  EXPORT_SYMBOL_GPL(ct_idle_enter);
> >  
> > @@ -566,7 +568,7 @@ void ct_idle_exit(void)
> >  	unsigned long flags;
> >  
> >  	local_irq_save(flags);
> > -	rcu_eqs_exit(false);
> > +	ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
> 
> Nice!  This works because all transitions must be either from or
> to kernel context, correct?

Exactly. There is no such thing as IDLE -> USER -> GUEST, etc...
There has to be KERNEL in the middle of each, because we never
call rcu_idle_enter() -> rcu_user_enter() for example. There has to be
rcu_idle_exit() in the middle.

(famous last words).
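
To make the arithmetic concrete, a toy walkthrough with CONTEXT_IDLE == 1
as in this series and RCU_DYNTICKS_IDX assumed to be 8:

    /* kernel -> idle: ct_kernel_exit(..., RCU_DYNTICKS_IDX + CONTEXT_IDLE) */
    state  = 0x10;      /* counter == 2, low bits == CONTEXT_KERNEL (0) */
    state += 8 + 1;     /* 0x19: counter == 3, low bits == CONTEXT_IDLE */

    /* idle -> kernel: ct_kernel_enter(..., RCU_DYNTICKS_IDX - CONTEXT_IDLE) */
    state += 8 - 1;     /* 0x20: counter == 4, low bits back to KERNEL */

    /*
     * Subtracting CONTEXT_IDLE never borrows from the counter bits
     * because the matching entry set the bit first: every transition
     * goes through kernel context, as noted above.
     */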

> >  /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
> > @@ -321,8 +321,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
> >  	int snap;
> >  
> >  	// If not quiescent, force back to earlier extended quiescent state.
> > -	snap = ct_dynticks_cpu(cpu) & ~0x1;
> > -
> > +	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;
> 
> Do we also need to get rid of the low-order bits?  Or is that happening
> elsewhere?  Or is there some reason that they can stick around?

Yep, ct_dynticks_cpu() clears the low order CONTEXT_* bits.

> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index 9bf5cc79d5eb..1ac48c804006 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -459,7 +459,7 @@ static void print_cpu_stall_info(int cpu)
> >  			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
> >  				"!."[!delta],
> >  	       ticks_value, ticks_title,
> > -	       rcu_dynticks_snap(cpu) & 0xfff,
> > +	       (rcu_dynticks_snap(cpu) >> RCU_DYNTICKS_SHIFT) & 0xfff ,
> 
> > Actually, the several low-order bits are useful when debugging, so
> > could you please not shift them away?  Maybe also go to 0xffff to allow
> > for the extra bits taken?

Yeah that makes sense, I'll change that.

Thanks a lot for the reviews!


* Re: [PATCH 18/19] rcu/context_tracking: Merge dynticks counter and context tracking states
  2022-03-11 16:35     ` Frederic Weisbecker
@ 2022-03-11 17:28       ` Paul E. McKenney
  0 siblings, 0 replies; 57+ messages in thread
From: Paul E. McKenney @ 2022-03-11 17:28 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Marcelo Tosatti,
	Paul Gortmaker, Uladzislau Rezki, Joel Fernandes

On Fri, Mar 11, 2022 at 05:35:25PM +0100, Frederic Weisbecker wrote:
> On Thu, Mar 10, 2022 at 12:32:22PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 02, 2022 at 04:48:09PM +0100, Frederic Weisbecker wrote:
> > > Updating the context tracking state and the RCU dynticks counter
> > > atomically in a single operation is a first step towards improving CPU
> > > isolation. This makes the context tracking state updates fully ordered
> > > and therefore allows for later enhancements such as postponing some work
> > > while a task is running isolated in userspace until it eventually comes back
> > > to the kernel.
> > > 
> > > The state field becomes divided in two parts:
> > > 
> > > 1) Lower bits for context tracking state:
> > > 
> > >    	CONTEXT_IDLE = 1,
> > > 	CONTEXT_USER = 2,
> > > 	CONTEXT_GUEST = 4,
> > 
> > And the CONTEXT_DISABLED value of -1 works because you can have only
> > one of the above three bits set at a time?
> > 
> > Except that RCU needs this to unconditionally at least distinguish
> > between kernel and idle, given the prevalence of CONFIG_NO_HZ_IDLE=y.
> > So does the CONTEXT_DISABLED really happen anymore?
> > 
> > A few more questions interspersed below.
> 
> The value of CONTEXT_DISABLED is never stored in ct->state. It is just
> returned as-is when CONTEXT_TRACKING is disabled. So this shouldn't conflict
> with RCU.

Whew!  ;-)

> > > @@ -452,15 +453,16 @@ void noinstr __ct_user_exit(enum ctx_state state)
> > >  			 * Exit RCU idle mode while entering the kernel because it can
> > >  			 * run a RCU read side critical section anytime.
> > >  			 */
> > > -			rcu_eqs_exit(true);
> > > +			ct_kernel_enter(true, RCU_DYNTICKS_IDX - state);
> > >  			if (state == CONTEXT_USER) {
> > >  				instrumentation_begin();
> > >  				vtime_user_exit(current);
> > >  				trace_user_exit(0);
> > >  				instrumentation_end();
> > >  			}
> > > +		} else {
> > > +			atomic_sub(state, &ct->state);
> > 
> > OK, atomic_sub() got my attention.  What is going on here?  ;-)
> 
> Right :-)
> 
> So that's when user context tracking is running but RCU doesn't track
> user transitions. This is for example when NO_HZ_FULL=n but
> VIRT_CPU_ACCOUNTING_GEN=y.
> 
> I might remove that standalone VIRT_CPU_ACCOUNTING_GEN=y case one day
> but for now it's there.
> 
> Anyway, in this case we only want to track KERNEL <-> USER from the
> context tracking POV, but we don't need the RCU_DYNTICKS_IDX part,
> hence the spared ordering.
> 
> But it still needs to be atomic because NMIs may add RCU_DYNTICKS_IDX
> to the same field.

OK, so the idea is that because NO_HZ_FULL=n, RCU doesn't care about
userspace execution?

How about looking at it the other way?  Is there some reason that RCU
shouldn't take advantage of the userspace-execution information when it
exists?  For example, in the NO_HZ_FULL=n but VIRT_CPU_ACCOUNTING_GEN=y
case, is there some chance that RCU would be ignoring a non-noinstr
function?

> > > @@ -548,7 +550,7 @@ EXPORT_SYMBOL_GPL(context_tracking);
> > >  void ct_idle_enter(void)
> > >  {
> > >  	lockdep_assert_irqs_disabled();
> > > -	rcu_eqs_enter(false);
> > > +	ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE);
> > >  }
> > >  EXPORT_SYMBOL_GPL(ct_idle_enter);
> > >  
> > > @@ -566,7 +568,7 @@ void ct_idle_exit(void)
> > >  	unsigned long flags;
> > >  
> > >  	local_irq_save(flags);
> > > -	rcu_eqs_exit(false);
> > > +	ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
> > 
> > Nice!  This works because all transitions must be either from or
> > to kernel context, correct?
> 
> Exactly. There is no such thing as IDLE -> USER -> GUEST, etc...
> There has to be KERNEL in the middle of each, because we never
> call rcu_idle_enter() -> rcu_user_enter() for example. There has to be
> rcu_idle_exit() in the middle.
> 
> (famous last words).

Works for me, for the moment, anyway.  ;-)

> > >  /* Return true if the specified CPU is currently idle from an RCU viewpoint.  */
> > > @@ -321,8 +321,7 @@ bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
> > >  	int snap;
> > >  
> > >  	// If not quiescent, force back to earlier extended quiescent state.
> > > -	snap = ct_dynticks_cpu(cpu) & ~0x1;
> > > -
> > > +	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;
> > 
> > Do we also need to get rid of the low-order bits?  Or is that happening
> > elsewhere?  Or is there some reason that they can stick around?
> 
> Yep, ct_dynticks_cpu() clears the low order CONTEXT_* bits.

Whew!  ;-)

> > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > > index 9bf5cc79d5eb..1ac48c804006 100644
> > > --- a/kernel/rcu/tree_stall.h
> > > +++ b/kernel/rcu/tree_stall.h
> > > @@ -459,7 +459,7 @@ static void print_cpu_stall_info(int cpu)
> > >  			rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
> > >  				"!."[!delta],
> > >  	       ticks_value, ticks_title,
> > > -	       rcu_dynticks_snap(cpu) & 0xfff,
> > > +	       (rcu_dynticks_snap(cpu) >> RCU_DYNTICKS_SHIFT) & 0xfff ,
> > 
> > Actually, the several low-order bits are useful when debugging, so
> > could you please not shift them away?  Maybe also go to 0xffff to allow
> > for the extra bits taken?
> 
> Yeah that makes sense, I'll change that.
> 
> Thanks a lot for the reviews!

Thank you for the series!

							Thanx, Paul


* Re: [PATCH 16/19] context_tracking: Convert state to atomic_t
  2022-03-02 15:48 ` [PATCH 16/19] context_tracking: Convert state to atomic_t Frederic Weisbecker
  2022-03-09 17:17   ` nicolas saenz julienne
@ 2022-03-12 22:54   ` Peter Zijlstra
  2022-03-21 13:32     ` Will Deacon
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-12 22:54 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes, Mark Rutland, Will Deacon

On Wed, Mar 02, 2022 at 04:48:07PM +0100, Frederic Weisbecker wrote:
> +static __always_inline int __ct_state(void)
> +{
> +	return atomic_read(this_cpu_ptr(&context_tracking.state));
> +}

One arguably horrible thing to do would be to write it like:

	return __this_cpu_read(context_tracking.state.counter);

IIRC that will actually DTRT since atomic_read() is basically defined to
be READ_ONCE() and this_cpu_read() implies the same.

Only PowerPC and s390 implement arch_atomic_read() in asm, but I don't
think they have a particularly good reason to. The only other weird case
is Alpha, where READ_ONCE() implies smp_mb() because Alpha. I'm not sure
we care about that case, hmm?

The same can be done for ct_dynticks(), which is basically the same
function with a different mask.

As mentioned elsewhere, ct_state() appears unused at the end of the
ride.
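
Spelled out, the suggested accessor would be something like (sketch):

    static __always_inline int __ct_state(void)
    {
            /*
             * An atomic_t's value lives in its ->counter member, and
             * __this_cpu_read() already implies READ_ONCE()-style access.
             */
            return __this_cpu_read(context_tracking.state.counter);
    }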


* Re: [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
  2022-03-02 15:48 ` [PATCH 14/19] rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking Frederic Weisbecker
  2022-03-10 20:07   ` Paul E. McKenney
@ 2022-03-12 23:10   ` Peter Zijlstra
  1 sibling, 0 replies; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-12 23:10 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:05PM +0100, Frederic Weisbecker wrote:
> +noinstr unsigned long rcu_dynticks_inc(int incby)
> +{
> +	return arch_atomic_add_return(incby, this_cpu_ptr(&context_tracking.dynticks));
> +}

noinstr implies noinline, making the above a rather sad little function;
would it perhaps be better to make it __always_inline?

Also; I could imagine myself doing an arch special for this such that
x86 generates:

	LOCK XADD	[reg], %gs:[var]

But that's for later I suppose, it would be but a little tweak to
percpu_add_return_op()
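
The __always_inline variant would amount to the following sketch, derived
from the quoted function:

    static __always_inline unsigned long rcu_dynticks_inc(int incby)
    {
            /*
             * arch_atomic_*() avoids the instrumentation forbidden in
             * noinstr code, and inlining removes the pointless call.
             */
            return arch_atomic_add_return(incby,
                            this_cpu_ptr(&context_tracking.dynticks));
    }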


* Re: [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting to context tracking
  2022-03-02 15:48 ` [PATCH 11/19] rcu/context_tracking: Move dynticks_nesting " Frederic Weisbecker
  2022-03-10 20:01   ` Paul E. McKenney
@ 2022-03-12 23:23   ` Peter Zijlstra
  1 sibling, 0 replies; 57+ messages in thread
From: Peter Zijlstra @ 2022-03-12 23:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Phil Auld, Alex Belits, Nicolas Saenz Julienne,
	Xiongfeng Wang, Neeraj Upadhyay, Thomas Gleixner, Yu Liao,
	Boqun Feng, Paul E . McKenney, Marcelo Tosatti, Paul Gortmaker,
	Uladzislau Rezki, Joel Fernandes

On Wed, Mar 02, 2022 at 04:48:02PM +0100, Frederic Weisbecker wrote:

> @@ -441,7 +440,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  	lockdep_assert_irqs_disabled();
>  
>  	/* Check for counter underflows */
> -	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) < 0,
> +	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) < 0,
>  			 "RCU dynticks_nesting counter underflow!");
>  	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 0,
>  			 "RCU dynticks_nmi_nesting counter underflow/zero!");
> @@ -457,7 +456,7 @@ static int rcu_is_cpu_rrupt_from_idle(void)
>  	WARN_ON_ONCE(!nesting && !is_idle_task(current));
>  
>  	/* Does CPU appear to be idle from an RCU standpoint? */
> -	return __this_cpu_read(rcu_data.dynticks_nesting) == 0;
> +	return __this_cpu_read(context_tracking.dynticks_nesting) == 0;
>  }
>  
>  #define DEFAULT_RCU_BLIMIT (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) ? 1000 : 10)

> @@ -798,7 +797,7 @@ void rcu_irq_exit_check_preempt(void)
>  {
>  	lockdep_assert_irqs_disabled();
>  
> -	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) <= 0,
> +	RCU_LOCKDEP_WARN(__this_cpu_read(context_tracking.dynticks_nesting) <= 0,
>  			 "RCU dynticks_nesting counter underflow/zero!");
>  	RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
>  			 DYNTICK_IRQ_NONIDLE,

Would it make sense to create __ct_wrappers() to access the
dynticks_{nmi_,}_nesting counters?
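
Such wrappers might look like the following sketch (the names are
assumptions, and the nmi variant presumes dynticks_nmi_nesting moves
over in the next patch):

    static __always_inline long ct_dynticks_nesting(void)
    {
            return __this_cpu_read(context_tracking.dynticks_nesting);
    }

    static __always_inline long ct_dynticks_nmi_nesting(void)
    {
            return __this_cpu_read(context_tracking.dynticks_nmi_nesting);
    }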


> @@ -4122,12 +4121,13 @@ static void rcu_init_new_rnp(struct rcu_node *rnp_leaf)
>  static void __init
>  rcu_boot_init_percpu_data(int cpu)
>  {
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  
>  	/* Set up local state, ensuring consistent view of global state. */
>  	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
>  	INIT_WORK(&rdp->strict_work, strict_work_handler);
> -	WARN_ON_ONCE(rdp->dynticks_nesting != 1);
> +	WARN_ON_ONCE(ct->dynticks_nesting != 1);
>  	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu)));
>  	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
>  	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
> @@ -4152,6 +4152,7 @@ rcu_boot_init_percpu_data(int cpu)
>  int rcutree_prepare_cpu(unsigned int cpu)
>  {
>  	unsigned long flags;
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  	struct rcu_node *rnp = rcu_get_root();
>  

> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 202129b1c7e4..30a5e0a8ddb3 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -429,6 +429,7 @@ static void print_cpu_stall_info(int cpu)
>  {
>  	unsigned long delta;
>  	bool falsepositive;
> +	struct context_tracking *ct = this_cpu_ptr(&context_tracking);
>  	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>  	char *ticks_title;
>  	unsigned long ticks_value;
> @@ -459,7 +460,7 @@ static void print_cpu_stall_info(int cpu)
>  				"!."[!delta],
>  	       ticks_value, ticks_title,
>  	       rcu_dynticks_snap(cpu) & 0xfff,
> -	       rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
> +	       ct->dynticks_nesting, rdp->dynticks_nmi_nesting,
>  	       rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
>  	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
>  	       falsepositive ? " (false positive?)" : "");

And perhaps helpers here too? RCU grubbing in the context_tracking
internals seems a bit yuck.
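
A cross-CPU accessor in the same spirit could keep the stall printer out
of context tracking internals (sketch; the name is an assumption):

    static __always_inline long ct_dynticks_nesting_cpu(int cpu)
    {
            struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);

            return ct->dynticks_nesting;
    }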


* Re: [PATCH 16/19] context_tracking: Convert state to atomic_t
  2022-03-12 22:54   ` Peter Zijlstra
@ 2022-03-21 13:32     ` Will Deacon
  0 siblings, 0 replies; 57+ messages in thread
From: Will Deacon @ 2022-03-21 13:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, LKML, Phil Auld, Alex Belits,
	Nicolas Saenz Julienne, Xiongfeng Wang, Neeraj Upadhyay,
	Thomas Gleixner, Yu Liao, Boqun Feng, Paul E . McKenney,
	Marcelo Tosatti, Paul Gortmaker, Uladzislau Rezki,
	Joel Fernandes, Mark Rutland

On Sat, Mar 12, 2022 at 11:54:09PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 02, 2022 at 04:48:07PM +0100, Frederic Weisbecker wrote:
> > +static __always_inline int __ct_state(void)
> > +{
> > +	return atomic_read(this_cpu_ptr(&context_tracking.state));
> > +}
> 
> One arguably horrible thing to do would be to write it like:
> 
> 	return __this_cpu_read(context_tracking.state.counter);
> 
> IIRC that will actually DTRT since atomic_read() is basically defined to
> be READ_ONCE() and this_cpu_read() implies the same.
> 
> Only PowerPC and s390 implement arch_atomic_read() in asm, but I don't
> think they have a particularly good reason to. The only other weird case
> is Alpha, where READ_ONCE() implies smp_mb() because Alpha. I'm not sure
> we care about that case, hmm?

If we don't care about the dependency ordering, then __READ_ONCE() is the
chappy to use if the types don't get in the way too much.

Will
