linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kprobes: Fix to delay the kprobes jump optimization
@ 2021-02-18 14:29 Masami Hiramatsu
  2021-02-18 15:15 ` Paul E. McKenney
  2021-02-19 19:36 ` Steven Rostedt
  0 siblings, 2 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2021-02-18 14:29 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: Paul E . McKenney, Masami Hiramatsu, Sebastian Andrzej Siewior,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
moved the kprobe setup in early_initcall(), which includes kprobe
jump optimization.
The kprobes jump optimizer involves synchronize_rcu_tasks() which
depends on the ksoftirqd and rcu_spawn_tasks_*(). However, since
those are setup in core_initcall(), kprobes jump optimizer can not
run at the early_initcall().

To avoid this issue, make the kprobe optimization disabled in the
early_initcall() and enables it in subsys_initcall().

Note that non-optimized kprobes is still available after
early_initcall(). Only jump optimization is delayed.

Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
---
 kernel/kprobes.c |   31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d5a3eb74a657..779d8322e307 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -861,7 +861,6 @@ static void try_to_optimize_kprobe(struct kprobe *p)
 	cpus_read_unlock();
 }
 
-#ifdef CONFIG_SYSCTL
 static void optimize_all_kprobes(void)
 {
 	struct hlist_head *head;
@@ -887,6 +886,7 @@ static void optimize_all_kprobes(void)
 	mutex_unlock(&kprobe_mutex);
 }
 
+#ifdef CONFIG_SYSCTL
 static void unoptimize_all_kprobes(void)
 {
 	struct hlist_head *head;
@@ -2497,18 +2497,14 @@ static int __init init_kprobes(void)
 		}
 	}
 
-#if defined(CONFIG_OPTPROBES)
-#if defined(__ARCH_WANT_KPROBES_INSN_SLOT)
-	/* Init kprobe_optinsn_slots */
-	kprobe_optinsn_slots.insn_size = MAX_OPTINSN_SIZE;
-#endif
-	/* By default, kprobes can be optimized */
-	kprobes_allow_optimization = true;
-#endif
-
 	/* By default, kprobes are armed */
 	kprobes_all_disarmed = false;
 
+#if defined(CONFIG_OPTPROBES) && defined(__ARCH_WANT_KPROBES_INSN_SLOT)
+	/* Init kprobe_optinsn_slots for allocation */
+	kprobe_optinsn_slots.insn_size = MAX_OPTINSN_SIZE;
+#endif
+
 	err = arch_init_kprobes();
 	if (!err)
 		err = register_die_notifier(&kprobe_exceptions_nb);
@@ -2523,6 +2519,21 @@ static int __init init_kprobes(void)
 }
 early_initcall(init_kprobes);
 
+#if defined(CONFIG_OPTPROBES)
+static int __init init_optprobes(void)
+{
+	/*
+	 * Enable kprobe optimization - this kicks the optimizer which
+	 * depends on synchronize_rcu_tasks() and ksoftirqd, that is
+	 * not spawned in early initcall. So delay the optimization.
+	 */
+	optimize_all_kprobes();
+
+	return 0;
+}
+subsys_initcall(init_optprobes);
+#endif
+
 #ifdef CONFIG_DEBUG_FS
 static void report_probe(struct seq_file *pi, struct kprobe *p,
 		const char *sym, int offset, char *modname, struct kprobe *pp)


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-18 14:29 [PATCH] kprobes: Fix to delay the kprobes jump optimization Masami Hiramatsu
@ 2021-02-18 15:15 ` Paul E. McKenney
  2021-02-19  8:17   ` Sebastian Andrzej Siewior
  2021-02-19 19:36 ` Steven Rostedt
  1 sibling, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-18 15:15 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Steven Rostedt, Sebastian Andrzej Siewior,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Thu, Feb 18, 2021 at 11:29:23PM +0900, Masami Hiramatsu wrote:
> Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> moved the kprobe setup in early_initcall(), which includes kprobe
> jump optimization.
> The kprobes jump optimizer involves synchronize_rcu_tasks() which
> depends on the ksoftirqd and rcu_spawn_tasks_*(). However, since
> those are setup in core_initcall(), kprobes jump optimizer can not
> run at the early_initcall().
> 
> To avoid this issue, make the kprobe optimization disabled in the
> early_initcall() and enables it in subsys_initcall().
> 
> Note that non-optimized kprobes is still available after
> early_initcall(). Only jump optimization is delayed.
> 
> Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> Reported-by: Paul E. McKenney <paulmck@kernel.org>

Thank you, but the original report of a problem was from Sebastian
and the connection to softirq was Uladzislau.  So could you please
add these before (or even in place of) my Reported-by?

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: Uladzislau Rezki <urezki@gmail.com>

Other than that, looks good!

Acked-by: Paul E. McKenney <paulmck@kernel.org>

							Thanx, Paul

> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: stable@vger.kernel.org
> ---
>  kernel/kprobes.c |   31 +++++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index d5a3eb74a657..779d8322e307 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -861,7 +861,6 @@ static void try_to_optimize_kprobe(struct kprobe *p)
>  	cpus_read_unlock();
>  }
>  
> -#ifdef CONFIG_SYSCTL
>  static void optimize_all_kprobes(void)
>  {
>  	struct hlist_head *head;
> @@ -887,6 +886,7 @@ static void optimize_all_kprobes(void)
>  	mutex_unlock(&kprobe_mutex);
>  }
>  
> +#ifdef CONFIG_SYSCTL
>  static void unoptimize_all_kprobes(void)
>  {
>  	struct hlist_head *head;
> @@ -2497,18 +2497,14 @@ static int __init init_kprobes(void)
>  		}
>  	}
>  
> -#if defined(CONFIG_OPTPROBES)
> -#if defined(__ARCH_WANT_KPROBES_INSN_SLOT)
> -	/* Init kprobe_optinsn_slots */
> -	kprobe_optinsn_slots.insn_size = MAX_OPTINSN_SIZE;
> -#endif
> -	/* By default, kprobes can be optimized */
> -	kprobes_allow_optimization = true;
> -#endif
> -
>  	/* By default, kprobes are armed */
>  	kprobes_all_disarmed = false;
>  
> +#if defined(CONFIG_OPTPROBES) && defined(__ARCH_WANT_KPROBES_INSN_SLOT)
> +	/* Init kprobe_optinsn_slots for allocation */
> +	kprobe_optinsn_slots.insn_size = MAX_OPTINSN_SIZE;
> +#endif
> +
>  	err = arch_init_kprobes();
>  	if (!err)
>  		err = register_die_notifier(&kprobe_exceptions_nb);
> @@ -2523,6 +2519,21 @@ static int __init init_kprobes(void)
>  }
>  early_initcall(init_kprobes);
>  
> +#if defined(CONFIG_OPTPROBES)
> +static int __init init_optprobes(void)
> +{
> +	/*
> +	 * Enable kprobe optimization - this kicks the optimizer which
> +	 * depends on synchronize_rcu_tasks() and ksoftirqd, that is
> +	 * not spawned in early initcall. So delay the optimization.
> +	 */
> +	optimize_all_kprobes();
> +
> +	return 0;
> +}
> +subsys_initcall(init_optprobes);
> +#endif
> +
>  #ifdef CONFIG_DEBUG_FS
>  static void report_probe(struct seq_file *pi, struct kprobe *p,
>  		const char *sym, int offset, char *modname, struct kprobe *pp)
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-18 15:15 ` Paul E. McKenney
@ 2021-02-19  8:17   ` Sebastian Andrzej Siewior
  2021-02-19 10:49     ` Uladzislau Rezki
  0 siblings, 1 reply; 30+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-02-19  8:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Masami Hiramatsu, Ingo Molnar, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, Uladzislau Rezki, LKML, RCU, Michael Ellerman,
	Andrew Morton, Daniel Axtens, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Michal Hocko,
	Theodore Y . Ts'o, Oleksiy Avramchenko

On 2021-02-18 07:15:54 [-0800], Paul E. McKenney wrote:
> Thank you, but the original report of a problem was from Sebastian
> and the connection to softirq was Uladzislau.  So could you please
> add these before (or even in place of) my Reported-by?
> 
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Reported-by: Uladzislau Rezki <urezki@gmail.com>
> 
> Other than that, looks good!

Perfect. I'm kind of lost here, nevertheless ;) Does this mean that the
RCU selftest can now be delayed?

> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> 
> 							Thanx, Paul

Sebastian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19  8:17   ` Sebastian Andrzej Siewior
@ 2021-02-19 10:49     ` Uladzislau Rezki
  2021-02-19 10:57       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-19 10:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Masami Hiramatsu, Ingo Molnar, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 09:17:55AM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-18 07:15:54 [-0800], Paul E. McKenney wrote:
> > Thank you, but the original report of a problem was from Sebastian
> > and the connection to softirq was Uladzislau.  So could you please
> > add these before (or even in place of) my Reported-by?
> > 
> > Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Reported-by: Uladzislau Rezki <urezki@gmail.com>
> > 
> > Other than that, looks good!
> 
> Perfect. I'm kind of lost here, nevertheless ;) Does this mean that the
> RCU selftest can now be delayed?
> 
If above fix works, we can initialize rcu_init_tasks_generic() from the
core_initcall() including selftst. It means that such initialization can
be done later:

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 5cc6deaa5df2..ae7d0cdfa9bd 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -88,12 +88,6 @@ void rcu_sched_clock_irq(int user);
 void rcu_report_dead(unsigned int cpu);
 void rcutree_migrate_callbacks(int cpu);
 
-#ifdef CONFIG_TASKS_RCU_GENERIC
-void rcu_init_tasks_generic(void);
-#else
-static inline void rcu_init_tasks_generic(void) { }
-#endif
-
 #ifdef CONFIG_RCU_STALL_COMMON
 void rcu_sysrq_start(void);
 void rcu_sysrq_end(void);
diff --git a/init/main.c b/init/main.c
index c68d784376ca..3024c4db17a9 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1512,7 +1512,6 @@ static noinline void __init kernel_init_freeable(void)
 
 	init_mm_internals();
 
-	rcu_init_tasks_generic();
 	do_pre_smp_initcalls();
 	lockup_detector_init();
 
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 17c8ebe131af..2797f9a042f4 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -966,11 +966,6 @@ static void rcu_tasks_trace_pregp_step(void)
 static void rcu_tasks_trace_pertask(struct task_struct *t,
 				    struct list_head *hop)
 {
-	// During early boot when there is only the one boot CPU, there
-	// is no idle task for the other CPUs. Just return.
-	if (unlikely(t == NULL))
-		return;
-
 	WRITE_ONCE(t->trc_reader_special.b.need_qs, false);
 	WRITE_ONCE(t->trc_reader_checked, false);
 	t->trc_ipi_to_cpu = -1;
@@ -1300,7 +1295,7 @@ late_initcall(rcu_tasks_verify_self_tests);
 static void rcu_tasks_initiate_self_tests(void) { }
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 
-void __init rcu_init_tasks_generic(void)
+static void __init rcu_init_tasks_generic(void)
 {
 #ifdef CONFIG_TASKS_RCU
 	rcu_spawn_tasks_kthread();
@@ -1318,6 +1313,7 @@ void __init rcu_init_tasks_generic(void)
 	rcu_tasks_initiate_self_tests();
 }
 
+core_initcall(rcu_init_tasks_generic);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 static inline void rcu_tasks_bootup_oddness(void) {}
 void show_rcu_tasks_gp_kthreads(void) {}


--
Vlad Rezki

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 10:49     ` Uladzislau Rezki
@ 2021-02-19 10:57       ` Sebastian Andrzej Siewior
  2021-02-19 11:13         ` Uladzislau Rezki
  0 siblings, 1 reply; 30+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-02-19 10:57 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, Masami Hiramatsu, Ingo Molnar, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, LKML, RCU, Michael Ellerman,
	Andrew Morton, Daniel Axtens, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Michal Hocko,
	Theodore Y . Ts'o, Oleksiy Avramchenko

On 2021-02-19 11:49:58 [+0100], Uladzislau Rezki wrote:
> If above fix works, we can initialize rcu_init_tasks_generic() from the
> core_initcall() including selftst. It means that such initialization can
> be done later:

Good. Please let me know once there is something for me to test.
Do I assume correctly that the self-test, I stumbled upon, is v5.12
material?

Sebastian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 10:57       ` Sebastian Andrzej Siewior
@ 2021-02-19 11:13         ` Uladzislau Rezki
  2021-02-19 11:17           ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-19 11:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Paul E. McKenney
  Cc: Uladzislau Rezki, Paul E. McKenney, Masami Hiramatsu,
	Ingo Molnar, Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	LKML, RCU, Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 11:57:10AM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-19 11:49:58 [+0100], Uladzislau Rezki wrote:
> > If above fix works, we can initialize rcu_init_tasks_generic() from the
> > core_initcall() including selftst. It means that such initialization can
> > be done later:
> 
> Good. Please let me know once there is something for me to test.
> Do I assume correctly that the self-test, I stumbled upon, is v5.12
> material?
> 
I or Paul will ask for a test once it is settled down :) Looks like
it is, so we should fix for v5.12.

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 11:13         ` Uladzislau Rezki
@ 2021-02-19 11:17           ` Sebastian Andrzej Siewior
  2021-02-19 11:23             ` Uladzislau Rezki
  0 siblings, 1 reply; 30+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-02-19 11:17 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Paul E. McKenney, Masami Hiramatsu, Ingo Molnar, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, LKML, RCU, Michael Ellerman,
	Andrew Morton, Daniel Axtens, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Michal Hocko,
	Theodore Y . Ts'o, Oleksiy Avramchenko

On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> I or Paul will ask for a test once it is settled down :) Looks like
> it is, so we should fix for v5.12.

Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
parts of it are also -stable material.

Sebastian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 11:17           ` Sebastian Andrzej Siewior
@ 2021-02-19 11:23             ` Uladzislau Rezki
  2021-02-19 11:27               ` Uladzislau Rezki
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-19 11:23 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Uladzislau Rezki, Paul E. McKenney, Masami Hiramatsu,
	Ingo Molnar, Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	LKML, RCU, Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 12:17:38PM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> > I or Paul will ask for a test once it is settled down :) Looks like
> > it is, so we should fix for v5.12.
> 
> Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
> parts of it are also -stable material.
> 
OK, i see. It will be broken starting from v5.12-rc unless we fix it.

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 11:23             ` Uladzislau Rezki
@ 2021-02-19 11:27               ` Uladzislau Rezki
  2021-02-19 18:18                 ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-19 11:27 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Sebastian Andrzej Siewior, Paul E. McKenney, Masami Hiramatsu,
	Ingo Molnar, Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	LKML, RCU, Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 12:23:57PM +0100, Uladzislau Rezki wrote:
> On Fri, Feb 19, 2021 at 12:17:38PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> > > I or Paul will ask for a test once it is settled down :) Looks like
> > > it is, so we should fix for v5.12.
> > 
> > Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
> > parts of it are also -stable material.
> > 
> OK, i see. It will be broken starting from v5.12-rc unless we fix it.
> 
Sorry it is broken since 5.11 kernel already, i messed it up.

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 11:27               ` Uladzislau Rezki
@ 2021-02-19 18:18                 ` Paul E. McKenney
  2021-02-19 18:33                   ` Paul E. McKenney
                                     ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 18:18 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Masami Hiramatsu, Ingo Molnar,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 12:27:51PM +0100, Uladzislau Rezki wrote:
> On Fri, Feb 19, 2021 at 12:23:57PM +0100, Uladzislau Rezki wrote:
> > On Fri, Feb 19, 2021 at 12:17:38PM +0100, Sebastian Andrzej Siewior wrote:
> > > On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> > > > I or Paul will ask for a test once it is settled down :) Looks like
> > > > it is, so we should fix for v5.12.
> > > 
> > > Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
> > > parts of it are also -stable material.

If Masami's patch works for the PowerPC guys on v5.10-rc7, then it can
be backported.  The patch making RCU Tasks initialize itself early won't
have any effect and can be left or reverted, as we choose.  The self-test
patch will need to be either adjusted or reverted.

However...

The root cause of this problem is that softirq only kind-of works
during a window of time during boot.  It works only if the number and
duration of softirq handlers during this time is small enough, for some
ill-defined notion of "small enough".  If there are too many, whatever
that means exactly, then we get failed attempt to awaken ksoftirqd, which
(sometimes!) results in a silent hang.  Which, as you pointed out earlier,
is a really obnoxious error message.  And any minor change could kick
us into silent-hang state because of the heuristics used to hand off
to ksoftirqd.  The straw that broke the camel's back and all that.

One approach would be to add WARN_ON_ONCE() so that if softirq tries
to awaken ksoftirqd before it is spawned, we get a nice obvious splat.
Unfortunately, this gives false positives because there is code that
needs a softirq handler to run eventually, but is OK with that handler
being delayed until some random point in the early_initcall() sequence.

Besides which, if we are going to add a check, why not use that check
just make things work by forcing handler execution to remain within the
softirq back-of-interrupt context instead of awakening a not-yet-spawned
ksoftirqd?  We can further prevent entry into dyntick-idle state until
the ksoftirqd kthreads have been spawned, which means that if softirq
handlers must be deferred, they will be resumed within one jiffy by the
next scheduler-clock interrupt.

Yes, this can allow softirq handlers to impose large latencies, but only
during early boot, long before any latency-sensitive applications can
possibly have been created.  So this does not seem like a real problem.

Am I missing something here?

							Thanx, Paul

> > OK, i see. It will be broken starting from v5.12-rc unless we fix it.
> > 
> Sorry it is broken since 5.11 kernel already, i messed it up.
> 
> --
> Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 18:18                 ` Paul E. McKenney
@ 2021-02-19 18:33                   ` Paul E. McKenney
  2021-02-19 19:34                     ` Paul E. McKenney
                                       ` (2 more replies)
  2021-02-19 19:14                   ` Steven Rostedt
  2021-02-22 10:04                   ` Sebastian Andrzej Siewior
  2 siblings, 3 replies; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 18:33 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Masami Hiramatsu, Ingo Molnar,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 10:18:11AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 19, 2021 at 12:27:51PM +0100, Uladzislau Rezki wrote:
> > On Fri, Feb 19, 2021 at 12:23:57PM +0100, Uladzislau Rezki wrote:
> > > On Fri, Feb 19, 2021 at 12:17:38PM +0100, Sebastian Andrzej Siewior wrote:
> > > > On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> > > > > I or Paul will ask for a test once it is settled down :) Looks like
> > > > > it is, so we should fix for v5.12.
> > > > 
> > > > Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
> > > > parts of it are also -stable material.
> 
> If Masami's patch works for the PowerPC guys on v5.10-rc7, then it can
> be backported.  The patch making RCU Tasks initialize itself early won't
> have any effect and can be left or reverted, as we choose.  The self-test
> patch will need to be either adjusted or reverted.
> 
> However...
> 
> The root cause of this problem is that softirq only kind-of works
> during a window of time during boot.  It works only if the number and
> duration of softirq handlers during this time is small enough, for some
> ill-defined notion of "small enough".  If there are too many, whatever
> that means exactly, then we get failed attempt to awaken ksoftirqd, which
> (sometimes!) results in a silent hang.  Which, as you pointed out earlier,
> is a really obnoxious error message.  And any minor change could kick
> us into silent-hang state because of the heuristics used to hand off
> to ksoftirqd.  The straw that broke the camel's back and all that.
> 
> One approach would be to add WARN_ON_ONCE() so that if softirq tries
> to awaken ksoftirqd before it is spawned, we get a nice obvious splat.
> Unfortunately, this gives false positives because there is code that
> needs a softirq handler to run eventually, but is OK with that handler
> being delayed until some random point in the early_initcall() sequence.
> 
> Besides which, if we are going to add a check, why not use that check
> just make things work by forcing handler execution to remain within the
> softirq back-of-interrupt context instead of awakening a not-yet-spawned
> ksoftirqd?  We can further prevent entry into dyntick-idle state until
> the ksoftirqd kthreads have been spawned, which means that if softirq
> handlers must be deferred, they will be resumed within one jiffy by the
> next scheduler-clock interrupt.
> 
> Yes, this can allow softirq handlers to impose large latencies, but only
> during early boot, long before any latency-sensitive applications can
> possibly have been created.  So this does not seem like a real problem.
> 
> Am I missing something here?

For definiteness, here is the first part of the change, posted earlier.
The commit log needs to be updated.  I will post the change that keeps
the tick going as a reply to this email.

							Thanx, Paul

------------------------------------------------------------------------

commit 4f659bf04fc4610523544493d6db92fc8670b086
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Fri Feb 12 16:20:40 2021 -0800

    softirq: Don't try waking ksoftirqd before it has been spawned
    
    If there is heavy softirq activity, the softirq system will attempt
    to awaken ksoftirqd and will stop the traditional back-of-interrupt
    softirq processing.  This is all well and good, but only if the
    ksoftirqd kthreads already exist, which is not the case during early
    boot, in which case the system hangs.
    
    One reproducer is as follows:
    
    tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
    
    This commit therefore adds a couple of existence checks for ksoftirqd
    and forces back-of-interrupt softirq processing when ksoftirqd does not
    yet exist.  With this change, the above test passes.
    
    Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Reported-by: Uladzislau Rezki <urezki@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 9d71046..ba78e63 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
 	if (ksoftirqd_running(local_softirq_pending()))
 		return;
 
-	if (!force_irqthreads) {
+	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
 #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
 		/*
 		 * We can safely execute softirq on the current stack if
@@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
 	pending = local_softirq_pending();
 	if (pending) {
-		if (time_before(jiffies, end) && !need_resched() &&
-		    --max_restart)
+		if (!__this_cpu_read(ksoftirqd) ||
+		    (time_before(jiffies, end) && !need_resched() && --max_restart))
 			goto restart;
 
 		wakeup_softirqd();

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 18:18                 ` Paul E. McKenney
  2021-02-19 18:33                   ` Paul E. McKenney
@ 2021-02-19 19:14                   ` Steven Rostedt
  2021-02-19 19:45                     ` Paul E. McKenney
  2021-02-22 10:04                   ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2021-02-19 19:14 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, 19 Feb 2021 10:18:11 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> We can further prevent entry into dyntick-idle state until
> the ksoftirqd kthreads have been spawned, which means that if softirq
> handlers must be deferred, they will be resumed within one jiffy by the
> next scheduler-clock interrupt.

Why not just prevent entry into dyntick-idle state until the system is
finished booting? As you said; There should be no latency-sensitive
applications running, until after we started the system.

-- Steve

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 18:33                   ` Paul E. McKenney
@ 2021-02-19 19:34                     ` Paul E. McKenney
  2021-02-19 20:02                     ` Steven Rostedt
  2021-02-22 10:21                     ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 19:34 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Masami Hiramatsu, Ingo Molnar,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 10:33:36AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 19, 2021 at 10:18:11AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 19, 2021 at 12:27:51PM +0100, Uladzislau Rezki wrote:
> > > On Fri, Feb 19, 2021 at 12:23:57PM +0100, Uladzislau Rezki wrote:
> > > > On Fri, Feb 19, 2021 at 12:17:38PM +0100, Sebastian Andrzej Siewior wrote:
> > > > > On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> > > > > > I or Paul will ask for a test once it is settled down :) Looks like
> > > > > > it is, so we should fix for v5.12.
> > > > > 
> > > > > Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
> > > > > parts of it are also -stable material.
> > 
> > If Masami's patch works for the PowerPC guys on v5.10-rc7, then it can
> > be backported.  The patch making RCU Tasks initialize itself early won't
> > have any effect and can be left or reverted, as we choose.  The self-test
> > patch will need to be either adjusted or reverted.
> > 
> > However...
> > 
> > The root cause of this problem is that softirq only kind-of works
> > during a window of time during boot.  It works only if the number and
> > duration of softirq handlers during this time is small enough, for some
> > ill-defined notion of "small enough".  If there are too many, whatever
> > that means exactly, then we get failed attempt to awaken ksoftirqd, which
> > (sometimes!) results in a silent hang.  Which, as you pointed out earlier,
> > is a really obnoxious error message.  And any minor change could kick
> > us into silent-hang state because of the heuristics used to hand off
> > to ksoftirqd.  The straw that broke the camel's back and all that.
> > 
> > One approach would be to add WARN_ON_ONCE() so that if softirq tries
> > to awaken ksoftirqd before it is spawned, we get a nice obvious splat.
> > Unfortunately, this gives false positives because there is code that
> > needs a softirq handler to run eventually, but is OK with that handler
> > being delayed until some random point in the early_initcall() sequence.
> > 
> > Besides which, if we are going to add a check, why not use that check
> > just make things work by forcing handler execution to remain within the
> > softirq back-of-interrupt context instead of awakening a not-yet-spawned
> > ksoftirqd?  We can further prevent entry into dyntick-idle state until
> > the ksoftirqd kthreads have been spawned, which means that if softirq
> > handlers must be deferred, they will be resumed within one jiffy by the
> > next scheduler-clock interrupt.
> > 
> > Yes, this can allow softirq handlers to impose large latencies, but only
> > during early boot, long before any latency-sensitive applications can
> > possibly have been created.  So this does not seem like a real problem.
> > 
> > Am I missing something here?
> 
> For definiteness, here is the first part of the change, posted earlier.
> The commit log needs to be updated.  I will post the change that keeps
> the tick going as a reply to this email.

And here it is.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 9c0ee82..1d4f5b8 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1320,6 +1320,11 @@ static void rcu_prepare_kthreads(int cpu)
  */
 int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
+	/* Through early_initcall(), need tick for softirq handlers. */
+	if (!this_cpu_ksoftirqd()) {
+		*nextevt = 1;
+		return 1;
+	}
 	*nextevt = KTIME_MAX;
 	return !rcu_segcblist_empty(&this_cpu_ptr(&rcu_data)->cblist) &&
 		!rcu_rdp_is_offloaded(this_cpu_ptr(&rcu_data));
@@ -1415,6 +1420,12 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 
 	lockdep_assert_irqs_disabled();
 
+	/* Through early_initcall(), need tick for softirq handlers. */
+	if (!this_cpu_ksoftirqd()) {
+		*nextevt = 1;
+		return 1;
+	}
+
 	/* If no non-offloaded callbacks, RCU doesn't need the CPU. */
 	if (rcu_segcblist_empty(&rdp->cblist) ||
 	    rcu_rdp_is_offloaded(rdp)) {

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-18 14:29 [PATCH] kprobes: Fix to delay the kprobes jump optimization Masami Hiramatsu
  2021-02-18 15:15 ` Paul E. McKenney
@ 2021-02-19 19:36 ` Steven Rostedt
  2021-02-19 19:47   ` Paul E. McKenney
  1 sibling, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2021-02-19 19:36 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Paul E . McKenney, Sebastian Andrzej Siewior,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Thu, 18 Feb 2021 23:29:23 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> moved the kprobe setup in early_initcall(), which includes kprobe
> jump optimization.
> The kprobes jump optimizer involves synchronize_rcu_tasks() which
> depends on the ksoftirqd and rcu_spawn_tasks_*(). However, since
> those are setup in core_initcall(), kprobes jump optimizer can not
> run at the early_initcall().
> 
> To avoid this issue, make the kprobe optimization disabled in the
> early_initcall() and enables it in subsys_initcall().
> 
> Note that non-optimized kprobes is still available after
> early_initcall(). Only jump optimization is delayed.
> 
> Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> Reported-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: stable@vger.kernel.org
> ---
>

I pulled this into my queue to be tested, and when that completes
(hopefully without failure), I'll add this to my pull request for the
current merge window (which I still need to send).

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 19:14                   ` Steven Rostedt
@ 2021-02-19 19:45                     ` Paul E. McKenney
  2021-02-19 20:04                       ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 19:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 02:14:29PM -0500, Steven Rostedt wrote:
> On Fri, 19 Feb 2021 10:18:11 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > We can further prevent entry into dyntick-idle state until
> > the ksoftirqd kthreads have been spawned, which means that if softirq
> > handlers must be deferred, they will be resumed within one jiffy by the
> > next scheduler-clock interrupt.
> 
> Why not just prevent entry into dyntick-idle state until the system is
> finished booting? As you said; There should be no latency-sensitive
> applications running, until after we started the system.

Exactly, and that is the effect of patch to rcu_needs_cpu() that I just
now posted.

Ah, your point is that if the tick keeps running, there is no need to
modify softirq?  Good point, and I will test that, thank you!!!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 19:36 ` Steven Rostedt
@ 2021-02-19 19:47   ` Paul E. McKenney
  2021-02-19 19:58     ` Steven Rostedt
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 19:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Ingo Molnar, Sebastian Andrzej Siewior,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 02:36:07PM -0500, Steven Rostedt wrote:
> On Thu, 18 Feb 2021 23:29:23 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> > moved the kprobe setup in early_initcall(), which includes kprobe
> > jump optimization.
> > The kprobes jump optimizer involves synchronize_rcu_tasks() which
> > depends on the ksoftirqd and rcu_spawn_tasks_*(). However, since
> > those are setup in core_initcall(), kprobes jump optimizer can not
> > run at the early_initcall().
> > 
> > To avoid this issue, make the kprobe optimization disabled in the
> > early_initcall() and enables it in subsys_initcall().
> > 
> > Note that non-optimized kprobes is still available after
> > early_initcall(). Only jump optimization is delayed.
> > 
> > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
> > Reported-by: Paul E. McKenney <paulmck@kernel.org>
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> 
> I pulled this into my queue to be tested, and when that completes
> (hopefully without failure), I'll add this to my pull request for the
> current merge window (which I still need to send).

Thank you, Steve!

Could you please add the following Reported-by tags?

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: Uladzislau Rezki <urezki@gmail.com>

Sebastian first noticed the problem, and Uladzislau figured out
how softirqs were involved.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 19:47   ` Paul E. McKenney
@ 2021-02-19 19:58     ` Steven Rostedt
  2021-02-19 20:04       ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2021-02-19 19:58 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Masami Hiramatsu, Ingo Molnar, Sebastian Andrzej Siewior,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, 19 Feb 2021 11:47:44 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> Could you please add the following Reported-by tags?
> 
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Reported-by: Uladzislau Rezki <urezki@gmail.com>
> 
> Sebastian first noticed the problem, and Uladzislau figured out
> how softirqs were involved.

These were already added. They must have appeared in the thread somewhere,
as my internal patchwork picked them up already.

-- Steve

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 18:33                   ` Paul E. McKenney
  2021-02-19 19:34                     ` Paul E. McKenney
@ 2021-02-19 20:02                     ` Steven Rostedt
  2021-02-19 21:22                       ` Paul E. McKenney
  2021-02-22 10:21                     ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 30+ messages in thread
From: Steven Rostedt @ 2021-02-19 20:02 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, 19 Feb 2021 10:33:36 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> commit 4f659bf04fc4610523544493d6db92fc8670b086
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Fri Feb 12 16:20:40 2021 -0800
> 
>     softirq: Don't try waking ksoftirqd before it has been spawned
>     
>     If there is heavy softirq activity, the softirq system will attempt
>     to awaken ksoftirqd and will stop the traditional back-of-interrupt
>     softirq processing.  This is all well and good, but only if the
>     ksoftirqd kthreads already exist, which is not the case during early
>     boot, in which case the system hangs.
>     
>     One reproducer is as follows:
>     
>     tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
>     
>     This commit therefore adds a couple of existence checks for ksoftirqd
>     and forces back-of-interrupt softirq processing when ksoftirqd does not
>     yet exist.  With this change, the above test passes.
>     
>     Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>     Reported-by: Uladzislau Rezki <urezki@gmail.com>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Hmm, I wonder if this is what patchwork picked up regarding the
reported-by.

And I checked, it did not add the "Signed-off-by" from you to Masami's
patch ;-)

The dangers of posting patches in threads of other patches :-/

-- Steve

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 19:45                     ` Paul E. McKenney
@ 2021-02-19 20:04                       ` Paul E. McKenney
  0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 20:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 11:45:39AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 19, 2021 at 02:14:29PM -0500, Steven Rostedt wrote:
> > On Fri, 19 Feb 2021 10:18:11 -0800
> > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > 
> > > We can further prevent entry into dyntick-idle state until
> > > the ksoftirqd kthreads have been spawned, which means that if softirq
> > > handlers must be deferred, they will be resumed within one jiffy by the
> > > next scheduler-clock interrupt.
> > 
> > Why not just prevent entry into dyntick-idle state until the system is
> > finished booting? As you said; There should be no latency-sensitive
> > applications running, until after we started the system.
> 
> Exactly, and that is the effect of patch to rcu_needs_cpu() that I just
> now posted.
> 
> Ah, your point is that if the tick keeps running, there is no need to
> modify softirq?  Good point, and I will test that, thank you!!!

But sadly keeping the tick on without also modifying softirq still
results in a hang.  The problem is that when the kernel is booted with
threadirqs=1, invoke_softirq() will avoid ever running the softirq
handlers on the backside of an interrupt.

So is this where Sebastian tells me that some -rt transformation can
result in locking-based deadlocks if softirq handlers are ever run on
the backside of an interrupt?  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 19:58     ` Steven Rostedt
@ 2021-02-19 20:04       ` Paul E. McKenney
  2021-02-22 12:06         ` Masami Hiramatsu
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 20:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Ingo Molnar, Sebastian Andrzej Siewior,
	Peter Zijlstra, Thomas Gleixner, Uladzislau Rezki, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 02:58:30PM -0500, Steven Rostedt wrote:
> On Fri, 19 Feb 2021 11:47:44 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > Could you please add the following Reported-by tags?
> > 
> > Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Reported-by: Uladzislau Rezki <urezki@gmail.com>
> > 
> > Sebastian first noticed the problem, and Uladzislau figured out
> > how softirqs were involved.
> 
> These were already added. They must have appeared in the thread somewhere,
> as my internal patchwork picked them up already.

Even better, thank you!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 20:02                     ` Steven Rostedt
@ 2021-02-19 21:22                       ` Paul E. McKenney
  0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-19 21:22 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Fri, Feb 19, 2021 at 03:02:31PM -0500, Steven Rostedt wrote:
> On Fri, 19 Feb 2021 10:33:36 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > commit 4f659bf04fc4610523544493d6db92fc8670b086
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Fri Feb 12 16:20:40 2021 -0800
> > 
> >     softirq: Don't try waking ksoftirqd before it has been spawned
> >     
> >     If there is heavy softirq activity, the softirq system will attempt
> >     to awaken ksoftirqd and will stop the traditional back-of-interrupt
> >     softirq processing.  This is all well and good, but only if the
> >     ksoftirqd kthreads already exist, which is not the case during early
> >     boot, in which case the system hangs.
> >     
> >     One reproducer is as follows:
> >     
> >     tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
> >     
> >     This commit therefore adds a couple of existence checks for ksoftirqd
> >     and forces back-of-interrupt softirq processing when ksoftirqd does not
> >     yet exist.  With this change, the above test passes.
> >     
> >     Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> >     Reported-by: Uladzislau Rezki <urezki@gmail.com>
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> Hmm, I wonder if this is what patchwork picked up regarding the
> reported-by.
> 
> And I checked, it did not add the "Signed-off-by" from you to Masami's
> patch ;-)
> 
> The dangers of posting patches in threads of other patches :-/

Glad I asked, then!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 18:18                 ` Paul E. McKenney
  2021-02-19 18:33                   ` Paul E. McKenney
  2021-02-19 19:14                   ` Steven Rostedt
@ 2021-02-22 10:04                   ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 30+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-02-22 10:04 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Masami Hiramatsu, Ingo Molnar, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, LKML, RCU, Michael Ellerman,
	Andrew Morton, Daniel Axtens, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Michal Hocko,
	Theodore Y . Ts'o, Oleksiy Avramchenko

On 2021-02-19 10:18:11 [-0800], Paul E. McKenney wrote:
> If Masami's patch works for the PowerPC guys on v5.10-rc7, then it can
> be backported.  The patch making RCU Tasks initialize itself early won't
> have any effect and can be left or reverted, as we choose.  The self-test
> patch will need to be either adjusted or reverted.
> 
> However...
> 
> The root cause of this problem is that softirq only kind-of works
> during a window of time during boot.  It works only if the number and
> duration of softirq handlers during this time is small enough, for some
> ill-defined notion of "small enough".  If there are too many, whatever
> that means exactly, then we get failed attempt to awaken ksoftirqd, which

The number of registered softirq handlers does not matter nor the amount
times the individual softirqs that were scheduled. The only problem is
that one schedules softirq and then waits for its completion.
So scheduling a timer_list timer works. Waiting for its completion does
not. Once ksoftirqd is up, will be processed.

> (sometimes!) results in a silent hang.  Which, as you pointed out earlier,
> is a really obnoxious error message.  And any minor change could kick
> us into silent-hang state because of the heuristics used to hand off
> to ksoftirqd.  The straw that broke the camel's back and all that.

The problem is that a softirq is raised and being waited for its
completion.
Something like synchronize_rcu() would be such a thing I guess.

> One approach would be to add WARN_ON_ONCE() so that if softirq tries
> to awaken ksoftirqd before it is spawned, we get a nice obvious splat.
> Unfortunately, this gives false positives because there is code that
> needs a softirq handler to run eventually, but is OK with that handler
> being delayed until some random point in the early_initcall() sequence.
> 
> Besides which, if we are going to add a check, why not use that check
> just make things work by forcing handler execution to remain within the
> softirq back-of-interrupt context instead of awakening a not-yet-spawned
> ksoftirqd?  We can further prevent entry into dyntick-idle state until
> the ksoftirqd kthreads have been spawned, which means that if softirq
> handlers must be deferred, they will be resumed within one jiffy by the
> next scheduler-clock interrupt.

This should work.

> Yes, this can allow softirq handlers to impose large latencies, but only
> during early boot, long before any latency-sensitive applications can
> possibly have been created.  So this does not seem like a real problem.
> 
> Am I missing something here?
> 
> 							Thanx, Paul

Sebastian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 18:33                   ` Paul E. McKenney
  2021-02-19 19:34                     ` Paul E. McKenney
  2021-02-19 20:02                     ` Steven Rostedt
@ 2021-02-22 10:21                     ` Sebastian Andrzej Siewior
  2021-02-22 12:54                       ` Uladzislau Rezki
  2 siblings, 1 reply; 30+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-02-22 10:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Masami Hiramatsu, Ingo Molnar, Steven Rostedt,
	Peter Zijlstra, Thomas Gleixner, LKML, RCU, Michael Ellerman,
	Andrew Morton, Daniel Axtens, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Michal Hocko,
	Theodore Y . Ts'o, Oleksiy Avramchenko

On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> For definiteness, here is the first part of the change, posted earlier.
> The commit log needs to be updated.  I will post the change that keeps
> the tick going as a reply to this email.
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 9d71046..ba78e63 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
>  	if (ksoftirqd_running(local_softirq_pending()))
>  		return;
>  
> -	if (!force_irqthreads) {
> +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
>  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
>  		/*
>  		 * We can safely execute softirq on the current stack if
> @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>  
>  	pending = local_softirq_pending();
>  	if (pending) {
> -		if (time_before(jiffies, end) && !need_resched() &&
> -		    --max_restart)
> +		if (!__this_cpu_read(ksoftirqd) ||
> +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
>  			goto restart;

This is hunk shouldn't be needed. The reason for it is probably that the
following wakeup_softirqd() would avoid further invoke_softirq()
performing the actual softirq work. It would leave early due to
ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
outside of an interrupt would do the same. 

I would like PeterZ / tglx to comment on this one. Basically I'm not
sure if it is okay to expect softirqs beeing served and waited on that
early in the boot.

>  		wakeup_softirqd();

Sebastian

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-19 20:04       ` Paul E. McKenney
@ 2021-02-22 12:06         ` Masami Hiramatsu
  0 siblings, 0 replies; 30+ messages in thread
From: Masami Hiramatsu @ 2021-02-22 12:06 UTC (permalink / raw)
  To: paulmck
  Cc: Steven Rostedt, Masami Hiramatsu, Ingo Molnar,
	Sebastian Andrzej Siewior, Peter Zijlstra, Thomas Gleixner,
	Uladzislau Rezki, LKML, RCU, Michael Ellerman, Andrew Morton,
	Daniel Axtens, Frederic Weisbecker, Neeraj Upadhyay,
	Joel Fernandes, Michal Hocko, Theodore Y . Ts'o,
	Oleksiy Avramchenko

On Fri, 19 Feb 2021 12:04:42 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Fri, Feb 19, 2021 at 02:58:30PM -0500, Steven Rostedt wrote:
> > On Fri, 19 Feb 2021 11:47:44 -0800
> > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > 
> > > Could you please add the following Reported-by tags?
> > > 
> > > Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > > Reported-by: Uladzislau Rezki <urezki@gmail.com>
> > > 
> > > Sebastian first noticed the problem, and Uladzislau figured out
> > > how softirqs were involved.
> > 
> > These were already added. They must have appeared in the thread somewhere,
> > as my internal patchwork picked them up already.
> 
> Even better, thank you!

Thank you for fixing and pulling it Steve and Paul!
I couldn't find the original thread, so it is helpful :)

Thanks!

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-22 10:21                     ` Sebastian Andrzej Siewior
@ 2021-02-22 12:54                       ` Uladzislau Rezki
  2021-02-22 15:09                         ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-22 12:54 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul E. McKenney, Uladzislau Rezki, Masami Hiramatsu,
	Ingo Molnar, Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	LKML, RCU, Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Mon, Feb 22, 2021 at 11:21:04AM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> > For definiteness, here is the first part of the change, posted earlier.
> > The commit log needs to be updated.  I will post the change that keeps
> > the tick going as a reply to this email.
> …
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 9d71046..ba78e63 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
> >  	if (ksoftirqd_running(local_softirq_pending()))
> >  		return;
> >  
> > -	if (!force_irqthreads) {
> > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> >  		/*
> >  		 * We can safely execute softirq on the current stack if
> > @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> >  
> >  	pending = local_softirq_pending();
> >  	if (pending) {
> > -		if (time_before(jiffies, end) && !need_resched() &&
> > -		    --max_restart)
> > +		if (!__this_cpu_read(ksoftirqd) ||
> > +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
> >  			goto restart;
> 
> This is hunk shouldn't be needed. The reason for it is probably that the
> following wakeup_softirqd() would avoid further invoke_softirq()
> performing the actual softirq work. It would leave early due to
> ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
> outside of an interrupt would do the same. 
> 
> I would like PeterZ / tglx to comment on this one. Basically I'm not
> sure if it is okay to expect softirqs beeing served and waited on that
> early in the boot.
> 
The ksoftirqd threads get spawned during early_initcall() phase. Why not
just spawn them one step earlier what is totally safe? I mean before
do_pre_smp_initcalls() that calls early callbacks.

+       spawn_ksoftirqd();
        rcu_init_tasks_generic();
        do_pre_smp_initcalls();

With such change the spawning will not be depended on linker/compiler
i.e. when and in which order an early_initcall(spawn_ksoftirqd) callback
is executed.

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-22 12:54                       ` Uladzislau Rezki
@ 2021-02-22 15:09                         ` Paul E. McKenney
  2021-02-22 17:16                           ` Uladzislau Rezki
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-22 15:09 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Masami Hiramatsu, Ingo Molnar,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Mon, Feb 22, 2021 at 01:54:31PM +0100, Uladzislau Rezki wrote:
> On Mon, Feb 22, 2021 at 11:21:04AM +0100, Sebastian Andrzej Siewior wrote:
> > On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> > > For definiteness, here is the first part of the change, posted earlier.
> > > The commit log needs to be updated.  I will post the change that keeps
> > > the tick going as a reply to this email.
> > …
> > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > index 9d71046..ba78e63 100644
> > > --- a/kernel/softirq.c
> > > +++ b/kernel/softirq.c
> > > @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
> > >  	if (ksoftirqd_running(local_softirq_pending()))
> > >  		return;
> > >  
> > > -	if (!force_irqthreads) {
> > > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> > >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> > >  		/*
> > >  		 * We can safely execute softirq on the current stack if
> > > @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > >  
> > >  	pending = local_softirq_pending();
> > >  	if (pending) {
> > > -		if (time_before(jiffies, end) && !need_resched() &&
> > > -		    --max_restart)
> > > +		if (!__this_cpu_read(ksoftirqd) ||
> > > +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
> > >  			goto restart;
> > 
> > This is hunk shouldn't be needed. The reason for it is probably that the
> > following wakeup_softirqd() would avoid further invoke_softirq()
> > performing the actual softirq work. It would leave early due to
> > ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
> > outside of an interrupt would do the same. 

And it does pass the rcutorture test without that hunk:

tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make

> > I would like PeterZ / tglx to comment on this one. Basically I'm not
> > sure if it is okay to expect softirqs beeing served and waited on that
> > early in the boot.

It would be good to get other eyes on this.

I do agree that "don't wait on softirq handlers until after completion
of all early_initcall() handlers" is a nice simple rule, but debugging
violations of it is not so simple.  Adding warnings to ease debugging
of violations of this rule is quite a bit more complex than is either of
the methods of making the rule unnecessary, at least from what I can see
at this point.  The complexity of the warnings is exactly what Sebastian
pointed out earlier, that it is currently legal to raise_softirq() but
not to wait on the resulting handlers.  But even waiting is OK if that
waiting does not delay the boot sequence.  But if the boot kthread waits
on the kthread that does the waiting, it is once again not OK.

So am I missing something subtle here?

> The ksoftirqd threads get spawned during early_initcall() phase. Why not
> just spawn them one step earlier what is totally safe? I mean before
> do_pre_smp_initcalls() that calls early callbacks.
> 
> +       spawn_ksoftirqd();
>         rcu_init_tasks_generic();
>         do_pre_smp_initcalls();
> 
> With such change the spawning will not be depended on linker/compiler
> i.e. when and in which order an early_initcall(spawn_ksoftirqd) callback
> is executed.

We both posted patches similar to this, so I am not opposed.  One caveat,
though, namely that this narrows the window quite a bit but does not
entirely close it.  But it does allow the early_initcall()s to wait on
softirq handlers.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-22 15:09                         ` Paul E. McKenney
@ 2021-02-22 17:16                           ` Uladzislau Rezki
  2021-02-22 18:16                             ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-22 17:16 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	LKML, RCU, Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Mon, Feb 22, 2021 at 07:09:03AM -0800, Paul E. McKenney wrote:
> On Mon, Feb 22, 2021 at 01:54:31PM +0100, Uladzislau Rezki wrote:
> > On Mon, Feb 22, 2021 at 11:21:04AM +0100, Sebastian Andrzej Siewior wrote:
> > > On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> > > > For definiteness, here is the first part of the change, posted earlier.
> > > > The commit log needs to be updated.  I will post the change that keeps
> > > > the tick going as a reply to this email.
> > > …
> > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > index 9d71046..ba78e63 100644
> > > > --- a/kernel/softirq.c
> > > > +++ b/kernel/softirq.c
> > > > @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
> > > >  	if (ksoftirqd_running(local_softirq_pending()))
> > > >  		return;
> > > >  
> > > > -	if (!force_irqthreads) {
> > > > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> > > >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> > > >  		/*
> > > >  		 * We can safely execute softirq on the current stack if
> > > > @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > > >  
> > > >  	pending = local_softirq_pending();
> > > >  	if (pending) {
> > > > -		if (time_before(jiffies, end) && !need_resched() &&
> > > > -		    --max_restart)
> > > > +		if (!__this_cpu_read(ksoftirqd) ||
> > > > +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
> > > >  			goto restart;
> > > 
> > > This is hunk shouldn't be needed. The reason for it is probably that the
> > > following wakeup_softirqd() would avoid further invoke_softirq()
> > > performing the actual softirq work. It would leave early due to
> > > ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
> > > outside of an interrupt would do the same. 
> 
> And it does pass the rcutorture test without that hunk:
> 
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
> 
Yep. I have tested that patch also. It works for me as well. So
technically i do not see any issues from the first glance but of
course it should be reviewed by the softirq people to hear their
opinion.

IRQs are enabled, so it can be handled from an IRQ tail until
ksoftirqd threads are spawned.

> > > I would like PeterZ / tglx to comment on this one. Basically I'm not
> > > sure if it is okay to expect softirqs beeing served and waited on that
> > > early in the boot.
> 
> It would be good to get other eyes on this.
> 
> I do agree that "don't wait on softirq handlers until after completion
> of all early_initcall() handlers" is a nice simple rule, but debugging
> violations of it is not so simple.  Adding warnings to ease debugging
> of violations of this rule is quite a bit more complex than is either of
> the methods of making the rule unnecessary, at least from what I can see
> at this point.  The complexity of the warnings is exactly what Sebastian
> pointed out earlier, that it is currently legal to raise_softirq() but
> not to wait on the resulting handlers.  But even waiting is OK if that
> waiting does not delay the boot sequence.  But if the boot kthread waits
> on the kthread that does the waiting, it is once again not OK.
> 
> So am I missing something subtle here?
>
I agree here. Seems like we are on the same page in understanding :)

> > The ksoftirqd threads get spawned during early_initcall() phase. Why not
> > just spawn them one step earlier what is totally safe? I mean before
> > do_pre_smp_initcalls() that calls early callbacks.
> > 
> > +       spawn_ksoftirqd();
> >         rcu_init_tasks_generic();
> >         do_pre_smp_initcalls();
> > 
> > With such change the spawning will not be depended on linker/compiler
> > i.e. when and in which order an early_initcall(spawn_ksoftirqd) callback
> > is executed.
> 
> We both posted patches similar to this, so I am not opposed.  One caveat,
> though, namely that this narrows the window quite a bit but does not
> entirely close it.  But it does allow the early_initcall()s to wait on
> softirq handlers.
> 
Yep, that was an intention. At least to provide such functionality for early
callbacks. What happens before it(init/main.c) is pretty controllable.

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-22 17:16                           ` Uladzislau Rezki
@ 2021-02-22 18:16                             ` Paul E. McKenney
  2021-02-22 19:07                               ` Uladzislau Rezki
  0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-22 18:16 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Masami Hiramatsu, Ingo Molnar,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Mon, Feb 22, 2021 at 06:16:05PM +0100, Uladzislau Rezki wrote:
> On Mon, Feb 22, 2021 at 07:09:03AM -0800, Paul E. McKenney wrote:
> > On Mon, Feb 22, 2021 at 01:54:31PM +0100, Uladzislau Rezki wrote:
> > > On Mon, Feb 22, 2021 at 11:21:04AM +0100, Sebastian Andrzej Siewior wrote:
> > > > On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> > > > > For definiteness, here is the first part of the change, posted earlier.
> > > > > The commit log needs to be updated.  I will post the change that keeps
> > > > > the tick going as a reply to this email.
> > > > …
> > > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > > index 9d71046..ba78e63 100644
> > > > > --- a/kernel/softirq.c
> > > > > +++ b/kernel/softirq.c
> > > > > @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
> > > > >  	if (ksoftirqd_running(local_softirq_pending()))
> > > > >  		return;
> > > > >  
> > > > > -	if (!force_irqthreads) {
> > > > > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> > > > >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> > > > >  		/*
> > > > >  		 * We can safely execute softirq on the current stack if
> > > > > @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > > > >  
> > > > >  	pending = local_softirq_pending();
> > > > >  	if (pending) {
> > > > > -		if (time_before(jiffies, end) && !need_resched() &&
> > > > > -		    --max_restart)
> > > > > +		if (!__this_cpu_read(ksoftirqd) ||
> > > > > +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
> > > > >  			goto restart;
> > > > 
> > > > This is hunk shouldn't be needed. The reason for it is probably that the
> > > > following wakeup_softirqd() would avoid further invoke_softirq()
> > > > performing the actual softirq work. It would leave early due to
> > > > ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
> > > > outside of an interrupt would do the same. 
> > 
> > And it does pass the rcutorture test without that hunk:
> > 
> > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
> > 
> Yep. I have tested that patch also. It works for me as well. So
> technically i do not see any issues from the first glance but of
> course it should be reviewed by the softirq people to hear their
> opinion.
> 
> IRQs are enabled, so it can be handled from an IRQ tail until
> ksoftirqd threads are spawned.

And if I add "CONFIG_NO_HZ_IDLE=y CONFIG_HZ_PERIODIC=n" it still works,
even if I revert my changes to rcu_needs_cpu().  Should I rely on this
working globally?  ;-)

							Thanx, Paul

> > > > I would like PeterZ / tglx to comment on this one. Basically I'm not
> > > > sure if it is okay to expect softirqs beeing served and waited on that
> > > > early in the boot.
> > 
> > It would be good to get other eyes on this.
> > 
> > I do agree that "don't wait on softirq handlers until after completion
> > of all early_initcall() handlers" is a nice simple rule, but debugging
> > violations of it is not so simple.  Adding warnings to ease debugging
> > of violations of this rule is quite a bit more complex than is either of
> > the methods of making the rule unnecessary, at least from what I can see
> > at this point.  The complexity of the warnings is exactly what Sebastian
> > pointed out earlier, that it is currently legal to raise_softirq() but
> > not to wait on the resulting handlers.  But even waiting is OK if that
> > waiting does not delay the boot sequence.  But if the boot kthread waits
> > on the kthread that does the waiting, it is once again not OK.
> > 
> > So am I missing something subtle here?
> >
> I agree here. Seems like we are on the same page in understanding :)
> 
> > > The ksoftirqd threads get spawned during early_initcall() phase. Why not
> > > just spawn them one step earlier what is totally safe? I mean before
> > > do_pre_smp_initcalls() that calls early callbacks.
> > > 
> > > +       spawn_ksoftirqd();
> > >         rcu_init_tasks_generic();
> > >         do_pre_smp_initcalls();
> > > 
> > > With such change the spawning will not be depended on linker/compiler
> > > i.e. when and in which order an early_initcall(spawn_ksoftirqd) callback
> > > is executed.
> > 
> > We both posted patches similar to this, so I am not opposed.  One caveat,
> > though, namely that this narrows the window quite a bit but does not
> > entirely close it.  But it does allow the early_initcall()s to wait on
> > softirq handlers.
> > 
> Yep, that was an intention. At least to provide such functionality for early
> callbacks. What happens before it(init/main.c) is pretty controllable.
> 
> --
> Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-22 18:16                             ` Paul E. McKenney
@ 2021-02-22 19:07                               ` Uladzislau Rezki
  2021-02-22 21:32                                 ` Paul E. McKenney
  0 siblings, 1 reply; 30+ messages in thread
From: Uladzislau Rezki @ 2021-02-22 19:07 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, Masami Hiramatsu,
	Ingo Molnar, Steven Rostedt, Peter Zijlstra, Thomas Gleixner,
	LKML, RCU, Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Mon, Feb 22, 2021 at 10:16:08AM -0800, Paul E. McKenney wrote:
> On Mon, Feb 22, 2021 at 06:16:05PM +0100, Uladzislau Rezki wrote:
> > On Mon, Feb 22, 2021 at 07:09:03AM -0800, Paul E. McKenney wrote:
> > > On Mon, Feb 22, 2021 at 01:54:31PM +0100, Uladzislau Rezki wrote:
> > > > On Mon, Feb 22, 2021 at 11:21:04AM +0100, Sebastian Andrzej Siewior wrote:
> > > > > On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> > > > > > For definiteness, here is the first part of the change, posted earlier.
> > > > > > The commit log needs to be updated.  I will post the change that keeps
> > > > > > the tick going as a reply to this email.
> > > > > …
> > > > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > > > index 9d71046..ba78e63 100644
> > > > > > --- a/kernel/softirq.c
> > > > > > +++ b/kernel/softirq.c
> > > > > > @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
> > > > > >  	if (ksoftirqd_running(local_softirq_pending()))
> > > > > >  		return;
> > > > > >  
> > > > > > -	if (!force_irqthreads) {
> > > > > > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> > > > > >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> > > > > >  		/*
> > > > > >  		 * We can safely execute softirq on the current stack if
> > > > > > @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > > > > >  
> > > > > >  	pending = local_softirq_pending();
> > > > > >  	if (pending) {
> > > > > > -		if (time_before(jiffies, end) && !need_resched() &&
> > > > > > -		    --max_restart)
> > > > > > +		if (!__this_cpu_read(ksoftirqd) ||
> > > > > > +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
> > > > > >  			goto restart;
> > > > > 
> > > > > This is hunk shouldn't be needed. The reason for it is probably that the
> > > > > following wakeup_softirqd() would avoid further invoke_softirq()
> > > > > performing the actual softirq work. It would leave early due to
> > > > > ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
> > > > > outside of an interrupt would do the same. 
> > > 
> > > And it does pass the rcutorture test without that hunk:
> > > 
> > > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
> > > 
> > Yep. I have tested that patch also. It works for me as well. So
> > technically i do not see any issues from the first glance but of
> > course it should be reviewed by the softirq people to hear their
> > opinion.
> > 
> > IRQs are enabled, so it can be handled from an IRQ tail until
> > ksoftirqd threads are spawned.
> 
> And if I add "CONFIG_NO_HZ_IDLE=y CONFIG_HZ_PERIODIC=n" it still works,
> even if I revert my changes to rcu_needs_cpu().  Should I rely on this
> working globally?  ;-)
> 
There might be corner cases which we are not aware of so far. From the
other hand what the patch does is simulating the !threadirqs behaviour
during early boot. In that case we know that handling of SW irqs from
real-irq tail works :)

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization
  2021-02-22 19:07                               ` Uladzislau Rezki
@ 2021-02-22 21:32                                 ` Paul E. McKenney
  0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2021-02-22 21:32 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Masami Hiramatsu, Ingo Molnar,
	Steven Rostedt, Peter Zijlstra, Thomas Gleixner, LKML, RCU,
	Michael Ellerman, Andrew Morton, Daniel Axtens,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Michal Hocko, Theodore Y . Ts'o, Oleksiy Avramchenko

On Mon, Feb 22, 2021 at 08:07:03PM +0100, Uladzislau Rezki wrote:
> On Mon, Feb 22, 2021 at 10:16:08AM -0800, Paul E. McKenney wrote:
> > On Mon, Feb 22, 2021 at 06:16:05PM +0100, Uladzislau Rezki wrote:
> > > On Mon, Feb 22, 2021 at 07:09:03AM -0800, Paul E. McKenney wrote:
> > > > On Mon, Feb 22, 2021 at 01:54:31PM +0100, Uladzislau Rezki wrote:
> > > > > On Mon, Feb 22, 2021 at 11:21:04AM +0100, Sebastian Andrzej Siewior wrote:
> > > > > > On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> > > > > > > For definiteness, here is the first part of the change, posted earlier.
> > > > > > > The commit log needs to be updated.  I will post the change that keeps
> > > > > > > the tick going as a reply to this email.
> > > > > > …
> > > > > > > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > > > > > > index 9d71046..ba78e63 100644
> > > > > > > --- a/kernel/softirq.c
> > > > > > > +++ b/kernel/softirq.c
> > > > > > > @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
> > > > > > >  	if (ksoftirqd_running(local_softirq_pending()))
> > > > > > >  		return;
> > > > > > >  
> > > > > > > -	if (!force_irqthreads) {
> > > > > > > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> > > > > > >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> > > > > > >  		/*
> > > > > > >  		 * We can safely execute softirq on the current stack if
> > > > > > > @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
> > > > > > >  
> > > > > > >  	pending = local_softirq_pending();
> > > > > > >  	if (pending) {
> > > > > > > -		if (time_before(jiffies, end) && !need_resched() &&
> > > > > > > -		    --max_restart)
> > > > > > > +		if (!__this_cpu_read(ksoftirqd) ||
> > > > > > > +		    (time_before(jiffies, end) && !need_resched() && --max_restart))
> > > > > > >  			goto restart;
> > > > > > 
> > > > > > This is hunk shouldn't be needed. The reason for it is probably that the
> > > > > > following wakeup_softirqd() would avoid further invoke_softirq()
> > > > > > performing the actual softirq work. It would leave early due to
> > > > > > ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
> > > > > > outside of an interrupt would do the same. 
> > > > 
> > > > And it does pass the rcutorture test without that hunk:
> > > > 
> > > > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
> > > > 
> > > Yep. I have tested that patch also. It works for me as well. So
> > > technically i do not see any issues from the first glance but of
> > > course it should be reviewed by the softirq people to hear their
> > > opinion.
> > > 
> > > IRQs are enabled, so it can be handled from an IRQ tail until
> > > ksoftirqd threads are spawned.
> > 
> > And if I add "CONFIG_NO_HZ_IDLE=y CONFIG_HZ_PERIODIC=n" it still works,
> > even if I revert my changes to rcu_needs_cpu().  Should I rely on this
> > working globally?  ;-)
> > 
> There might be corner cases which we are not aware of so far. From the
> other hand what the patch does is simulating the !threadirqs behaviour
> during early boot. In that case we know that handling of SW irqs from
> real-irq tail works :)

Sold!  I keep the rcu_needs_cpu() changes, just in case.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2021-02-22 21:33 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-18 14:29 [PATCH] kprobes: Fix to delay the kprobes jump optimization Masami Hiramatsu
2021-02-18 15:15 ` Paul E. McKenney
2021-02-19  8:17   ` Sebastian Andrzej Siewior
2021-02-19 10:49     ` Uladzislau Rezki
2021-02-19 10:57       ` Sebastian Andrzej Siewior
2021-02-19 11:13         ` Uladzislau Rezki
2021-02-19 11:17           ` Sebastian Andrzej Siewior
2021-02-19 11:23             ` Uladzislau Rezki
2021-02-19 11:27               ` Uladzislau Rezki
2021-02-19 18:18                 ` Paul E. McKenney
2021-02-19 18:33                   ` Paul E. McKenney
2021-02-19 19:34                     ` Paul E. McKenney
2021-02-19 20:02                     ` Steven Rostedt
2021-02-19 21:22                       ` Paul E. McKenney
2021-02-22 10:21                     ` Sebastian Andrzej Siewior
2021-02-22 12:54                       ` Uladzislau Rezki
2021-02-22 15:09                         ` Paul E. McKenney
2021-02-22 17:16                           ` Uladzislau Rezki
2021-02-22 18:16                             ` Paul E. McKenney
2021-02-22 19:07                               ` Uladzislau Rezki
2021-02-22 21:32                                 ` Paul E. McKenney
2021-02-19 19:14                   ` Steven Rostedt
2021-02-19 19:45                     ` Paul E. McKenney
2021-02-19 20:04                       ` Paul E. McKenney
2021-02-22 10:04                   ` Sebastian Andrzej Siewior
2021-02-19 19:36 ` Steven Rostedt
2021-02-19 19:47   ` Paul E. McKenney
2021-02-19 19:58     ` Steven Rostedt
2021-02-19 20:04       ` Paul E. McKenney
2021-02-22 12:06         ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).