* [patch V2 0/9] softirq: Make it RT aware
@ 2020-12-04 17:01 Thomas Gleixner
  2020-12-04 17:01 ` [patch V2 1/9] softirq: Add RT specific softirq accounting Thomas Gleixner
                   ` (9 more replies)
  0 siblings, 10 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

RT always runs softirq processing in thread context and requires that both
the softirq execution and the BH disabled sections are preemptible.

This is achieved by serializing through per CPU local locks and by
substituting a few parts of the existing softirq processing code with
helper functions.
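
For illustration only, not part of the patches: a simplified sketch of how a
BH disabled section behaves once the series is applied; the details are in
patches 1/9 and 4/9:

  local_bh_disable();
  /*
   * !RT: preemption is disabled via the preempt count as before.
   * RT:  the task holds the per CPU softirq_ctrl.lock, stays preemptible
   *      and is serialized against softirq processing on this CPU.
   */
  local_bh_enable();
  /*
   * RT: on the outermost enable, pending softirqs are either processed
   *     right here (preemptible context) or ksoftirqd is woken up.
   */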

The series applies on top of

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core

which already contains the irqstat cleanups and Frederic's irq time
accounting changes.

Changes to V1 which can be found here:

  https://lore.kernel.org/r/20201113140207.499353218@linutronix.de

 - Rebase on top of Frederic's irq time accounting changes
 - Addressing the review comments
 - Fixing the fallout from the irq time accounting updates

The RT variant has successfully been tested in the current 5.10-rt
patches. For non-RT kernels there is no functional change.

Thanks,

	tglx
---
 include/linux/bottom_half.h |    8 +
 include/linux/hardirq.h     |    1 
 include/linux/interrupt.h   |   13 --
 include/linux/preempt.h     |    6 
 include/linux/rcupdate.h    |    3 
 include/linux/sched.h       |    3 
 kernel/sched/cputime.c      |    4 
 kernel/softirq.c            |  280 +++++++++++++++++++++++++++++++++++++++++---
 kernel/time/tick-sched.c    |    2 
 9 files changed, 291 insertions(+), 29 deletions(-)




* [patch V2 1/9] softirq: Add RT specific softirq accounting
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-07 13:06   ` Frederic Weisbecker
  2020-12-04 17:01 ` [patch V2 2/9] irqtime: Make accounting correct on RT Thomas Gleixner
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

RT requires softirq processing and local bottom half disabled regions to
be preemptible. Using the normal preempt count based serialization is
therefore not possible because this implicitly disables preemption.

RT kernels use a per CPU local lock to serialize bottom halves. As
local_bh_disable() can nest, the lock can only be acquired on the outermost
invocation of local_bh_disable() and released when the nest count becomes
zero. Tasks which hold the local lock can be preempted, so it is required to
keep track of the nest count per task.

Add an RT only counter to task_struct and adjust the relevant macros in
preempt.h.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Rewrote changelog.
---
 include/linux/hardirq.h |    1 +
 include/linux/preempt.h |    6 +++++-
 include/linux/sched.h   |    3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -6,6 +6,7 @@
 #include <linux/preempt.h>
 #include <linux/lockdep.h>
 #include <linux/ftrace_irq.h>
+#include <linux/sched.h>
 #include <linux/vtime.h>
 #include <asm/hardirq.h>
 
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -79,7 +79,11 @@
 
 #define nmi_count()	(preempt_count() & NMI_MASK)
 #define hardirq_count()	(preempt_count() & HARDIRQ_MASK)
-#define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
+#ifdef CONFIG_PREEMPT_RT
+# define softirq_count()	(current->softirq_disable_cnt & SOFTIRQ_MASK)
+#else
+# define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
+#endif
 #define irq_count()	(nmi_count() | hardirq_count() | softirq_count())
 
 /*
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1004,6 +1004,9 @@ struct task_struct {
 	int				softirq_context;
 	int				irq_config;
 #endif
+#ifdef CONFIG_PREEMPT_RT
+	int				softirq_disable_cnt;
+#endif
 
 #ifdef CONFIG_LOCKDEP
 # define MAX_LOCK_DEPTH			48UL
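
For illustration only, not part of the patch: the nesting semantics which the
new per task counter has to preserve across preemption on RT (simplified
sketch, the locking itself is added in patch 4/9):

	local_bh_disable();		/* outermost: acquires the per CPU BH local lock */
		local_bh_disable();	/* nested: only increments the count */
		/* the task can be preempted here, softirq_count() stays correct per task */
		local_bh_enable();	/* only decrements the count */
	local_bh_enable();		/* count drops to zero: releases the local lock */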



* [patch V2 2/9] irqtime: Make accounting correct on RT
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
  2020-12-04 17:01 ` [patch V2 1/9] softirq: Add RT specific softirq accounting Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-07  0:23   ` Frederic Weisbecker
  2020-12-07 13:27   ` Frederic Weisbecker
  2020-12-04 17:01 ` [patch V2 3/9] softirq: Move various protections into inline helpers Thomas Gleixner
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

vtime_account_irq() and irqtime_account_irq() base their checks on
preempt_count(), which fails on RT because preempt_count() does not contain
the softirq accounting, which is separate on RT.

These checks do not need the full preempt count as they only operate on the
hardirq and softirq sections.

Use irq_count() instead, which provides the correct value on both RT and
non-RT kernels. The compiler is clever enough to fold the masking for !RT:

       99b:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax
 -     9a2:	25 ff ff ff 7f       	and    $0x7fffffff,%eax
 +     9a2:	25 00 ff ff 00       	and    $0xffff00,%eax

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 kernel/sched/cputime.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -60,7 +60,7 @@ void irqtime_account_irq(struct task_str
 	cpu = smp_processor_id();
 	delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
 	irqtime->irq_start_time += delta;
-	pc = preempt_count() - offset;
+	pc = irq_count() - offset;
 
 	/*
 	 * We do not account for softirq time from ksoftirqd here.
@@ -421,7 +421,7 @@ void vtime_task_switch(struct task_struc
 
 void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
 {
-	unsigned int pc = preempt_count() - offset;
+	unsigned int pc = irq_count() - offset;
 
 	if (pc & HARDIRQ_OFFSET) {
 		vtime_account_hardirq(tsk);
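
For reference, not part of the patch: irq_count() is composed of the masks
adjusted in patch 1/9, so the softirq bits come from preempt_count() on !RT
and from current->softirq_disable_cnt on RT:

	#define irq_count()	(nmi_count() | hardirq_count() | softirq_count())

On !RT all three helpers read preempt_count(), which is what the compiler
folds into the single 'and' instruction shown above.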



* [patch V2 3/9] softirq: Move various protections into inline helpers
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
  2020-12-04 17:01 ` [patch V2 1/9] softirq: Add RT specific softirq accounting Thomas Gleixner
  2020-12-04 17:01 ` [patch V2 2/9] irqtime: Make accounting correct on RT Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-07 13:37   ` Frederic Weisbecker
  2020-12-04 17:01 ` [patch V2 4/9] softirq: Make softirq control and processing RT aware Thomas Gleixner
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

To allow reuse of the bulk of softirq processing code for RT and to avoid
#ifdeffery all over the place, split protections for various code sections
out into inline helpers so the RT variant can just replace them in one go.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Adapt to Frederic's rework
---
 kernel/softirq.c |   39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -204,6 +204,32 @@ void __local_bh_enable_ip(unsigned long
 }
 EXPORT_SYMBOL(__local_bh_enable_ip);
 
+static inline void softirq_handle_begin(void)
+{
+	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
+}
+
+static inline void softirq_handle_end(void)
+{
+	__local_bh_enable(SOFTIRQ_OFFSET);
+	WARN_ON_ONCE(in_interrupt());
+}
+
+static inline void ksoftirqd_run_begin(void)
+{
+	local_irq_disable();
+}
+
+static inline void ksoftirqd_run_end(void)
+{
+	local_irq_enable();
+}
+
+static inline bool should_wake_ksoftirqd(void)
+{
+	return true;
+}
+
 static inline void invoke_softirq(void)
 {
 	if (ksoftirqd_running(local_softirq_pending()))
@@ -316,7 +342,7 @@ asmlinkage __visible void __softirq_entr
 
 	pending = local_softirq_pending();
 
-	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
+	softirq_handle_begin();
 	in_hardirq = lockdep_softirq_start();
 	account_softirq_enter(current);
 
@@ -367,8 +393,7 @@ asmlinkage __visible void __softirq_entr
 
 	account_softirq_exit(current);
 	lockdep_softirq_end(in_hardirq);
-	__local_bh_enable(SOFTIRQ_OFFSET);
-	WARN_ON_ONCE(in_interrupt());
+	softirq_handle_end();
 	current_restore_flags(old_flags, PF_MEMALLOC);
 }
 
@@ -463,7 +488,7 @@ inline void raise_softirq_irqoff(unsigne
 	 * Otherwise we wake up ksoftirqd to make sure we
 	 * schedule the softirq soon.
 	 */
-	if (!in_interrupt())
+	if (!in_interrupt() && should_wake_ksoftirqd())
 		wakeup_softirqd();
 }
 
@@ -641,18 +666,18 @@ static int ksoftirqd_should_run(unsigned
 
 static void run_ksoftirqd(unsigned int cpu)
 {
-	local_irq_disable();
+	ksoftirqd_run_begin();
 	if (local_softirq_pending()) {
 		/*
 		 * We can safely run softirq on inline stack, as we are not deep
 		 * in the task stack here.
 		 */
 		__do_softirq();
-		local_irq_enable();
+		ksoftirqd_run_end();
 		cond_resched();
 		return;
 	}
-	local_irq_enable();
+	ksoftirqd_run_end();
 }
 
 #ifdef CONFIG_HOTPLUG_CPU



* [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (2 preceding siblings ...)
  2020-12-04 17:01 ` [patch V2 3/9] softirq: Move various protections into inline helpers Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-07 14:16   ` Frederic Weisbecker
                     ` (3 more replies)
  2020-12-04 17:01 ` [patch V2 5/9] tick/sched: Prevent false positive softirq pending warnings on RT Thomas Gleixner
                   ` (5 subsequent siblings)
  9 siblings, 4 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

Provide a local lock based serialization for soft interrupts on RT which
allows both local_bh_disable()'d sections and the servicing of soft
interrupts to be preemptible.

Provide the necessary inline helpers which allow reusing the bulk of the
softirq processing code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Adjusted to Frederic's changes
---
 include/linux/bottom_half.h |    2 
 kernel/softirq.c            |  188 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 182 insertions(+), 8 deletions(-)

--- a/include/linux/bottom_half.h
+++ b/include/linux/bottom_half.h
@@ -4,7 +4,7 @@
 
 #include <linux/preempt.h>
 
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_TRACE_IRQFLAGS)
 extern void __local_bh_disable_ip(unsigned long ip, unsigned int cnt);
 #else
 static __always_inline void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -13,6 +13,7 @@
 #include <linux/kernel_stat.h>
 #include <linux/interrupt.h>
 #include <linux/init.h>
+#include <linux/local_lock.h>
 #include <linux/mm.h>
 #include <linux/notifier.h>
 #include <linux/percpu.h>
@@ -100,20 +101,189 @@ EXPORT_PER_CPU_SYMBOL_GPL(hardirq_contex
 #endif
 
 /*
- * preempt_count and SOFTIRQ_OFFSET usage:
- * - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving
- *   softirq processing.
- * - preempt_count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET)
+ * SOFTIRQ_OFFSET usage:
+ *
+ * On !RT kernels 'count' is the preempt counter, on RT kernels this applies
+ * to a per CPU counter and to task::softirqs_disabled_cnt.
+ *
+ * - count is changed by SOFTIRQ_OFFSET on entering or leaving softirq
+ *   processing.
+ *
+ * - count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET)
  *   on local_bh_disable or local_bh_enable.
+ *
  * This lets us distinguish between whether we are currently processing
  * softirq and whether we just have bh disabled.
  */
+#ifdef CONFIG_PREEMPT_RT
+
+/*
+ * RT accounts for BH disabled sections in task::softirqs_disabled_cnt and
+ * also in per CPU softirq_ctrl::cnt. This is necessary to allow tasks in a
+ * softirq disabled section to be preempted.
+ *
+ * The per task counter is used for softirq_count(), in_softirq() and
+ * in_serving_softirqs() because these counts are only valid when the task
+ * holding softirq_ctrl::lock is running.
+ *
+ * The per CPU counter prevents pointless wakeups of ksoftirqd in case that
+ * the task which is in a softirq disabled section is preempted or blocks.
+ */
+struct softirq_ctrl {
+	local_lock_t	lock;
+	int		cnt;
+};
+
+static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
+	.lock	= INIT_LOCAL_LOCK(softirq_ctrl.lock),
+};
+
+void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
+{
+	unsigned long flags;
+	int newcnt;
+
+	WARN_ON_ONCE(in_hardirq());
+
+	/* First entry of a task into a BH disabled section? */
+	if (!current->softirq_disable_cnt) {
+		if (preemptible()) {
+			local_lock(&softirq_ctrl.lock);
+			/* Required to meet the RCU bottomhalf requirements. */
+			rcu_read_lock();
+		} else {
+			DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt));
+		}
+	}
+
+	/*
+	 * Track the per CPU softirq disabled state. On RT this is per CPU
+	 * state to allow preemption of bottom half disabled sections.
+	 */
+	newcnt = __this_cpu_add_return(softirq_ctrl.cnt, cnt);
+	/*
+	 * Reflect the result in the task state to prevent recursion on the
+	 * local lock and to make softirq_count() & al work.
+	 */
+	current->softirq_disable_cnt = newcnt;
+
+	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && newcnt == cnt) {
+		raw_local_irq_save(flags);
+		lockdep_softirqs_off(ip);
+		raw_local_irq_restore(flags);
+	}
+}
+EXPORT_SYMBOL(__local_bh_disable_ip);
+
+static void __local_bh_enable(unsigned int cnt, bool unlock)
+{
+	unsigned long flags;
+	int newcnt;
+
+	DEBUG_LOCKS_WARN_ON(current->softirq_disable_cnt !=
+			    this_cpu_read(softirq_ctrl.cnt));
+
+	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && softirq_count() == cnt) {
+		raw_local_irq_save(flags);
+		lockdep_softirqs_on(_RET_IP_);
+		raw_local_irq_restore(flags);
+	}
+
+	newcnt = __this_cpu_sub_return(softirq_ctrl.cnt, cnt);
+	current->softirq_disable_cnt = newcnt;
+
+	if (!newcnt && unlock) {
+		rcu_read_unlock();
+		local_unlock(&softirq_ctrl.lock);
+	}
+}
+
+void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
+{
+	bool preempt_on = preemptible();
+	unsigned long flags;
+	u32 pending;
+	int curcnt;
+
+	WARN_ON_ONCE(in_irq());
+	lockdep_assert_irqs_enabled();
+
+	local_irq_save(flags);
+	curcnt = this_cpu_read(softirq_ctrl.cnt);
+
+	/*
+	 * If this is not reenabling soft interrupts, no point in trying to
+	 * run pending ones.
+	 */
+	if (curcnt != cnt)
+		goto out;
+
+	pending = local_softirq_pending();
+	if (!pending || ksoftirqd_running(pending))
+		goto out;
+
+	/*
+	 * If this was called from non preemptible context, wake up the
+	 * softirq daemon.
+	 */
+	if (!preempt_on) {
+		wakeup_softirqd();
+		goto out;
+	}
+
+	/*
+	 * Adjust softirq count to SOFTIRQ_OFFSET which makes
+	 * in_serving_softirq() become true.
+	 */
+	cnt = SOFTIRQ_OFFSET;
+	__local_bh_enable(cnt, false);
+	__do_softirq();
+
+out:
+	__local_bh_enable(cnt, preempt_on);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL(__local_bh_enable_ip);
+
+/*
+ * Invoked from ksoftirqd_run() outside of the interrupt disabled section
+ * to acquire the per CPU local lock for reentrancy protection.
+ */
+static inline void ksoftirqd_run_begin(void)
+{
+	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
+	local_irq_disable();
+}
+
+/* Counterpart to ksoftirqd_run_begin() */
+static inline void ksoftirqd_run_end(void)
+{
+	__local_bh_enable(SOFTIRQ_OFFSET, true);
+	WARN_ON_ONCE(in_interrupt());
+	local_irq_enable();
+}
+
+static inline void softirq_handle_begin(void) { }
+static inline void softirq_handle_end(void) { }
+
+static inline bool should_wake_ksoftirqd(void)
+{
+	return !this_cpu_read(softirq_ctrl.cnt);
+}
+
+static inline void invoke_softirq(void)
+{
+	if (should_wake_ksoftirqd())
+		wakeup_softirqd();
+}
+
+#else /* CONFIG_PREEMPT_RT */
 
-#ifdef CONFIG_TRACE_IRQFLAGS
 /*
- * This is for softirq.c-internal use, where hardirqs are disabled
+ * This one is for softirq.c-internal use, where hardirqs are disabled
  * legitimately:
  */
+#ifdef CONFIG_TRACE_IRQFLAGS
 void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
 {
 	unsigned long flags;
@@ -274,6 +444,8 @@ asmlinkage __visible void do_softirq(voi
 	local_irq_restore(flags);
 }
 
+#endif /* !CONFIG_PREEMPT_RT */
+
 /*
  * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
  * but break the loop if need_resched() is set or after 2 ms.
@@ -378,8 +550,10 @@ asmlinkage __visible void __softirq_entr
 		pending >>= softirq_bit;
 	}
 
-	if (__this_cpu_read(ksoftirqd) == current)
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) &&
+	    __this_cpu_read(ksoftirqd) == current)
 		rcu_softirq_qs();
+
 	local_irq_disable();
 
 	pending = local_softirq_pending();



* [patch V2 5/9] tick/sched: Prevent false positive softirq pending warnings on RT
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (3 preceding siblings ...)
  2020-12-04 17:01 ` [patch V2 4/9] softirq: Make softirq control and processing RT aware Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-08 12:23   ` Frederic Weisbecker
  2020-12-04 17:01 ` [patch V2 6/9] rcu: Prevent false positive softirq warning " Thomas Gleixner
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

On RT a task which has soft interrupts disabled can block on a lock and
schedule out to idle while soft interrupts are pending. This triggers the
warning in the NOHZ idle code which complains about going idle with pending
soft interrupts. But as the task is blocked, soft interrupt processing is
temporarily blocked as well, which means that such a warning is a false
positive.

To prevent that, check the per CPU state which indicates that a scheduled
out task has soft interrupts disabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/bottom_half.h |    6 ++++++
 kernel/softirq.c            |   15 +++++++++++++++
 kernel/time/tick-sched.c    |    2 +-
 3 files changed, 22 insertions(+), 1 deletion(-)

--- a/include/linux/bottom_half.h
+++ b/include/linux/bottom_half.h
@@ -32,4 +32,10 @@ static inline void local_bh_enable(void)
 	__local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
 }
 
+#ifdef CONFIG_PREEMPT_RT
+extern bool local_bh_blocked(void);
+#else
+static inline bool local_bh_blocked(void) { return false; }
+#endif
+
 #endif /* _LINUX_BH_H */
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -138,6 +138,21 @@ static DEFINE_PER_CPU(struct softirq_ctr
 	.lock	= INIT_LOCAL_LOCK(softirq_ctrl.lock),
 };
 
+/**
+ * local_bh_blocked() - Check for idle whether BH processing is blocked
+ *
+ * Returns false if the per CPU softirq_ctrl::cnt is 0, otherwise true.
+ *
+ * This is invoked from the idle task to guard against false positive
+ * softirq pending warnings, which would happen when the task which holds
+ * softirq_ctrl::lock was the only running task on the CPU and blocks on
+ * some other lock.
+ */
+bool local_bh_blocked(void)
+{
+	return this_cpu_read(softirq_ctrl.cnt) != 0;
+}
+
 void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
 {
 	unsigned long flags;
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -925,7 +925,7 @@ static bool can_stop_idle_tick(int cpu,
 	if (unlikely(local_softirq_pending())) {
 		static int ratelimit;
 
-		if (ratelimit < 10 &&
+		if (ratelimit < 10 && !local_bh_blocked() &&
 		    (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
 			pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
 				(unsigned int) local_softirq_pending());



* [patch V2 6/9] rcu: Prevent false positive softirq warning on RT
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (4 preceding siblings ...)
  2020-12-04 17:01 ` [patch V2 5/9] tick/sched: Prevent false positive softirq pending warnings on RT Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-04 17:59   ` Paul E. McKenney
  2020-12-04 17:01 ` [patch V2 7/9] softirq: Replace barrier() with cpu_relax() in tasklet_unlock_wait() Thomas Gleixner
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

Soft interrupt disabled sections can legitimately be preempted or schedule
out when blocking on a lock on RT enabled kernels, so the RCU preempt check
warning has to be disabled for RT kernels.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/rcupdate.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -319,7 +319,8 @@ static inline void rcu_preempt_sleep_che
 #define rcu_sleep_check()						\
 	do {								\
 		rcu_preempt_sleep_check();				\
-		RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map),	\
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))			\
+		    RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map),	\
 				 "Illegal context switch in RCU-bh read-side critical section"); \
 		RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map),	\
 				 "Illegal context switch in RCU-sched read-side critical section"); \



* [patch V2 7/9] softirq: Replace barrier() with cpu_relax() in tasklet_unlock_wait()
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (5 preceding siblings ...)
  2020-12-04 17:01 ` [patch V2 6/9] rcu: Prevent false positive softirq warning " Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-07 11:39   ` Peter Zijlstra
  2020-12-04 17:01 ` [patch V2 8/9] tasklets: Use static inlines for stub implementations Thomas Gleixner
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

A barrier() in a tight loop which waits for something to happen on a remote
CPU is a pointless exercise. Replace it with cpu_relax() which allows HT
siblings to make progress.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/interrupt.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -668,7 +668,8 @@ static inline void tasklet_unlock(struct
 
 static inline void tasklet_unlock_wait(struct tasklet_struct *t)
 {
-	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
+	while (test_bit(TASKLET_STATE_RUN, &(t)->state))
+		cpu_relax();
 }
 #else
 #define tasklet_trylock(t) 1




* [patch V2 8/9] tasklets: Use static inlines for stub implementations
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (6 preceding siblings ...)
  2020-12-04 17:01 ` [patch V2 7/9] softirq: Replace barrier() with cpu_relax() in tasklet_unlock_wait() Thomas Gleixner
@ 2020-12-04 17:01 ` Thomas Gleixner
  2020-12-04 17:02 ` [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT Thomas Gleixner
  2020-12-06 10:05 ` [patch V2 0/9] softirq: Make it RT aware Sebastian Andrzej Siewior
  9 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:01 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

Inlines exist for a reason.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/interrupt.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -672,9 +672,9 @@ static inline void tasklet_unlock_wait(s
 		cpu_relax();
 }
 #else
-#define tasklet_trylock(t) 1
-#define tasklet_unlock_wait(t) do { } while (0)
-#define tasklet_unlock(t) do { } while (0)
+static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; }
+static inline void tasklet_unlock(struct tasklet_struct *t) { }
+static inline void tasklet_unlock_wait(struct tasklet_struct *t) { }
 #endif
 
 extern void __tasklet_schedule(struct tasklet_struct *t);




* [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (7 preceding siblings ...)
  2020-12-04 17:01 ` [patch V2 8/9] tasklets: Use static inlines for stub implementations Thomas Gleixner
@ 2020-12-04 17:02 ` Thomas Gleixner
  2020-12-07 11:47   ` Peter Zijlstra
  2020-12-06 10:05 ` [patch V2 0/9] softirq: Make it RT aware Sebastian Andrzej Siewior
  9 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-04 17:02 UTC (permalink / raw)
  To: LKML
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior

From: Thomas Gleixner <tglx@linutronix.de>

tasklet_kill() and tasklet_unlock_wait() spin and wait for the
TASKLET_STATE_SCHED and TASKLET_STATE_RUN bits, respectively, to be cleared
in the tasklet state. This works nicely on !RT because the corresponding
execution can only happen on a different CPU.

On RT softirq processing is preemptible, therefore a task preempting the
softirq processing thread can spin forever.

Prevent this by invoking local_bh_disable()/enable() inside the loop. In
case the softirq processing thread was preempted by the current task,
current will block on the local lock, which yields the CPU to the preempted
softirq processing thread. If the tasklet is processed on a different CPU
then the local_bh_disable()/enable() pair is just a waste of processor
cycles.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/interrupt.h |    8 ++------
 kernel/softirq.c          |   38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 39 insertions(+), 7 deletions(-)

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -654,7 +654,7 @@ enum
 	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
 };
 
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
 static inline int tasklet_trylock(struct tasklet_struct *t)
 {
 	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
@@ -666,11 +666,7 @@ static inline void tasklet_unlock(struct
 	clear_bit(TASKLET_STATE_RUN, &(t)->state);
 }
 
-static inline void tasklet_unlock_wait(struct tasklet_struct *t)
-{
-	while (test_bit(TASKLET_STATE_RUN, &(t)->state))
-		cpu_relax();
-}
+void tasklet_unlock_wait(struct tasklet_struct *t);
 #else
 static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; }
 static inline void tasklet_unlock(struct tasklet_struct *t) { }
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -818,6 +818,29 @@ void tasklet_init(struct tasklet_struct
 }
 EXPORT_SYMBOL(tasklet_init);
 
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
+
+void tasklet_unlock_wait(struct tasklet_struct *t)
+{
+	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) {
+		if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+			/*
+			 * Prevent a live lock when current preempted soft
+			 * interrupt processing or prevents ksoftirqd from
+			 * running. If the tasklet runs on a different CPU
+			 * then this has no effect other than doing the BH
+			 * disable/enable dance for nothing.
+			 */
+			local_bh_disable();
+			local_bh_enable();
+		} else {
+			cpu_relax();
+		}
+	}
+}
+EXPORT_SYMBOL(tasklet_unlock_wait);
+#endif
+
 void tasklet_kill(struct tasklet_struct *t)
 {
 	if (in_interrupt())
@@ -825,7 +848,20 @@ void tasklet_kill(struct tasklet_struct
 
 	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
 		do {
-			yield();
+			if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+				/*
+				 * Prevent a live lock when current
+				 * preempted soft interrupt processing or
+				 * prevents ksoftirqd from running. If the
+				 * tasklet runs on a different CPU then
+				 * this has no effect other than doing the
+				 * BH disable/enable dance for nothing.
+				 */
+				local_bh_disable();
+				local_bh_enable();
+			} else {
+				yield();
+			}
 		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
 	}
 	tasklet_unlock_wait(t);



* Re: [patch V2 6/9] rcu: Prevent false positive softirq warning on RT
  2020-12-04 17:01 ` [patch V2 6/9] rcu: Prevent false positive softirq warning " Thomas Gleixner
@ 2020-12-04 17:59   ` Paul E. McKenney
  0 siblings, 0 replies; 38+ messages in thread
From: Paul E. McKenney @ 2020-12-04 17:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Frederic Weisbecker, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:57PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Soft interrupt disabled sections can legitimately be preempted or schedule
> out when blocking on a lock on RT enabled kernels, so the RCU preempt check
> warning has to be disabled for RT kernels.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/rcupdate.h |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -319,7 +319,8 @@ static inline void rcu_preempt_sleep_che
>  #define rcu_sleep_check()						\
>  	do {								\
>  		rcu_preempt_sleep_check();				\
> -		RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map),	\
> +		if (!IS_ENABLED(CONFIG_PREEMPT_RT))			\
> +		    RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map),	\
>  				 "Illegal context switch in RCU-bh read-side critical section"); \
>  		RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map),	\
>  				 "Illegal context switch in RCU-sched read-side critical section"); \
> 


* Re: [patch V2 0/9] softirq: Make it RT aware
  2020-12-04 17:01 [patch V2 0/9] softirq: Make it RT aware Thomas Gleixner
                   ` (8 preceding siblings ...)
  2020-12-04 17:02 ` [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT Thomas Gleixner
@ 2020-12-06 10:05 ` Sebastian Andrzej Siewior
  9 siblings, 0 replies; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-06 10:05 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Peter Zijlstra, Frederic Weisbecker, Paul McKenney

On 2020-12-04 18:01:51 [+0100], Thomas Gleixner wrote:
> The RT variant has successfully been tested in the current 5.10-rt
> patches. For non-RT kernels there is no functional change.

this series is part of v5.10-rc6-rt14.

Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> Thanks,
> 
> 	tglx

Sebastian


* Re: [patch V2 2/9] irqtime: Make accounting correct on RT
  2020-12-04 17:01 ` [patch V2 2/9] irqtime: Make accounting correct on RT Thomas Gleixner
@ 2020-12-07  0:23   ` Frederic Weisbecker
  2020-12-07  0:57     ` Thomas Gleixner
  2020-12-07 13:27   ` Frederic Weisbecker
  1 sibling, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-07  0:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:53PM +0100, Thomas Gleixner wrote:
> vtime_account_irq() and irqtime_account_irq() base their checks on
> preempt_count(), which fails on RT because preempt_count() does not contain
> the softirq accounting, which is separate on RT.
> 
> These checks do not need the full preempt count as they only operate on the
> hardirq and softirq sections.
> 
> Use irq_count() instead, which provides the correct value on both RT and
> non-RT kernels. The compiler is clever enough to fold the masking for !RT:
> 
>        99b:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax
>  -     9a2:	25 ff ff ff 7f       	and    $0x7fffffff,%eax
>  +     9a2:	25 00 ff ff 00       	and    $0xffff00,%eax
> 
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch
> ---
>  kernel/sched/cputime.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -60,7 +60,7 @@ void irqtime_account_irq(struct task_str
>  	cpu = smp_processor_id();
>  	delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
>  	irqtime->irq_start_time += delta;
> -	pc = preempt_count() - offset;
> +	pc = irq_count() - offset;

There are many preempt_count() users all around waiting for similar issues.
Wouldn't it be more reasonable to have current->softirq_disable_cnt just saving
the softirq count on context switch?

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d2003a7d5ab5..6c899c35d6ba 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3469,6 +3469,10 @@ static inline void prepare_task(struct task_struct *next)
 
 static inline void finish_task(struct task_struct *prev)
 {
+#ifdef CONFIG_PREEMPT_RT
+	prev->softirq_disable_cnt = softirq_count();
+	__preempt_count_sub(prev->softirq_disable_cnt);
+#endif
 #ifdef CONFIG_SMP
 	/*
 	 * This must be the very last reference to @prev from this CPU. After
@@ -3610,6 +3614,9 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	vtime_task_switch(prev);
 	perf_event_task_sched_in(prev, current);
 	finish_task(prev);
+#ifdef CONFIG_PREEMPT_RT	
+	__preempt_count_add(current->softirq_disable_cnt);
+#endif
 	finish_lock_switch(rq);
 	finish_arch_post_lock_switch();
 	kcov_finish_switch(current);


And I expect a few adjustments in schedule_debug() and atomic checks more
generally but that should be it and it would probably be less error-prone.
Although I'm probably overlooking other issues on the way.

Thanks.

>  
>  	/*
>  	 * We do not account for softirq time from ksoftirqd here.
> @@ -421,7 +421,7 @@ void vtime_task_switch(struct task_struc
>  
>  void vtime_account_irq(struct task_struct *tsk, unsigned int offset)
>  {
> -	unsigned int pc = preempt_count() - offset;
> +	unsigned int pc = irq_count() - offset;
>  
>  	if (pc & HARDIRQ_OFFSET) {
>  		vtime_account_hardirq(tsk);
> 


* Re: [patch V2 2/9] irqtime: Make accounting correct on RT
  2020-12-07  0:23   ` Frederic Weisbecker
@ 2020-12-07  0:57     ` Thomas Gleixner
  2020-12-07  1:14       ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07  0:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Mon, Dec 07 2020 at 01:23, Frederic Weisbecker wrote:
>> --- a/kernel/sched/cputime.c
>> +++ b/kernel/sched/cputime.c
>> @@ -60,7 +60,7 @@ void irqtime_account_irq(struct task_str
>>  	cpu = smp_processor_id();
>>  	delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
>>  	irqtime->irq_start_time += delta;
>> -	pc = preempt_count() - offset;
>> +	pc = irq_count() - offset;
>
> There are many preempt_count() users all around waiting for similar issues.
> Wouldn't it be more reasonable to have current->softirq_disable_cnt just saving
> the softirq count on context switch?

There are not that many and all of them need to be looked at.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab5..6c899c35d6ba 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3469,6 +3469,10 @@ static inline void prepare_task(struct task_struct *next)
>  
>  static inline void finish_task(struct task_struct *prev)
>  {
> +#ifdef CONFIG_PREEMPT_RT
> +	prev->softirq_disable_cnt = softirq_count();
> +	__preempt_count_sub(prev->softirq_disable_cnt);
> +#endif

You fundamentally break RT with that.

If local_bh_disable() fiddles with the actual preempt_count on RT, then
softirq disabled sections and softirq processing are no longer
preemptible.
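
For illustration, the !RT fast path which that would wreck looks roughly
like this (the CONFIG_PREEMPTION variant in include/linux/preempt.h,
simplified):

	#define preempt_enable()				\
	do {							\
		barrier();					\
		if (unlikely(preempt_count_dec_and_test()))	\
			__preempt_schedule();			\
	} while (0)

Any non-zero preempt_count(), including folded in softirq bits, suppresses
the reschedule here, so folding the RT softirq accounting back into
preempt_count() would make every BH disabled section and all softirq
processing non-preemptible again.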

You asked me in the last round of patches to add a proper argument for
pulling out the softirq count from preempt_count. Here is the revised
changelog which you agreed with:

 "RT requires the softirq processing and local bottomhalf disabled regions to
  be preemptible. Using the normal preempt count based serialization is
  therefore not possible because this implicitely disables preemption.
  ....
 "

Full text in patch 1/9.

According to the above, folding the softirq count into the actual preempt
count cannot work at all.

The current RT approach just works, except for the places which look at
the raw preempt_count instead of using the wrappers. Those places are
restricted to core code and are small in number.

Trying to do what you suggest would be major surgery all over the
place, including a complete trainwreck for the highly optimized
preempt_enable() --> preempt decision.

> And I expect a few adjustments in schedule_debug() and atomic checks more
> generally but that should be it and it would probably be less error-prone.

Dream on.

> Although I'm probably overlooking other issues on the way.

I think so.

Thanks,

        tglx


* Re: [patch V2 2/9] irqtime: Make accounting correct on RT
  2020-12-07  0:57     ` Thomas Gleixner
@ 2020-12-07  1:14       ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-07  1:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Mon, Dec 07, 2020 at 01:57:25AM +0100, Thomas Gleixner wrote:
> On Mon, Dec 07 2020 at 01:23, Frederic Weisbecker wrote:
> >> --- a/kernel/sched/cputime.c
> >> +++ b/kernel/sched/cputime.c
> >> @@ -60,7 +60,7 @@ void irqtime_account_irq(struct task_str
> >>  	cpu = smp_processor_id();
> >>  	delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
> >>  	irqtime->irq_start_time += delta;
> >> -	pc = preempt_count() - offset;
> >> +	pc = irq_count() - offset;
> >
> > There are many preempt_count() users all around waiting for similar issues.
> > Wouldn't it be more reasonable to have current->softirq_disable_cnt just saving
> > the softirq count on context switch?
> 
> There are not that many and all of them need to be looked at.
> 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index d2003a7d5ab5..6c899c35d6ba 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3469,6 +3469,10 @@ static inline void prepare_task(struct task_struct *next)
> >  
> >  static inline void finish_task(struct task_struct *prev)
> >  {
> > +#ifdef CONFIG_PREEMPT_RT
> > +	prev->softirq_disable_cnt = softirq_count();
> > +	__preempt_count_sub(prev->softirq_disable_cnt);
> > +#endif
> 
> You fundamentaly break RT with that.
> 
> If local_bh_disable() fiddles with the actual preempt_count on RT then
> softirq disabled sections and softirq processing are not longer
> preemtible.
> 
> You asked me in the last round of patches to add a proper argument for
> pulling out the softirq count from preempt_count. Here is the revised
> changelog which you agreed with:
> 
>  "RT requires the softirq processing and local bottomhalf disabled regions to
>   be preemptible. Using the normal preempt count based serialization is
>   therefore not possible because this implicitely disables preemption.
>   ....
>  "
> 
> Full text in patch 1/9.
> 
> According to the above folding of softirq count into the actual preempt
> count cannot work at all.
> 
> The current RT approach just works except for the places which look at
> the raw preempt_count and not using the wrappers. Those places are
> restricted to core code and a pretty small number.
> 
> Trying to do what you suggest would be a major surgery all over the
> place including a complete trainwreck on the highly optimized
> preempt_enable() --> preempt decision.

I suspected it was more complicated than I imagined :-)
Nevermind.

Thanks.


* Re: [patch V2 7/9] softirq: Replace barrier() with cpu_relax() in tasklet_unlock_wait()
  2020-12-04 17:01 ` [patch V2 7/9] softirq: Replace barrier() with cpu_relax() in tasklet_unlock_wait() Thomas Gleixner
@ 2020-12-07 11:39   ` Peter Zijlstra
  2020-12-07 15:21     ` Thomas Gleixner
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-12-07 11:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:58PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> A barrier() in a tight loop which waits for something to happen on a remote
> CPU is a pointless exercise. Replace it with cpu_relax() which allows HT
> siblings to make progress.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/interrupt.h |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -668,7 +668,8 @@ static inline void tasklet_unlock(struct
>  
>  static inline void tasklet_unlock_wait(struct tasklet_struct *t)
>  {
> -	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
> +	while (test_bit(TASKLET_STATE_RUN, &(t)->state))
> +		cpu_relax();
>  }

Wouldn't it be nicer to stick a completion in tasklet_struct ? Or at the
very least use wait_var_event() or something?


diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index ee8299eb1f52..7818085ac003 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -663,12 +663,14 @@ static inline int tasklet_trylock(struct tasklet_struct *t)
 static inline void tasklet_unlock(struct tasklet_struct *t)
 {
 	smp_mb__before_atomic();
-	clear_bit(TASKLET_STATE_RUN, &(t)->state);
+	clear_bit(TASKLET_STATE_RUN, &t->state);
+	smp_mb__after_atomic();
+	wake_up_var(&t->state);
 }
 
 static inline void tasklet_unlock_wait(struct tasklet_struct *t)
 {
-	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
+	wait_var_event(&t->state, !test_bit(TASKLET_STATE_RUN, &t->state));
 }
 #else
 #define tasklet_trylock(t) 1


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-04 17:02 ` [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT Thomas Gleixner
@ 2020-12-07 11:47   ` Peter Zijlstra
  2020-12-07 14:00     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-12-07 11:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:02:00PM +0100, Thomas Gleixner wrote:
> @@ -825,7 +848,20 @@ void tasklet_kill(struct tasklet_struct
>  
>  	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>  		do {
> -			yield();
>  		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
>  	}
>  	tasklet_unlock_wait(t);


Egads... should we not start by doing something like this?


---
diff --git a/kernel/softirq.c b/kernel/softirq.c
index d5bfd5e661fc..95ff5b7f1833 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -529,6 +529,16 @@ void __tasklet_hi_schedule(struct tasklet_struct *t)
 }
 EXPORT_SYMBOL(__tasklet_hi_schedule);
 
+static inline bool tasklet_clear_sched(struct tasklet_struct *t)
+{
+	if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)) {
+		wake_up_var(&t->state);
+		return true;
+	}
+
+	return false;
+}
+
 static void tasklet_action_common(struct softirq_action *a,
 				  struct tasklet_head *tl_head,
 				  unsigned int softirq_nr)
@@ -548,8 +558,7 @@ static void tasklet_action_common(struct softirq_action *a,
 
 		if (tasklet_trylock(t)) {
 			if (!atomic_read(&t->count)) {
-				if (!test_and_clear_bit(TASKLET_STATE_SCHED,
-							&t->state))
+				if (!tasklet_clear_sched(t))
 					BUG();
 				if (t->use_callback)
 					t->callback(t);
@@ -609,13 +618,11 @@ void tasklet_kill(struct tasklet_struct *t)
 	if (in_interrupt())
 		pr_notice("Attempt to kill tasklet from interrupt\n");
 
-	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
-		do {
-			yield();
-		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
-	}
+	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
+		wait_var_event(&t->state, !test_bit(TASKLET_STATE_SCHED, &t->state));
+
 	tasklet_unlock_wait(t);
-	clear_bit(TASKLET_STATE_SCHED, &t->state);
+	tasklet_clear_sched(t);
 }
 EXPORT_SYMBOL(tasklet_kill);
 


* Re: [patch V2 1/9] softirq: Add RT specific softirq accounting
  2020-12-04 17:01 ` [patch V2 1/9] softirq: Add RT specific softirq accounting Thomas Gleixner
@ 2020-12-07 13:06   ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 13:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:52PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> RT requires softirq processing and local bottom half disabled regions to
> be preemptible. Using the normal preempt count based serialization is
> therefore not possible because this implicitly disables preemption.
> 
> RT kernels use a per CPU local lock to serialize bottom halves. As
> local_bh_disable() can nest, the lock can only be acquired on the outermost
> invocation of local_bh_disable() and released when the nest count becomes
> zero. Tasks which hold the local lock can be preempted, so it is required to
> keep track of the nest count per task.
> 
> Add an RT only counter to task_struct and adjust the relevant macros in
> preempt.h.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>


* Re: [patch V2 2/9] irqtime: Make accounting correct on RT
  2020-12-04 17:01 ` [patch V2 2/9] irqtime: Make accounting correct on RT Thomas Gleixner
  2020-12-07  0:23   ` Frederic Weisbecker
@ 2020-12-07 13:27   ` Frederic Weisbecker
  2020-12-07 14:44     ` Thomas Gleixner
  1 sibling, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 13:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:53PM +0100, Thomas Gleixner wrote:
> vtime_account_irq() and irqtime_account_irq() base their checks on
> preempt_count(), which fails on RT because preempt_count() does not contain
> the softirq accounting, which is separate on RT.
> 
> These checks do not need the full preempt count as they only operate on the
> hardirq and softirq sections.
> 
> Use irq_count() instead, which provides the correct value on both RT and
> non-RT kernels. The compiler is clever enough to fold the masking for !RT:
> 
>        99b:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax
>  -     9a2:	25 ff ff ff 7f       	and    $0x7fffffff,%eax
>  +     9a2:	25 00 ff ff 00       	and    $0xffff00,%eax
> 
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Also I'm seeing a few other explicit SOFTIRQ_MASK checks on top
of preempt_count(), especially on RCU:

   $ git grep SOFTIRQ_MASK
   arch/sh/kernel/irq.c:                   (irqctx->tinfo.preempt_count & ~SOFTIRQ_MASK) |
   arch/sh/kernel/irq.c:                   (curctx->tinfo.preempt_count & SOFTIRQ_MASK);
   kernel/rcu/rcutorture.c:                if (preempt_count() & (SOFTIRQ_MASK | HARDIRQ_MASK))
   kernel/rcu/tree_exp.h:          if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) ||
   kernel/rcu/tree_plugin.h:                       !!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK));
   kernel/rcu/tree_plugin.h:           (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {

I guess some RT RCU config take care of that?

Also tracing:

     kernel/trace/ring_buffer.c:     if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
     kernel/trace/trace.c:           ((pc & SOFTIRQ_OFFSET) ? TRACE_FLAG_SOFTIRQ : 0)

Thanks.


* Re: [patch V2 3/9] softirq: Move various protections into inline helpers
  2020-12-04 17:01 ` [patch V2 3/9] softirq: Move various protections into inline helpers Thomas Gleixner
@ 2020-12-07 13:37   ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 13:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:54PM +0100, Thomas Gleixner wrote:
> To allow reuse of the bulk of softirq processing code for RT and to avoid
> #ifdeffery all over the place, split protections for various code sections
> out into inline helpers so the RT variant can just replace them in one go.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 11:47   ` Peter Zijlstra
@ 2020-12-07 14:00     ` Sebastian Andrzej Siewior
  2020-12-07 14:27       ` Peter Zijlstra
  2020-12-07 15:22       ` Thomas Gleixner
  0 siblings, 2 replies; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-07 14:00 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Thomas Gleixner, LKML, Frederic Weisbecker, Paul McKenney

On 2020-12-07 12:47:43 [+0100], Peter Zijlstra wrote:
> On Fri, Dec 04, 2020 at 06:02:00PM +0100, Thomas Gleixner wrote:
> > @@ -825,7 +848,20 @@ void tasklet_kill(struct tasklet_struct
> >  
> >  	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
> >  		do {
> > -			yield();
> >  		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
> >  	}
> >  	tasklet_unlock_wait(t);
> 
> 
> Egads... should we not start by doing something like this?

So we keep the RT part as-is and replace the non-RT bits with this?

Sebastian


* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-04 17:01 ` [patch V2 4/9] softirq: Make softirq control and processing RT aware Thomas Gleixner
@ 2020-12-07 14:16   ` Frederic Weisbecker
  2020-12-07 15:08     ` Thomas Gleixner
  2020-12-08  0:08   ` Frederic Weisbecker
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-07 14:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
> +void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
> +{
> +	unsigned long flags;
> +	int newcnt;
> +
> +	WARN_ON_ONCE(in_hardirq());
> +
> +	/* First entry of a task into a BH disabled section? */
> +	if (!current->softirq_disable_cnt) {
> +		if (preemptible()) {
> +			local_lock(&softirq_ctrl.lock);
> +			/* Required to meet the RCU bottomhalf requirements. */
> +			rcu_read_lock();
> +		} else {
> +			DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt));

So, to be clear this adds a new constraint where we can't call
local_bh_disable() inside a preempt disabled section? I guess the rest of the
RT code chased all the new offenders :-)

> +		}
> +	}
> +
> +	/*
> +	 * Track the per CPU softirq disabled state. On RT this is per CPU
> +	 * state to allow preemption of bottom half disabled sections.
> +	 */
> +	newcnt = __this_cpu_add_return(softirq_ctrl.cnt, cnt);
> +	/*
> +	 * Reflect the result in the task state to prevent recursion on the
> +	 * local lock and to make softirq_count() & al work.
> +	 */
> +	current->softirq_disable_cnt = newcnt;
> +
> +	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && newcnt == cnt) {
> +		raw_local_irq_save(flags);
> +		lockdep_softirqs_off(ip);
> +		raw_local_irq_restore(flags);
> +	}
> +}
> +EXPORT_SYMBOL(__local_bh_disable_ip);
> +
[...]
> +
> +void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
> +{
> +	bool preempt_on = preemptible();
> +	unsigned long flags;
> +	u32 pending;
> +	int curcnt;
> +
> +	WARN_ON_ONCE(in_irq());
> +	lockdep_assert_irqs_enabled();
> +
> +	local_irq_save(flags);
> +	curcnt = this_cpu_read(softirq_ctrl.cnt);

__this_cpu_read() ?

> +
> +	/*
> +	 * If this is not reenabling soft interrupts, no point in trying to
> +	 * run pending ones.
> +	 */
> +	if (curcnt != cnt)
> +		goto out;

I guess you could move the local_irq_save() here?

> +	pending = local_softirq_pending();
> +	if (!pending || ksoftirqd_running(pending))
> +		goto out;
> +
> +	/*
> +	 * If this was called from non preemptible context, wake up the
> +	 * softirq daemon.
> +	 */
> +	if (!preempt_on) {
> +		wakeup_softirqd();
> +		goto out;
> +	}
> +
> +	/*
> +	 * Adjust softirq count to SOFTIRQ_OFFSET which makes
> +	 * in_serving_softirq() become true.
> +	 */
> +	cnt = SOFTIRQ_OFFSET;
> +	__local_bh_enable(cnt, false);
> +	__do_softirq();
> +
> +out:
> +	__local_bh_enable(cnt, preempt_on);
> +	local_irq_restore(flags);
> +}
> +EXPORT_SYMBOL(__local_bh_enable_ip);

Thanks.


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 14:00     ` Sebastian Andrzej Siewior
@ 2020-12-07 14:27       ` Peter Zijlstra
  2020-12-07 17:55         ` Thomas Gleixner
  2020-12-07 15:22       ` Thomas Gleixner
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-12-07 14:27 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, LKML, Frederic Weisbecker, Paul McKenney

On Mon, Dec 07, 2020 at 03:00:40PM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 12:47:43 [+0100], Peter Zijlstra wrote:
> > On Fri, Dec 04, 2020 at 06:02:00PM +0100, Thomas Gleixner wrote:
> > > @@ -825,7 +848,20 @@ void tasklet_kill(struct tasklet_struct
> > >  
> > >  	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
> > >  		do {
> > > -			yield();
> > >  		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
> > >  	}
> > >  	tasklet_unlock_wait(t);
> > 
> > 
> > Egads... should we not start by doing something like this?
> 
> So we keep the RT part as-is and replace the non-RT bits with this?

For RT you probably want to wrap the wait_var_event() in that
local_bh_disable()/enable() pair. I just figured those unbounded
spin/yield loops suck and we should get rid of them.


* Re: [patch V2 2/9] irqtime: Make accounting correct on RT
  2020-12-07 13:27   ` Frederic Weisbecker
@ 2020-12-07 14:44     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 14:44 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Mon, Dec 07 2020 at 14:27, Frederic Weisbecker wrote:
> On Fri, Dec 04, 2020 at 06:01:53PM +0100, Thomas Gleixner wrote:
>> vtime_account_irq and irqtime_account_irq() base checks on preempt_count()
>> which fails on RT because preempt_count() does not contain the softirq
>> accounting which is separate on RT.
>> 
>> These checks do not need the full preempt count as they only operate on the
>> hard and softirq sections.
>> 
>> Use irq_count() instead which provides the correct value on both RT and non
>> RT kernels. The compiler is clever enough to fold the masking for !RT:
>> 
>>        99b:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax
>>  -     9a2:	25 ff ff ff 7f       	and    $0x7fffffff,%eax
>>  +     9a2:	25 00 ff ff 00       	and    $0xffff00,%eax
>> 
>> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
>
> Also I'm seeing a few other explicit SOFTIRQ_MASK checks on top
> of preempt_count(), especially on RCU:
>
>    $ git grep SOFTIRQ_MASK
>    arch/sh/kernel/irq.c:                   (irqctx->tinfo.preempt_count & ~SOFTIRQ_MASK) |
>    arch/sh/kernel/irq.c:                   (curctx->tinfo.preempt_count & SOFTIRQ_MASK);
>    kernel/rcu/rcutorture.c:                if (preempt_count() & (SOFTIRQ_MASK | HARDIRQ_MASK))
>    kernel/rcu/tree_exp.h:          if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) ||
>    kernel/rcu/tree_plugin.h:                       !!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK));
>    kernel/rcu/tree_plugin.h:           (preempt_count() &
>    (PREEMPT_MASK | SOFTIRQ_MASK))) {
>
> I guess some RT RCU config take care of that?

I ignored sh so far and RCU has some quirks for RT vs. softirqs.

> Also tracing:
>
>      kernel/trace/ring_buffer.c:     if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
>      kernel/trace/trace.c:           ((pc & SOFTIRQ_OFFSET) ? TRACE_FLAG_SOFTIRQ : 0)

Bah, I somehow lost the cure for that while rebasing.

Thanks,

        tglx
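
For reference, the two constants in the disassembly quoted above can be decoded with the preempt_count() bit layout of that era; the values below are taken from include/linux/preempt.h around v5.10 and should be treated as an assumption of this note:

	/*
	 * preempt_count() layout (v5.10-ish):
	 *
	 *   PREEMPT_MASK  0x000000ff   preemption nesting
	 *   SOFTIRQ_MASK  0x0000ff00   softirq nesting / SOFTIRQ_DISABLE_OFFSET
	 *   HARDIRQ_MASK  0x000f0000   hardirq nesting
	 *   NMI_MASK      0x00f00000   NMI nesting
	 *
	 * irq_count() == preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)
	 *             == preempt_count() & 0x00ffff00
	 *
	 * On x86 preempt_count() itself masks off PREEMPT_NEED_RESCHED (bit 31),
	 * which is the "and $0x7fffffff" in the old code; with irq_count() the
	 * compiler folds both masks into the single "and $0xffff00" shown above.
	 */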


* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-07 14:16   ` Frederic Weisbecker
@ 2020-12-07 15:08     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 15:08 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Mon, Dec 07 2020 at 15:16, Frederic Weisbecker wrote:
> On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
>> +void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
>> +{
>> +	unsigned long flags;
>> +	int newcnt;
>> +
>> +	WARN_ON_ONCE(in_hardirq());
>> +
>> +	/* First entry of a task into a BH disabled section? */
>> +	if (!current->softirq_disable_cnt) {
>> +		if (preemptible()) {
>> +			local_lock(&softirq_ctrl.lock);
>> +			/* Required to meet the RCU bottomhalf requirements. */
>> +			rcu_read_lock();
>> +		} else {
>> +			DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt));
>
> So, to be clear this adds a new constraint where we can't call
> local_bh_disable() inside a preempt disabled section? I guess the rest of the
> RT code chased all the new offenders :-)

There are not that many.

>> +		}
>> +	}
>> +
>> +	/*
>> +	 * Track the per CPU softirq disabled state. On RT this is per CPU
>> +	 * state to allow preemption of bottom half disabled sections.
>> +	 */
>> +	newcnt = __this_cpu_add_return(softirq_ctrl.cnt, cnt);
>> +	/*
>> +	 * Reflect the result in the task state to prevent recursion on the
>> +	 * local lock and to make softirq_count() & al work.
>> +	 */
>> +	current->softirq_disable_cnt = newcnt;
>> +
>> +	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && newcnt == cnt) {
>> +		raw_local_irq_save(flags);
>> +		lockdep_softirqs_off(ip);
>> +		raw_local_irq_restore(flags);
>> +	}
>> +}
>> +EXPORT_SYMBOL(__local_bh_disable_ip);
>> +
> [...]
>> +
>> +void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
>> +{
>> +	bool preempt_on = preemptible();
>> +	unsigned long flags;
>> +	u32 pending;
>> +	int curcnt;
>> +
>> +	WARN_ON_ONCE(in_irq());
>> +	lockdep_assert_irqs_enabled();
>> +
>> +	local_irq_save(flags);
>> +	curcnt = this_cpu_read(softirq_ctrl.cnt);
>
> __this_cpu_read() ?

Yes.

>> +
>> +	/*
>> +	 * If this is not reenabling soft interrupts, no point in trying to
>> +	 * run pending ones.
>> +	 */
>> +	if (curcnt != cnt)
>> +		goto out;
>
> I guess you could move the local_irq_save() here?

Indeed.



* Re: [patch V2 7/9] softirq: Replace barrier() with cpu_relax() in tasklet_unlock_wait()
  2020-12-07 11:39   ` Peter Zijlstra
@ 2020-12-07 15:21     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 15:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Paul McKenney, Sebastian Andrzej Siewior

On Mon, Dec 07 2020 at 12:39, Peter Zijlstra wrote:
> On Fri, Dec 04, 2020 at 06:01:58PM +0100, Thomas Gleixner wrote:
>>  static inline void tasklet_unlock_wait(struct tasklet_struct *t)
>>  {
>> -	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
>> +	while (test_bit(TASKLET_STATE_RUN, &(t)->state))
>> +		cpu_relax();
>>  }
>
> Wouldn't it be nicer to stick a completion in tasklet_struct ? Or at the
> very least use wait_var_event() or something?

It would be nicer. Just need to audit all possible callers. That would
lose the implicit boosting of the preempted softirq thread, but these
wait paths should not be in any way relevant for user task
latencies. Emphasis on should.

Lemme stare at the callers including the ones which have it wrapped into
some other inline just because.

Thanks,

        tglx
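
For illustration, a sleeping variant along the lines suggested above could look roughly like this; it is a sketch under the assumption that every caller of tasklet_unlock_wait() may sleep, which is exactly the audit being talked about:

	void tasklet_unlock(struct tasklet_struct *t)
	{
		/* Pair the RUN bit clearing with a wakeup for sleeping waiters. */
		smp_mb__before_atomic();
		clear_bit(TASKLET_STATE_RUN, &t->state);
		smp_mb__after_atomic();
		wake_up_var(&t->state);
	}

	void tasklet_unlock_wait(struct tasklet_struct *t)
	{
		/* Sleep instead of spinning until the callback has returned. */
		wait_var_event(&t->state, !test_bit(TASKLET_STATE_RUN, &t->state));
	}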


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 14:00     ` Sebastian Andrzej Siewior
  2020-12-07 14:27       ` Peter Zijlstra
@ 2020-12-07 15:22       ` Thomas Gleixner
  2020-12-07 15:39         ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 15:22 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Paul McKenney

On Mon, Dec 07 2020 at 15:00, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 12:47:43 [+0100], Peter Zijlstra wrote:
>> On Fri, Dec 04, 2020 at 06:02:00PM +0100, Thomas Gleixner wrote:
>> > @@ -825,7 +848,20 @@ void tasklet_kill(struct tasklet_struct
>> >  
>> >  	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>> >  		do {
>> > -			yield();
>> >  		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
>> >  	}
>> >  	tasklet_unlock_wait(t);
>> 
>> 
>> Egads... should we not start by doing something like this?
>
> So we keep the RT part as-is and replace the non-RT bits with this?

No. It would work for both.


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 15:22       ` Thomas Gleixner
@ 2020-12-07 15:39         ` Sebastian Andrzej Siewior
  2020-12-07 17:49           ` Thomas Gleixner
  0 siblings, 1 reply; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-07 15:39 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Peter Zijlstra, LKML, Frederic Weisbecker, Paul McKenney

On 2020-12-07 16:22:07 [+0100], Thomas Gleixner wrote:
> On Mon, Dec 07 2020 at 15:00, Sebastian Andrzej Siewior wrote:
> > So we keep the RT part as-is and replace the non-RT bits with this?
> 
> No. It would work for both.

So instead of boosting our way through we simply wait until the tasklet
completes. Given that canceling is usually done on start/stop events, it
shouldn't matter if the RT priority is lost.

Sebastian


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 15:39         ` Sebastian Andrzej Siewior
@ 2020-12-07 17:49           ` Thomas Gleixner
  2020-12-07 17:50             ` Thomas Gleixner
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 17:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, LKML, Frederic Weisbecker, Paul McKenney

On Mon, Dec 07 2020 at 16:39, Sebastian Andrzej Siewior wrote:
> On 2020-12-07 16:22:07 [+0100], Thomas Gleixner wrote:
>> On Mon, Dec 07 2020 at 15:00, Sebastian Andrzej Siewior wrote:
>> > So we keep the RT part as-is and replace the non-RT bits with this?
>> 
>> No. It would work for both.
>
> So instead of boosting our way through we simply wait until the tasklet
> completes. Given that canceling is usually done on start/stop events, it
> shouldn't matter if the RT priority is lost.

That was my reasoning. The only thing we need to figure out is whether
there are callers on !RT which invoke that muck from non-sleepable
context.

Thanks,

        tglx


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 17:49           ` Thomas Gleixner
@ 2020-12-07 17:50             ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 17:50 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, LKML, Frederic Weisbecker, Paul McKenney

On Mon, Dec 07 2020 at 18:49, Thomas Gleixner wrote:

> On Mon, Dec 07 2020 at 16:39, Sebastian Andrzej Siewior wrote:
>> On 2020-12-07 16:22:07 [+0100], Thomas Gleixner wrote:
>>> On Mon, Dec 07 2020 at 15:00, Sebastian Andrzej Siewior wrote:
>>> > So we keep the RT part as-is and replace the non-RT bits with this?
>>> 
>>> No. It would work for both.
>>
>> So instead of boosting our way through we simply wait until the tasklet
>> completes. Given that canceling is usually done on start/stop events, it
>> shouldn't matter if the RT priority is lost.
>
> That was my reasoning. The only thing we need to figure out whether
> there are callers on !RT which invoke that muck from non-sleepable
> context.

The kill() variant is fine. It must be called from sleepable
context. The other one can be invoked everywhere except hard interrupt
context IIRC.

Thanks,

        tglx


* Re: [patch V2 9/9] tasklets: Prevent kill/unlock_wait deadlock on RT
  2020-12-07 14:27       ` Peter Zijlstra
@ 2020-12-07 17:55         ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-07 17:55 UTC (permalink / raw)
  To: Peter Zijlstra, Sebastian Andrzej Siewior
  Cc: LKML, Frederic Weisbecker, Paul McKenney

On Mon, Dec 07 2020 at 15:27, Peter Zijlstra wrote:

> On Mon, Dec 07, 2020 at 03:00:40PM +0100, Sebastian Andrzej Siewior wrote:
>> On 2020-12-07 12:47:43 [+0100], Peter Zijlstra wrote:
>> > On Fri, Dec 04, 2020 at 06:02:00PM +0100, Thomas Gleixner wrote:
>> > > @@ -825,7 +848,20 @@ void tasklet_kill(struct tasklet_struct
>> > >  
>> > >  	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>> > >  		do {
>> > > -			yield();
>> > >  		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
>> > >  	}
>> > >  	tasklet_unlock_wait(t);
>> > 
>> > 
>> > Egads... should we not start by doing something like this?
>> 
>> So we keep the RT part as-is and replace the non-RT bits with this?
>
> For RT you probably want to wrap the wait_var_event() in that
> local_bh_disable()/enable() pear.

Is that a new species local to the Netherlands? Never heard about
bh-pears before. Are they tasty?

> I just figured those unbounded spin/yield loops suck and we should get
> rid of em.

Ack.


* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-04 17:01 ` [patch V2 4/9] softirq: Make softirq control and processing RT aware Thomas Gleixner
  2020-12-07 14:16   ` Frederic Weisbecker
@ 2020-12-08  0:08   ` Frederic Weisbecker
  2020-12-09 10:11   ` Peter Zijlstra
  2020-12-09 10:34   ` Peter Zijlstra
  3 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-08  0:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
> +static void __local_bh_enable(unsigned int cnt, bool unlock)
> +{
> +	unsigned long flags;
> +	int newcnt;
> +
> +	DEBUG_LOCKS_WARN_ON(current->softirq_disable_cnt !=
> +			    this_cpu_read(softirq_ctrl.cnt));

Less important since it's debug code, but still can be __this_cpu_read().

> +
> +	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && softirq_count() == cnt) {
> +		raw_local_irq_save(flags);
> +		lockdep_softirqs_on(_RET_IP_);
> +		raw_local_irq_restore(flags);
> +	}
> +
> +	newcnt = __this_cpu_sub_return(softirq_ctrl.cnt, cnt);
> +	current->softirq_disable_cnt = newcnt;
> +
> +	if (!newcnt && unlock) {
> +		rcu_read_unlock();
> +		local_unlock(&softirq_ctrl.lock);
> +	}
> +}
> +
> +static inline bool should_wake_ksoftirqd(void)
> +{
> +	return !this_cpu_read(softirq_ctrl.cnt);

And that too.

Other than these boring details:

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks.


* Re: [patch V2 5/9] tick/sched: Prevent false positive softirq pending warnings on RT
  2020-12-04 17:01 ` [patch V2 5/9] tick/sched: Prevent false positive softirq pending warnings on RT Thomas Gleixner
@ 2020-12-08 12:23   ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2020-12-08 12:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:56PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> On RT a task which has soft interrupts disabled can block on a lock and
> schedule out to idle while soft interrupts are pending. This triggers the
> warning in the NOHZ idle code which complains about going idle with pending
> soft interrupts. But as the task is blocked, soft interrupt processing is
> temporarily blocked as well, which means that such a warning is a false
> positive.
> 
> To prevent that, check the per CPU state which indicates that a
> scheduled-out task has soft interrupts disabled.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/bottom_half.h |    6 ++++++
>  kernel/softirq.c            |   15 +++++++++++++++
>  kernel/time/tick-sched.c    |    2 +-
>  3 files changed, 22 insertions(+), 1 deletion(-)
> 
> --- a/include/linux/bottom_half.h
> +++ b/include/linux/bottom_half.h
> @@ -32,4 +32,10 @@ static inline void local_bh_enable(void)
>  	__local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>  }
>  
> +#ifdef CONFIG_PREEMPT_RT
> +extern bool local_bh_blocked(void);
> +#else
> +static inline bool local_bh_blocked(void) { return false; }
> +#endif
> +
>  #endif /* _LINUX_BH_H */
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -138,6 +138,21 @@ static DEFINE_PER_CPU(struct softirq_ctr
>  	.lock	= INIT_LOCAL_LOCK(softirq_ctrl.lock),
>  };
>  
> +/**
> + * local_bh_blocked() - Check for idle whether BH processing is blocked
> + *
> + * Returns false if the per CPU softirq::cnt is 0 otherwise true.
> + *
> + * This is invoked from the idle task to guard against false positive
> + * softirq pending warnings, which would happen when the task which holds
> + * softirq_ctrl::lock was the only running task on the CPU and blocks on
> + * some other lock.
> + */
> +bool local_bh_blocked(void)
> +{
> +	return this_cpu_read(softirq_ctrl.cnt) != 0;

__this_cpu_read()

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
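
The tick-sched.c hunk itself is not quoted in this excerpt. Going by the changelog, the idle-path warning presumably grows a local_bh_blocked() check along these lines; the surrounding code is paraphrased from kernel/time/tick-sched.c of that era and is only a sketch, not the actual diff:

	/* Roughly, in can_stop_idle_tick(): */
	if (unlikely(local_softirq_pending())) {
		static int ratelimit;

		/* On RT, don't warn when a scheduled-out task has BH disabled. */
		if (ratelimit < 10 && !local_bh_blocked() &&
		    (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
			pr_warn("NOHZ: local_softirq_pending %02x\n",
				(unsigned int)local_softirq_pending());
			ratelimit++;
		}
		return false;
	}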


* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-04 17:01 ` [patch V2 4/9] softirq: Make softirq control and processing RT aware Thomas Gleixner
  2020-12-07 14:16   ` Frederic Weisbecker
  2020-12-08  0:08   ` Frederic Weisbecker
@ 2020-12-09 10:11   ` Peter Zijlstra
  2020-12-09 12:36     ` Thomas Gleixner
  2020-12-09 10:34   ` Peter Zijlstra
  3 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-12-09 10:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior, Boqun Feng, Ingo Molnar, Will Deacon

On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Provide a local lock based serialization for soft interrupts on RT which
> allows the local_bh_disabled() sections and servicing soft interrupts to be
> preemptible.
> 
> Provide the necessary inline helpers which allow to reuse the bulk of the
> softirq processing code.

> +struct softirq_ctrl {
> +	local_lock_t	lock;
> +	int		cnt;
> +};
> +
> +static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
> +	.lock	= INIT_LOCAL_LOCK(softirq_ctrl.lock),
> +};
> +
> +void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
> +{
> +	unsigned long flags;
> +	int newcnt;
> +
> +	WARN_ON_ONCE(in_hardirq());
> +
> +	/* First entry of a task into a BH disabled section? */
> +	if (!current->softirq_disable_cnt) {
> +		if (preemptible()) {
> +			local_lock(&softirq_ctrl.lock);

AFAICT this significantly changes the locking rules.

Where previously we could do:

	spin_lock(&ponies)
	spin_lock_bh(&foo);

vs

	spin_lock_bh(&bar);
	spin_lock(&ponies)

provided the rest of the code observed: bar -> ponies -> foo
and never takes ponies from in-softirq.

This is now a genuine deadlock on RT.

Also see:

  https://lkml.kernel.org/r/X9CheYjuXWc75Spa@hirez.programming.kicks-ass.net

> +			/* Required to meet the RCU bottomhalf requirements. */
> +			rcu_read_lock();
> +		} else {
> +			DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt));
> +		}
> +	}
> +
> +	/*
> +	 * Track the per CPU softirq disabled state. On RT this is per CPU
> +	 * state to allow preemption of bottom half disabled sections.
> +	 */
> +	newcnt = __this_cpu_add_return(softirq_ctrl.cnt, cnt);
> +	/*
> +	 * Reflect the result in the task state to prevent recursion on the
> +	 * local lock and to make softirq_count() & al work.
> +	 */
> +	current->softirq_disable_cnt = newcnt;
> +
> +	if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && newcnt == cnt) {
> +		raw_local_irq_save(flags);
> +		lockdep_softirqs_off(ip);
> +		raw_local_irq_restore(flags);
> +	}
> +}
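
To spell the deadlock out: on RT every spin_lock_bh() now first takes the per-CPU softirq_ctrl.lock in local_bh_disable(), so the two orderings above can chase each other's locks. A sketch of one way this bites, assuming both tasks end up on the same CPU (lock names taken from the example above):

	/*
	 * Task A                          Task B (same CPU, both preemptible)
	 *
	 * spin_lock(&ponies);             spin_lock_bh(&bar);
	 *                                   -> local_lock(&softirq_ctrl.lock)
	 * spin_lock_bh(&foo);             spin_lock(&ponies);
	 *   -> local_lock(&softirq_ctrl.lock)  -> blocks, A holds ponies
	 *      -> blocks, B holds it
	 *
	 * A waits on B (softirq_ctrl.lock), B waits on A (ponies): ABBA.
	 * On !RT the _bh part only bumps the preempt count, so no cycle.
	 */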




* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-04 17:01 ` [patch V2 4/9] softirq: Make softirq control and processing RT aware Thomas Gleixner
                     ` (2 preceding siblings ...)
  2020-12-09 10:11   ` Peter Zijlstra
@ 2020-12-09 10:34   ` Peter Zijlstra
  3 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2020-12-09 10:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Paul McKenney, Sebastian Andrzej Siewior

On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
> +void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
> +{
> +	unsigned long flags;
> +	int newcnt;
> +
> +	WARN_ON_ONCE(in_hardirq());
> +
> +	/* First entry of a task into a BH disabled section? */
> +	if (!current->softirq_disable_cnt) {
> +		if (preemptible()) {

This reminds me of that migrate_disable() thing that didn't work, and
has similar problems.

	preempt_disable();
	local_bh_disable();
	preempt_enable();
	spin_lock(&foo);

is valid (albeit daft) code that will now malfunction.

> +			local_lock(&softirq_ctrl.lock);
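
Spelling out why that snippet malfunctions, as I read the hunk quoted above:

	preempt_disable();
	local_bh_disable();	/* First BH entry with !preemptible(): the code
				 * above skips local_lock(&softirq_ctrl.lock) and
				 * rcu_read_lock(), it only bumps the counters. */
	preempt_enable();	/* Preemptible (and migratable) again, yet still
				 * accounted as inside a BH disabled section
				 * without holding the per-CPU lock that is meant
				 * to serialize it against softirq processing. */
	spin_lock(&foo);	/* Sleeping lock on RT: the task can be scheduled
				 * out or migrate, so the later local_bh_enable()
				 * may end up poking another CPU's softirq_ctrl
				 * state. */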




* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-09 10:11   ` Peter Zijlstra
@ 2020-12-09 12:36     ` Thomas Gleixner
  2020-12-09 12:42       ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-09 12:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior, Boqun Feng, Ingo Molnar, Will Deacon

On Wed, Dec 09 2020 at 11:11, Peter Zijlstra wrote:
> On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
>> From: Thomas Gleixner <tglx@linutronix.de>
>> +	/* First entry of a task into a BH disabled section? */
>> +	if (!current->softirq_disable_cnt) {
>> +		if (preemptible()) {
>> +			local_lock(&softirq_ctrl.lock);
>
> AFAICT this significantly changes the locking rules.
>
> Where previously we could do:
>
> 	spin_lock(&ponies)
> 	spin_lock_bh(&foo);
>
> vs
>
> 	spin_lock_bh(&bar);
> 	spin_lock(&ponies)
>
> provided the rest of the code observed: bar -> ponies -> foo
> and never takes ponies from in-softirq.
>
> This is now a genuine deadlock on RT.

I know, but making this work is trying to square the circle.

Any approach we tried before going this way had worse problems than
this particular limitation.

> Also see:
>
>   https://lkml.kernel.org/r/X9CheYjuXWc75Spa@hirez.programming.kicks-ass.net

I'm aware of that, and fortunately there are not that many instances of this.

Thanks,

        tglx


* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-09 12:36     ` Thomas Gleixner
@ 2020-12-09 12:42       ` Peter Zijlstra
  2020-12-09 13:30         ` Thomas Gleixner
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-12-09 12:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior, Boqun Feng, Ingo Molnar, Will Deacon

On Wed, Dec 09, 2020 at 01:36:54PM +0100, Thomas Gleixner wrote:
> On Wed, Dec 09 2020 at 11:11, Peter Zijlstra wrote:
> > On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
> >> From: Thomas Gleixner <tglx@linutronix.de>
> >> +	/* First entry of a task into a BH disabled section? */
> >> +	if (!current->softirq_disable_cnt) {
> >> +		if (preemptible()) {
> >> +			local_lock(&softirq_ctrl.lock);
> >
> > AFAICT this significantly changes the locking rules.
> >
> > Where previously we could do:
> >
> > 	spin_lock(&ponies)
> > 	spin_lock_bh(&foo);
> >
> > vs
> >
> > 	spin_lock_bh(&bar);
> > 	spin_lock(&ponies)
> >
> > provided the rest of the code observed: bar -> ponies -> foo
> > and never takes ponies from in-softirq.
> >
> > This is now a genuine deadlock on RT.
> 
> I know, but making this work is trying to square the circle.

:-)

> Any approach we tried before going this way had worse problems than
> this particular limitation.

OK, but that would've been very good Changelog material methinks.

Also, then we should probably make sure PREEMPT_RT=n builds start
suffering the same problem by adding the local_lock unconditionally,
otherwise this keeps being a PREEMPT_RT special and we'll keep having to
fix it up.




* Re: [patch V2 4/9] softirq: Make softirq control and processing RT aware
  2020-12-09 12:42       ` Peter Zijlstra
@ 2020-12-09 13:30         ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2020-12-09 13:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Paul McKenney,
	Sebastian Andrzej Siewior, Boqun Feng, Ingo Molnar, Will Deacon

On Wed, Dec 09 2020 at 13:42, Peter Zijlstra wrote:
> On Wed, Dec 09, 2020 at 01:36:54PM +0100, Thomas Gleixner wrote:
>> On Wed, Dec 09 2020 at 11:11, Peter Zijlstra wrote:
>> > On Fri, Dec 04, 2020 at 06:01:55PM +0100, Thomas Gleixner wrote:
>> >> From: Thomas Gleixner <tglx@linutronix.de>
>> >> +	/* First entry of a task into a BH disabled section? */
>> >> +	if (!current->softirq_disable_cnt) {
>> >> +		if (preemptible()) {
>> >> +			local_lock(&softirq_ctrl.lock);
>> >
>> > AFAICT this significantly changes the locking rules.
>> >
>> > Where previously we could do:
>> >
>> > 	spin_lock(&ponies)
>> > 	spin_lock_bh(&foo);
>> >
>> > vs
>> >
>> > 	spin_lock_bh(&bar);
>> > 	spin_lock(&ponies)
>> >
>> > provided the rest of the code observed: bar -> ponies -> foo
>> > and never takes ponies from in-softirq.
>> >
>> > This is now a genuine deadlock on RT.
>> 
>> I know, but making this work is trying to square the circle.
>
> :-)
>
>> Any approach we tried before going this way had worse problems than
>> this particular limitation.
>
> OK, but that would've been very good Changelog material methinks.

Let me add that.

> Also, then we should probably make sure PREEMPT_RT=n builds start
> suffering the same problem by adding the local_lock unconditionally,
> otherwise this keeps being a PREEMPT_RT special and we'll keep having to
> fix it up.

For lockdep builds I assume. I'd like to postpone that for a while like
we postponed the rawlock nesting lockdep stuff until we have the vast
majority sorted out.

Thanks,

        tglx

