All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3.10-rt] tick-sched: fix inadvertent enabling of interrupts
@ 2014-05-16 18:10 Paul Gortmaker
  2015-02-16 16:51 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 2+ messages in thread
From: Paul Gortmaker @ 2014-05-16 18:10 UTC (permalink / raw)
  To: linux-rt-users; +Cc: Paul Gortmaker

Booting an RT kernel as kvm guest with "highres=off" results in this:

Switching to clocksource kvm-clock
 ------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2593 trace_hardirqs_on_caller+0x1dc/0x210()
DEBUG_LOCKS_WARN_ON(current->hardirq_context)
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.40-rt40_preempt-rt #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 0000000000000009 ffff88007fc03db8 ffffffff8171fb1c ffff88007fc03df0
 ffffffff8103fb31 0000000000000000 ffffffff8109c3b4 ffff88007fc0e6c0
 ffff88007647b900 ffff88007647b900 ffff88007fc03e50 ffffffff8103fbbc
Call Trace:
 <IRQ>  [<ffffffff8171fb1c>] dump_stack+0x19/0x1b
 [<ffffffff8103fb31>] warn_slowpath_common+0x61/0xa0
 [<ffffffff8109c3b4>] ? tick_check_oneshot_change+0xf4/0x120
 [<ffffffff8103fbbc>] warn_slowpath_fmt+0x4c/0x50
 [<ffffffff81092b43>] ? ktime_get+0x43/0xc0
 [<ffffffff8109f8cc>] trace_hardirqs_on_caller+0x1dc/0x210
 [<ffffffff8109f90d>] trace_hardirqs_on+0xd/0x10
 [<ffffffff8109c3b4>] tick_check_oneshot_change+0xf4/0x120  <---
 [<ffffffff8106d383>] hrtimer_run_queues+0x63/0x300
 [<ffffffff81052128>] run_local_timers+0x18/0x70
 [<ffffffff810521b7>] update_process_times+0x37/0x70
 [<ffffffff81099c1c>] tick_periodic+0x2c/0xd0
 [<ffffffff81099d50>] tick_handle_periodic+0x20/0x70
 [<ffffffff81727a26>] smp_apic_timer_interrupt+0x66/0x9b
 [<ffffffff81726c32>] apic_timer_interrupt+0x72/0x80
 <EOI>  [<ffffffff81381169>] ? memset+0x9/0xb0
 [<ffffffff81164acb>] ? address_space_init_once+0x1b/0x120
 [<ffffffff81164c34>] inode_init_once+0x64/0x80
 <...snip...>
 ------------[ cut here ]------------

If you don't have lock debugging on, you'll instead get this:

   kernel BUG at kernel/posix-cpu-timers.c:1469!

which is just a sanity check to ensure irqs are disabled in
run_posix_cpu_timers().  The debug lock splat is more informative
since it points right at tick_check_oneshot_change as the culprit
who re-enabled interrupts [which in turn has called the
tick_nohz_switch_to_nohz before returning].

This isn't a problem on mainline, since there, the call path is
clearly limited to softirq context as follows:

    run_timer_softirq
     --> hrtimer_run_pending
          --> tick_check_oneshot_change
               --> tick_nohz_switch_to_nohz

On rt, there is also only one call path, and since we know irqs
are off for it, we can delete the local_irq ops vs making them
the save/restore variants.  Insert a BUG_ON, since as above
we'll bug anyway later on if we screw this up.  And fix up
the comments to match the way -rt does things.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---

[found on 3.10-rt -- have not checked others yet, but I bet they
 have the same thing, since who disables highres normally anyway?]

 kernel/time/tick-sched.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index aa1e4b2411d5..836cf05831db 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -974,20 +974,21 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 
 /**
  * tick_nohz_switch_to_nohz - switch to nohz mode
+ * Called with interrupts disabled.
  */
 static void tick_nohz_switch_to_nohz(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
 	ktime_t next;
 
+	BUG_ON(!irqs_disabled());
+
 	if (!tick_nohz_enabled)
 		return;
 
-	local_irq_disable();
-	if (tick_switch_to_oneshot(tick_nohz_handler)) {
-		local_irq_enable();
+	if (tick_switch_to_oneshot(tick_nohz_handler))
 		return;
-	}
+
 
 	ts->nohz_mode = NOHZ_MODE_LOWRES;
 
@@ -1005,7 +1006,6 @@ static void tick_nohz_switch_to_nohz(void)
 			break;
 		next = ktime_add(next, tick_period);
 	}
-	local_irq_enable();
 }
 
 /*
@@ -1191,15 +1191,17 @@ void tick_oneshot_notify(void)
 /**
  * Check, if a change happened, which makes oneshot possible.
  *
- * Called cyclic from the hrtimer softirq (driven by the timer
- * softirq) allow_nohz signals, that we can switch into low-res nohz
- * mode, because high resolution timers are disabled (either compile
+ * Called cyclic from the hardirq (each jiffie from hrtimer_run_queues)
+ * allow_nohz signals, that we can switch into low-res nohz mode,
+ * because high resolution timers are disabled (either compile
  * or runtime).
  */
 int tick_check_oneshot_change(int allow_nohz)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
 
+	BUG_ON(!irqs_disabled());
+
 	if (!test_and_clear_bit(0, &ts->check_clocks))
 		return 0;
 
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH 3.10-rt] tick-sched: fix inadvertent enabling of interrupts
  2014-05-16 18:10 [PATCH 3.10-rt] tick-sched: fix inadvertent enabling of interrupts Paul Gortmaker
@ 2015-02-16 16:51 ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 2+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-02-16 16:51 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: linux-rt-users

* Paul Gortmaker | 2014-05-16 14:10:23 [-0400]:

>which is just a sanity check to ensure irqs are disabled in
>run_posix_cpu_timers().  The debug lock splat is more informative
>since it points right at tick_check_oneshot_change as the culprit
>who re-enabled interrupts [which in turn has called the
>tick_nohz_switch_to_nohz before returning].
>
>This isn't a problem on mainline, since there, the call path is
>clearly limited to softirq context as follows:
>
>    run_timer_softirq
>     --> hrtimer_run_pending
>          --> tick_check_oneshot_change
>               --> tick_nohz_switch_to_nohz
>
>On rt, there is also only one call path, and since we know irqs
>are off for it, we can delete the local_irq ops vs making them
>the save/restore variants.  Insert a BUG_ON, since as above
>we'll bug anyway later on if we screw this up.  And fix up
>the comments to match the way -rt does things.

I reverted one patch in v3.18 release and we are back to this path
softirq path on -RT as well. I got to think how to proceed in future.
Maybe I keep that nohz / highres check as is since it.
However I tried to boot the way you said in KVM and I did not trigger
the splat.

>Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

Sebastian

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-02-16 16:51 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-16 18:10 [PATCH 3.10-rt] tick-sched: fix inadvertent enabling of interrupts Paul Gortmaker
2015-02-16 16:51 ` Sebastian Andrzej Siewior

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.