* native_smp_send_reschedule() splat from rt_mutex_lock()?
@ 2017-09-18 16:51 Paul E. McKenney
  2017-09-20 16:24 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2017-09-18 16:51 UTC (permalink / raw)
  To: peterz, mingo; +Cc: linux-kernel, bigeasy, tglx

Hello!

Just moved ahead to v4.14-rc1, and I am seeing a native_smp_send_reschedule()
splat from rt_mutex_lock():

[11072.586518] sched: Unexpected reschedule of offline CPU#6!
[11072.587578] ------------[ cut here ]------------
[11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
[11072.591543] Modules linked in:
[11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
[11072.592572] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[11072.594602] task: ffff9928de772640 task.stack: ffff9f580031c000
[11072.596655] RIP: 0010:native_smp_send_reschedule+0x37/0x40
[11072.597599] RSP: 0000:ffff9f580031fd10 EFLAGS: 00010082
[11072.598572] RAX: 000000000000002e RBX: ffff9928dd3fd940 RCX: 0000000000000004
[11072.599693] RDX: 0000000080000004 RSI: 0000000000000086 RDI: 00000000ffffffff
[11072.601602] RBP: ffff9f580031fd10 R08: 000000000008f316 R09: 0000000000007e52
[11072.603563] R10: 0000000000000001 R11: ffffffffb957c2cd R12: 0000000000000006
[11072.604610] R13: ffff9928de772640 R14: 0000000000000061 R15: ffff9928deb991c0
[11072.606537] FS:  0000000000000000(0000) GS:ffff9928dea00000(0000) knlGS:0000000000000000
[11072.607654] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11072.608646] CR2: 0000000009728b40 CR3: 000000001b640000 CR4: 00000000000006f0
[11072.610596] Call Trace:
[11072.611531]  resched_curr+0x61/0xd0
[11072.611531]  switched_to_rt+0x8f/0xa0
[11072.612647]  rt_mutex_setprio+0x25c/0x410
[11072.613591]  task_blocks_on_rt_mutex+0x1b3/0x1f0
[11072.614601]  rt_mutex_slowlock+0xa9/0x1e0
[11072.615567]  rt_mutex_lock+0x29/0x30
[11072.615567]  rcu_boost_kthread+0x127/0x3c0
[11072.616618]  kthread+0x104/0x140
[11072.617641]  ? rcu_report_unblock_qs_rnp+0x90/0x90
[11072.618565]  ? kthread_create_on_node+0x40/0x40
[11072.619509]  ret_from_fork+0x22/0x30
[11072.620593] Code: f0 00 0f 92 c0 84 c0 74 14 48 8b 05 84 67 c5 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 70 c4 fc b7 e8 05 b3 06 00 <0f> ff 5d c3 0f 1f 44 00 00 8b 05 f2 d4 13 02 85 c0 75 38 55 48 

In theory, I could work around this by excluding CPU-hotplug operations
while doing RCU priority boosting, but in practice I am very much hoping
that there is a more reasonable solution out there...

							Thanx, Paul

* Re: native_smp_send_reschedule() splat from rt_mutex_lock()?
  2017-09-18 16:51 native_smp_send_reschedule() splat from rt_mutex_lock()? Paul E. McKenney
@ 2017-09-20 16:24 ` Sebastian Andrzej Siewior
  2017-09-20 19:44   ` Paul E. McKenney
  2017-09-21 12:41   ` Peter Zijlstra
  0 siblings, 2 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-09-20 16:24 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: peterz, mingo, linux-kernel, tglx

On 2017-09-18 09:51:10 [-0700], Paul E. McKenney wrote:
> Hello!
Hi,

> [11072.586518] sched: Unexpected reschedule of offline CPU#6!
> [11072.587578] ------------[ cut here ]------------
> [11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
> [11072.591543] Modules linked in:
> [11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
> [11072.610596] Call Trace:
> [11072.611531]  resched_curr+0x61/0xd0
> [11072.611531]  switched_to_rt+0x8f/0xa0
> [11072.612647]  rt_mutex_setprio+0x25c/0x410
> [11072.613591]  task_blocks_on_rt_mutex+0x1b3/0x1f0
> [11072.614601]  rt_mutex_slowlock+0xa9/0x1e0
> [11072.615567]  rt_mutex_lock+0x29/0x30
> [11072.615567]  rcu_boost_kthread+0x127/0x3c0

> In theory, I could work around this by excluding CPU-hotplug operations
> while doing RCU priority boosting, but in practice I am very much hoping
> that there is a more reasonable solution out there...

so in CPUHP_TEARDOWN_CPU / take_cpu_down() / __cpu_disable() the CPU is
marked as offline and interrupt handling is disabled. Later in
CPUHP_AP_SCHED_STARTING / sched_cpu_dying() all tasks are migrated away.

Did this hit a random task during a CPU-hotplug operation which was not
yet migrated away from the dying CPU? In theory a futex_unlock() of a RT
task could also produce such a backtrace.

> 							Thanx, Paul
> 

Sebastian

* Re: native_smp_send_reschedule() splat from rt_mutex_lock()?
  2017-09-20 16:24 ` Sebastian Andrzej Siewior
@ 2017-09-20 19:44   ` Paul E. McKenney
  2017-09-21 12:41   ` Peter Zijlstra
  1 sibling, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2017-09-20 19:44 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: peterz, mingo, linux-kernel, tglx

On Wed, Sep 20, 2017 at 06:24:47PM +0200, Sebastian Andrzej Siewior wrote:
> On 2017-09-18 09:51:10 [-0700], Paul E. McKenney wrote:
> > Hello!
> Hi,
> 
> > [11072.586518] sched: Unexpected reschedule of offline CPU#6!
> > [11072.587578] ------------[ cut here ]------------
> > [11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
> > [11072.591543] Modules linked in:
> > [11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
> > [11072.610596] Call Trace:
> > [11072.611531]  resched_curr+0x61/0xd0
> > [11072.611531]  switched_to_rt+0x8f/0xa0
> > [11072.612647]  rt_mutex_setprio+0x25c/0x410
> > [11072.613591]  task_blocks_on_rt_mutex+0x1b3/0x1f0
> > [11072.614601]  rt_mutex_slowlock+0xa9/0x1e0
> > [11072.615567]  rt_mutex_lock+0x29/0x30
> > [11072.615567]  rcu_boost_kthread+0x127/0x3c0
> 
> > In theory, I could work around this by excluding CPU-hotplug operations
> > while doing RCU priority boosting, but in practice I am very much hoping
> > that there is a more reasonable solution out there...
> 
> so in CPUHP_TEARDOWN_CPU / take_cpu_down() / __cpu_disable() the CPU is
> marked as offline and interrupt handling is disabled. Later in
> CPUHP_AP_SCHED_STARTING / sched_cpu_dying() all tasks are migrated away.
> 
> Did this hit a random task during a CPU-hotplug operation which was not
> yet migrated away from the dying CPU? In theory a futex_unlock() of a RT
> task could also produce such a backtrace.

It could well have.  The rcutorture test suite does frequent random
CPU-hotplug operations, so if there is a window here, rcutorture is
likely to hit it sooner rather than later.

It also injects delays at the hypervisor level, with the tests running
as guest OSes, if that helps.

What should I do to diagnose this?  I could add a WARN_ON() in the
priority-boosting path, but as far as I can see, this would be a
probabilistic thing -- I don't see a way to guarantee it because
migration could happen at pretty much any time in the PREEMPT=y case
where this happens.

							Thanx, Paul

* Re: native_smp_send_reschedule() splat from rt_mutex_lock()?
  2017-09-20 16:24 ` Sebastian Andrzej Siewior
  2017-09-20 19:44   ` Paul E. McKenney
@ 2017-09-21 12:41   ` Peter Zijlstra
  2017-09-21 13:28     ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2017-09-21 12:41 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: Paul E. McKenney, mingo, linux-kernel, tglx

On Wed, Sep 20, 2017 at 06:24:47PM +0200, Sebastian Andrzej Siewior wrote:
> On 2017-09-18 09:51:10 [-0700], Paul E. McKenney wrote:
> > Hello!
> Hi,
> 
> > [11072.586518] sched: Unexpected reschedule of offline CPU#6!
> > [11072.587578] ------------[ cut here ]------------
> > [11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
> > [11072.591543] Modules linked in:
> > [11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
> > [11072.610596] Call Trace:
> > [11072.611531]  resched_curr+0x61/0xd0
> > [11072.611531]  switched_to_rt+0x8f/0xa0
> > [11072.612647]  rt_mutex_setprio+0x25c/0x410
> > [11072.613591]  task_blocks_on_rt_mutex+0x1b3/0x1f0
> > [11072.614601]  rt_mutex_slowlock+0xa9/0x1e0
> > [11072.615567]  rt_mutex_lock+0x29/0x30
> > [11072.615567]  rcu_boost_kthread+0x127/0x3c0
> 
> > In theory, I could work around this by excluding CPU-hotplug operations
> > while doing RCU priority boosting, but in practice I am very much hoping
> > that there is a more reasonable solution out there...
> 
> so in CPUHP_TEARDOWN_CPU / take_cpu_down() / __cpu_disable() the CPU is
> marked as offline and interrupt handling is disabled. Later in
> CPUHP_AP_SCHED_STARTING / sched_cpu_dying() all tasks are migrated away.
> 
> Did this hit a random task during a CPU-hotplug operation which was not
> yet migrated away from the dying CPU? In theory a futex_unlock() of a RT
> task could also produce such a backtrace.

So this is an interrupt that got received while we were going down, and
processed after we've migrated the tasks away, right?

Should we not clear the IRQ pending masks somewhere along the line?

Other than that, there's nothing we can do to avoid this race.

* Re: native_smp_send_reschedule() splat from rt_mutex_lock()?
  2017-09-21 12:41   ` Peter Zijlstra
@ 2017-09-21 13:28     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-09-21 13:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, mingo, linux-kernel, tglx

On 2017-09-21 14:41:05 [+0200], Peter Zijlstra wrote:
> On Wed, Sep 20, 2017 at 06:24:47PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2017-09-18 09:51:10 [-0700], Paul E. McKenney wrote:
> > > Hello!
> > Hi,
> > 
> > > [11072.586518] sched: Unexpected reschedule of offline CPU#6!
> > > [11072.587578] ------------[ cut here ]------------
> > > [11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
> > > [11072.591543] Modules linked in:
> > > [11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
> > > [11072.610596] Call Trace:
> > > [11072.611531]  resched_curr+0x61/0xd0
> > > [11072.611531]  switched_to_rt+0x8f/0xa0
> > > [11072.612647]  rt_mutex_setprio+0x25c/0x410
> > > [11072.613591]  task_blocks_on_rt_mutex+0x1b3/0x1f0
> > > [11072.614601]  rt_mutex_slowlock+0xa9/0x1e0
> > > [11072.615567]  rt_mutex_lock+0x29/0x30
> > > [11072.615567]  rcu_boost_kthread+0x127/0x3c0
> > 
> > > In theory, I could work around this by excluding CPU-hotplug operations
> > > while doing RCU priority boosting, but in practice I am very much hoping
> > > that there is a more reasonable solution out there...
> > 
> > so in CPUHP_TEARDOWN_CPU / take_cpu_down() / __cpu_disable() the CPU is
> > marked as offline and interrupt handling is disabled. Later in
> > CPUHP_AP_SCHED_STARTING / sched_cpu_dying() all tasks are migrated away.
> > 
> > Did this hit a random task during a CPU-hotplug operation which was not
> > yet migrated away from the dying CPU? In theory a futex_unlock() of a RT
> > task could also produce such a backtrace.
> 
> So this is an interrupt that got received while we were going down, and
> processed after we've migrated the tasks away, right?

No, I don't think so. A random CPU sent an IPI to an offline, not-yet-dead
CPU, and that IPI got lost. However, before the CPU went dead it migrated
all tasks to another CPU, and I *think* that since the task was runnable
it got onto a CPU soon afterwards.

> Should we not clear the IRQ pending masks somewhere along the line?

No, I think we are good.

> Other than that, there's nothing we can do to avoid this race.

We could avoid sending the IPI. Migrating the task (instead of sending the
IPI in this case) is probably too much, since it will be migrated soon
anyway (at least in this scenario). So maybe we could just remove that
warning _or_ add something to cpu.c that checks for "hotplug-state <
CPUHP_AP_SCHED_STARTING" together with cpu_offline() and warns only in
that case.
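
That conditional variant might look roughly like the sketch below. This is
purely illustrative, not a tested patch: cpuhp_get_state() is a hypothetical
accessor for the target CPU's hotplug state, and the exact comparison against
CPUHP_AP_SCHED_STARTING is an assumption based on the teardown ordering
described above.

```c
/*
 * Illustrative sketch only: suppress the warning for an offline CPU
 * that has not yet passed sched_cpu_dying(), since its tasks will be
 * migrated away shortly anyway.  cpuhp_get_state() is hypothetical.
 */
static void native_smp_send_reschedule(int cpu)
{
	if (unlikely(cpu_is_offline(cpu))) {
		/* Warn only if the CPU is already past task migration. */
		WARN_ON(cpuhp_get_state(cpu) < CPUHP_AP_SCHED_STARTING);
		return;
	}
	apic->send_IPI(cpu, RESCHEDULE_VECTOR);
}
```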

This is what the removal would look like:

diff --git a/arch/m32r/kernel/smp.c b/arch/m32r/kernel/smp.c
index 564052e3d3a0..2df5373063ae 100644
--- a/arch/m32r/kernel/smp.c
+++ b/arch/m32r/kernel/smp.c
@@ -103,7 +103,6 @@ static void send_IPI_mask(const struct cpumask *, int, int);
  *==========================================================================*/
 void smp_send_reschedule(int cpu_id)
 {
-	WARN_ON(cpu_is_offline(cpu_id));
 	send_IPI_mask(cpumask_of(cpu_id), RESCHEDULE_IPI, 1);
 }
 
diff --git a/arch/tile/kernel/smp.c b/arch/tile/kernel/smp.c
index 94a62e1197ce..8234e3c04d50 100644
--- a/arch/tile/kernel/smp.c
+++ b/arch/tile/kernel/smp.c
@@ -260,8 +260,6 @@ void __init ipi_init(void)
 
 void smp_send_reschedule(int cpu)
 {
-	WARN_ON(cpu_is_offline(cpu));
-
 	/*
 	 * We just want to do an MMIO store.  The traditional writeq()
 	 * functions aren't really correct here, since they're always
@@ -277,8 +275,6 @@ void smp_send_reschedule(int cpu)
 {
 	HV_Coord coord;
 
-	WARN_ON(cpu_is_offline(cpu));
-
 	coord.y = cpu_y(cpu);
 	coord.x = cpu_x(cpu);
 	hv_trigger_ipi(coord, IRQ_RESCHEDULE);
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d3c66a15bbde..31493748dd2d 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -123,10 +123,6 @@ static bool smp_no_nmi_ipi = false;
  */
 static void native_smp_send_reschedule(int cpu)
 {
-	if (unlikely(cpu_is_offline(cpu))) {
-		WARN_ON(1);
-		return;
-	}
 	apic->send_IPI(cpu, RESCHEDULE_VECTOR);
 }
 

Sebastian
