* WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0() on v3.10-rc3
From: Konrad Rzeszutek Wilk @ 2013-05-30 18:29 UTC
To: tglx, linux-kernel, xen-devel
Hello,
I have not yet done a full git bisection, but this appears to be new in
v3.10: I did not see it in v3.9. I think I saw it in v3.10-rc1 but never
got to look at it in depth.
Either way on a PV guest, if I do:
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online
I get this fat warning:
[ 39.946760] Broke affinity for irq 16
[ 40.071242] installing Xen timer for CPU 1
[ 40.076109] cpu 1 spinlock event irq 48
[ 40.081207] ------------[ cut here ]------------
[ 40.085841] WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0()
[ 40.095970] Modules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c i915 radeon sg sd_mod mperf skge e1000 sata_nv nouveau ata_generic libata fbcon tileblit font bitblit scsi_mod ttm softcursor drm_kms_helper mxm_wmi video wmi xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd
[ 40.130893] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc3upstream-00068-gdcdbe33 #1
[ 40.139212] Hardware name: BIOSTAR Group N61PB-M2S/N61PB-M2S, BIOS 6.00 PG 09/03/2009
[ 40.147098] ffffffff8193b448 ffff880039da5e60 ffffffff816707c8 ffff880039da5ea0
[ 40.154550] ffffffff8108ce8b ffff880039da4010 ffff88003fa8e500 ffff880039da4010
[ 40.162004] 0000000000000001 ffff880039da4000 ffff880039da4010 ffff880039da5eb0
[ 40.169458] Call Trace:
[ 40.171977] [<ffffffff816707c8>] dump_stack+0x19/0x1b
[ 40.177175] [<ffffffff8108ce8b>] warn_slowpath_common+0x6b/0xa0
[ 40.183243] [<ffffffff8108ced5>] warn_slowpath_null+0x15/0x20
[ 40.189134] [<ffffffff810e4745>] tick_nohz_idle_exit+0x195/0x1b0
[ 40.195287] [<ffffffff810da755>] cpu_startup_entry+0x205/0x250
[ 40.201268] [<ffffffff81661070>] cpu_bringup_and_idle+0x13/0x15
[ 40.207332] ---[ end trace 915c8c486004dda1 ]---
which I presume is because the code does not expect to run _after_ the CPU
has been offlined. However, under the PV code, a CPU that has been offlined
can resume execution (if it is onlined again). If you look at:
static void __cpuinit xen_play_dead(void) /* used only with HOTPLUG_CPU */
{
	play_dead_common();
	HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
	cpu_bringup();
}
That is called right after the CPU is put to sleep, and the VCPUOP_down
hypercall blocks until the CPU is brought back up. At that point we end up
calling cpu_bringup(), which sets up the clockevents, timers, etc.
I am wondering if part of this is that ts->inidle gets reset because we
end up resetting all the timers, but then when xen_play_dead() returns,
it ends up right back in cpu_idle_loop() and we call
tick_nohz_idle_exit().
Thoughts?
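
The suspected sequence can be sketched in user space. This is only a rough model under stated assumptions, not the kernel code: struct tick_sched_model and every model_* function are hypothetical stand-ins, and the assumption that the CPU-down cleanup zeroes the whole per-CPU NOHZ state is taken from the hypothesis above rather than verified against the source.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-CPU NOHZ state (the kernel's struct tick_sched). */
struct tick_sched_model {
	int inidle;	/* nonzero while the CPU is inside the idle loop */
};

/* Counts how often the model's sanity check fired. */
static int model_warnings;

/* Models tick_nohz_idle_enter(): mark the CPU as being in idle. */
static void model_idle_enter(struct tick_sched_model *ts)
{
	ts->inidle = 1;
}

/* Models tick_nohz_idle_exit(): leaving idle without a prior enter is flagged. */
static void model_idle_exit(struct tick_sched_model *ts)
{
	if (!ts->inidle) {
		model_warnings++;
		fprintf(stderr, "WARNING: idle exit while inidle is clear\n");
	}
	ts->inidle = 0;
}

/* Assumption: taking the CPU down wipes all of its NOHZ state. */
static void model_cpu_down_cleanup(struct tick_sched_model *ts)
{
	memset(ts, 0, sizeof(*ts));
}

/*
 * One offline/online cycle as seen by the idle loop when play_dead()
 * returns instead of never coming back. Returns the number of warnings.
 */
int model_offline_then_online(void)
{
	struct tick_sched_model ts = { 0 };

	model_warnings = 0;
	model_idle_enter(&ts);		/* idle loop entered */
	model_cpu_down_cleanup(&ts);	/* CPU offlined, state wiped */
	/* VCPUOP_down would block here until the CPU is onlined again. */
	model_idle_exit(&ts);		/* back in the idle loop, exit path runs */
	return model_warnings;
}
```

In this model the exit path runs with inidle cleared, so the check fires once per offline/online cycle, which matches the single warning in the log above.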
* Re: WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0() on v3.10-rc3
From: Thomas Gleixner @ 2013-05-30 20:05 UTC
To: Konrad Rzeszutek Wilk; +Cc: linux-kernel, xen-devel
On Thu, 30 May 2013, Konrad Rzeszutek Wilk wrote:
> [ 40.085841] WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0()
>
> which I presume is because the code does not expect to run _after_ the CPU
> has been offlined. However, under the PV code, a CPU that has been offlined
> can resume execution (if it is onlined again). If you look at:
>
> static void __cpuinit xen_play_dead(void) /* used only with HOTPLUG_CPU */
> {
> 	play_dead_common();
> 	HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
> 	cpu_bringup();
> }
>
> That is called right after the CPU is put to sleep, and the VCPUOP_down
> hypercall blocks until the CPU is brought back up. At that point we end up
> calling cpu_bringup(), which sets up the clockevents, timers, etc.
>
> I am wondering if part of this is that ts->inidle gets reset because we
> end up resetting all the timers, but then when xen_play_dead() returns,
> it ends up right back in cpu_idle_loop() and we call
> tick_nohz_idle_exit().
>
> Thoughts?
cpu_dead() is definitely not expected to return after the cpu has been
declared dead. I should have put a big fat warning into the generic
idle loop for this :)
The reason why you get that warning only now is commit 4b0c0f294
(tick: Cleanup NOHZ per cpu data on cpu down), which is btw. targeted
for stable as well.
We can't revert the above commit as it fixes a long-standing
nastiness, so for now, until I get around to making the idle loop return
on cpu down, you probably need to call tick_nohz_idle_enter() before
returning from play_dead().
Thanks,
tglx
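
The suggested workaround could look roughly like the sketch below. It is not a tested patch: the function body is the xen_play_dead() quoted in this thread with one call appended (the __cpuinit annotation is dropped), and everything above it is a stub added purely so the fragment compiles outside the kernel tree. The stub bodies, the idle_enter_calls counter, and the numeric value given to VCPUOP_down are illustration-only assumptions.

```c
#include <stddef.h>

/* --- Stubs, present only so the sketch builds in user space --- */
#define VCPUOP_down 2			/* placeholder value for the stub */

static int idle_enter_calls;		/* hypothetical counter for illustration */

static void play_dead_common(void) { }
static void cpu_bringup(void) { }
static int smp_processor_id(void) { return 1; }

static int HYPERVISOR_vcpu_op(int cmd, int vcpuid, void *extra)
{
	/* The real hypercall blocks until the vCPU is brought up again. */
	(void)cmd;
	(void)vcpuid;
	(void)extra;
	return 0;
}

static void tick_nohz_idle_enter(void)
{
	idle_enter_calls++;
}
/* --- End of stubs --- */

void xen_play_dead(void) /* used only with HOTPLUG_CPU */
{
	play_dead_common();
	HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
	cpu_bringup();
	/*
	 * The NOHZ per-cpu data was cleaned up when this CPU went down,
	 * but we are about to return into the idle loop, which will call
	 * tick_nohz_idle_exit(). Re-enter NOHZ idle first so that call
	 * finds the state it expects.
	 */
	tick_nohz_idle_enter();
}
```

The extra call goes after cpu_bringup() on the assumption that the clockevents and timers need to be set up again before the NOHZ idle state is re-entered.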
* Re: WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0() on v3.10-rc3
From: Konrad Rzeszutek Wilk @ 2013-06-03 13:42 UTC
To: Thomas Gleixner; +Cc: linux-kernel, xen-devel
On Thu, May 30, 2013 at 10:05:46PM +0200, Thomas Gleixner wrote:
> On Thu, 30 May 2013, Konrad Rzeszutek Wilk wrote:
> > [ 40.085841] WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0()
> >
> > which I presume is because the code does not expect to run _after_ the CPU
> > has been offlined. However, under the PV code, a CPU that has been offlined
> > can resume execution (if it is onlined again). If you look at:
> >
> > static void __cpuinit xen_play_dead(void) /* used only with HOTPLUG_CPU */
> > {
> > 	play_dead_common();
> > 	HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
> > 	cpu_bringup();
> > }
> >
> > That is called right after the CPU is put to sleep, and the VCPUOP_down
> > hypercall blocks until the CPU is brought back up. At that point we end up
> > calling cpu_bringup(), which sets up the clockevents, timers, etc.
> >
> > I am wondering if part of this is that ts->inidle gets reset because we
> > end up resetting all the timers, but then when xen_play_dead() returns,
> > it ends up right back in cpu_idle_loop() and we call
> > tick_nohz_idle_exit().
> >
> > Thoughts?
>
> cpu_dead() is definitely not expected to return after the cpu has been
> declared dead. I should have put a big fat warning into the generic
> idle loop for this :)
>
> The reason why you get that warning only now is commit 4b0c0f294
> (tick: Cleanup NOHZ per cpu data on cpu down), which is btw. targeted
> for stable as well.
Ah, that would explain it. Thanks!
>
> We can't revert the above commit as it fixes a long-standing
> nastiness, so for now, until I get around to making the idle loop return
> on cpu down, you probably need to call tick_nohz_idle_enter() before
> returning from play_dead().
OK. Could you keep me in mind when you do that cleanup and CC me? Thank you.