* Sun fire T2000,BUG: NMI Watchdog detected LOCKUP on CPU12
@ 2009-05-13  0:00 Denys Fedoryschenko
  2009-05-13  5:01 ` David Miller
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Denys Fedoryschenko @ 2009-05-13  0:00 UTC
  To: sparclinux

Hi

I have recently been getting lockups on a T2000 machine. The machine is relatively new.
While compiling with crazy makeflags "-j17", I get these messages.
The process that was compiling is stuck, and I cannot terminate it.

Please CC me on replies.
Do I need to file a Bugzilla report?

Full dmesg: http://www.nuclearcat.com/files/t2000-dmesg-1.txt
Kernel config: http://www.nuclearcat.com/files/t2000-config-1.txt

Here is the relevant part (an unwrapped version is available in the dmesg):
[110645.897910] BUG: NMI Watchdog detected LOCKUP on CPU12, ip 0046630c, registers:
[110645.898120] TSTATE: 0000004480e01607 TPC: 000000000046630c TNPC: 0000000000466310 Y: 00000000    Not tainted
[110645.898280] TPC: <run_timer_softirq+0x1a4/0x1d0>
[110645.898379] g0: 000000000000000c g1: fffff801ffc2fe60 g2: fffff801fc684138 g3: fffff801ffc2fe60
[110645.898611] g4: fffff801fc662e00 g5: fffff801ff494000 g6: fffff801fc6a0000 g7: fffff801fc684138
[110645.898697] o0: 0000000000000003 o1: fffff801ffc2fd90 o2: 0000000104514312 o3: 0000000000000100
[110645.898783] o4: 0000000000000080 o5: fffff801ffc2fe60 sp: fffff801ffc2f5b1 ret_pc: 00000000004661c0
[110645.898971] RPC: <run_timer_softirq+0x58/0x1d0>
[110645.899163] l0: fffff801fc684000 l1: 0000000000000012 l2: 000000000046d358 l3: 000000000000000a
[110645.899249] l4: 000000000000000c l5: 0000000000000000 l6: fffff801fc6a0000 l7: 0000000080001002
[110645.899480] i0: 000000000082d5c8 i1: fffff801ffd02010 i2: 0000000000000100 i3: 00000000f025c8a0
[110645.899575] i4: 00000000fef42760 i5: 00000000fef417f8 i6: fffff801ffc2f681 i7: 0000000000461ca4
[110645.899762] I7: <__do_softirq+0x4c/0x110>
[110645.899902] Call Trace:
[110645.900001]  [00000000004209f4] tl0_irq15+0x14/0x20
[110645.900113]  [000000000046630c] run_timer_softirq+0x1a4/0x1d0
[110645.900324]  [0000000000461ca4] __do_softirq+0x4c/0x110
[110645.900396]  [000000000042a750] do_softirq+0x60/0x8c
[110645.900505]  [00000000004617c0] irq_exit+0x3c/0x94
[110645.900610]  [000000000042e694] timer_interrupt+0xa0/0xbc
[110645.900820]  [00000000004209d4] tl0_irq14+0x14/0x20
[110645.900883]  [000000000042bc5c] cpu_idle+0xb0/0x148
[110645.901050]  [00000000006face0] after_lock_tlb+0x1b4/0x1cc
[110645.901256]  [0000000000000000] (null)
[111203.146591] INFO: task emerge:9830 blocked for more than 480 seconds.
[111203.146775] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[111203.146906] emerge        D 00000000006fdaf0     0  9830   9806
[111203.147132] Call Trace:
[111203.147279]  [00000000006fe7c0] schedule_timeout+0x18/0xa4
[111203.147390]  [00000000006fdaf0] wait_for_common+0xbc/0x148
[111203.147486]  [000000000046cd58] flush_cpu_workqueue+0x74/0x88
[111203.147672]  [000000000046cde0] flush_workqueue+0x34/0x60
[111203.147912]  [00000000005988a0] tty_ldisc_release+0x30/0x1e0
[111203.147978]  [0000000000592efc] tty_release_dev+0x400/0x430
[111203.148138]  [0000000000592f38] tty_release+0xc/0x24
[111203.148372]  [00000000004c4dd8] __fput+0xdc/0x1a4
[111203.148430]  [00000000004c2694] filp_close+0x68/0x7c
[111203.148620]  [00000000004c2718] SyS_close+0x70/0xb4
[111203.148692]  [0000000000406214] linux_sparc_syscall32+0x34/0x40
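
For anyone who wants to pick reports like the two above out of a long dmesg, here is a minimal Python sketch. The regular expressions are written only against the two message formats shown in this report and are an assumption, not a stable kernel interface; the names `LOCKUP_RE`, `HUNG_RE`, and `scan` are hypothetical.

```python
import re

# Patterns modelled on the two headline messages in this report (assumed formats).
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: NMI Watchdog detected LOCKUP "
    r"on CPU(?P<cpu>\d+), ip (?P<ip>[0-9a-f]+)"
)
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def scan(lines):
    """Return (kind, timestamp, ...) tuples for lockup and hung-task lines."""
    events = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            events.append(("lockup", float(m["ts"]), int(m["cpu"]), m["ip"]))
            continue
        m = HUNG_RE.search(line)
        if m:
            events.append(("hung_task", float(m["ts"]), m["comm"], int(m["pid"])))
    return events


# Two lines copied from the log in this message.
SAMPLE = [
    "[110645.897910] BUG: NMI Watchdog detected LOCKUP on CPU12, ip 0046630c, registers:",
    "[111203.146591] INFO: task emerge:9830 blocked for more than 480 seconds.",
]

events = scan(SAMPLE)
for event in events:
    print(event)
```

To run it against a saved log, the same function could be fed the lines of the file, for example `scan(open("t2000-dmesg-1.txt"))`, assuming the dmesg linked above has been downloaded under that name.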



* Re: Sun fire T2000,BUG: NMI Watchdog detected LOCKUP on CPU12
From: David Miller @ 2009-05-13  5:01 UTC
  To: sparclinux

From: Denys Fedoryschenko <denys@visp.net.lb>
Date: Wed, 13 May 2009 03:00:11 +0300

> Hi
> 
> I have recently been getting lockups on a T2000 machine. The machine is relatively new.
> While compiling with crazy makeflags "-j17", I get these messages.
> The process that was compiling is stuck, and I cannot terminate it.

That's not "crazy", I use -j32 as a minimum for my builds. :-)

I started seeing these hangs with 2.6.30-rcX and will try to
bisect it down when I get back from my week long cruise.

It's definitely a post 2.6.29 regression.



* Re: Sun fire T2000,BUG: NMI Watchdog detected LOCKUP on CPU12
From: Denys Fedoryschenko @ 2009-05-13  8:01 UTC
  To: sparclinux

On Wednesday 13 May 2009 08:01:46 David Miller wrote:
> From: Denys Fedoryschenko <denys@visp.net.lb>
> Date: Wed, 13 May 2009 03:00:11 +0300
>
> > Hi
> >
> > I have recently been getting lockups on a T2000 machine. The machine is relatively new.
> > While compiling with crazy makeflags "-j17", I get these messages.
> > The process that was compiling is stuck, and I cannot terminate it.
>
> That's not "crazy", I use -j32 as a minimum for my builds. :-)
>
> I started seeing these hangs with 2.6.30-rcX and will try to
> bisect it down when I get back from my week long cruise.
>
> It's definitely a post 2.6.29 regression.

Well, buying this T2000 was a management mistake, but it is kind of useful
for porting my "heavily bitfielded" userspace app to big-endian machines. The
only other choices in this country are a PS3 or emulators :-(

This expensive Sun box is also terribly slow. I don't know whether it is a bug
in the software/kernel or in the hardware, but it ranges from latency while
compiling a single 100 KB source file up to the time it takes to run
make -j17 on the Linux kernel source. For that price on x86, I would see bytes
popping out of the server like popcorn and get dizzy from the speed.
But Linux support for both seems kind of weak; I will try to file
bug reports/patches.
The PS3 also had problems with WPA2 + DHCP. Recent commits changing the
maintainers show that Sony people are handling this (not only for games), so
maybe it can be fixed.

Sorry for the off-topic :-) Now back to the main subject...

I'm not sure, but I may have had something similar even on 2.6.28.7.
I forgot to mention the kernel where the hangs occur, but it is visible in
the dmesg: 2.6.29.3. Meanwhile, I will try falling back to older kernels and
bisecting it myself, if I don't grow old while compiling the kernel.




* Re: Sun fire T2000,BUG: NMI Watchdog detected LOCKUP on CPU12
From: David Miller @ 2009-05-14  3:32 UTC
  To: sparclinux


The individual CPUs of the Niagara-T1 are incredibly slow.

Each is roughly the equivalent of a 300MHz UltraSPARC-II.

The gain is that you have 32 of them.

Set your expectations appropriately.


* Re: Sun fire T2000,BUG: NMI Watchdog detected LOCKUP on CPU12
From: David Miller @ 2009-05-14  3:35 UTC
  To: sparclinux

From: Denys Fedoryschenko <denys@visp.net.lb>
Date: Wed, 13 May 2009 11:01:35 +0300

> I'm not sure, but I may have had something similar even on 2.6.28.7.
> I forgot to mention the kernel where the hangs occur, but it is visible
> in the dmesg: 2.6.29.3. Meanwhile, I will try falling back to older
> kernels and bisecting it myself, if I don't grow old while compiling
> the kernel.

There must be something wrong with your machine if this happens.

arch/sparc/configs/sparc64_defconfig builds in about 5 or 6 minutes
at "-j32" on my Niagara-T1 based boxes.

My Niagara-T2 box at "-j64" can build that in just over a minute.
An allmodconfig takes ~22 minutes, and almost half of that is the
single-threaded work of linking the final vmlinux image and running
modpost on thousands of modules.
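
As a rough illustration of why the single-threaded link and modpost stages dominate at high -j values, here is a minimal Python sketch. It assumes that "almost half" means exactly 0.5 of the ~22-minute wall time and that the remaining work scales perfectly with the job count; both are simplifications, and the function and variable names are made up for this example. None of the printed numbers are measurements.

```python
def wall_minutes(serial_min: float, parallel_work_min: float, jobs: int) -> float:
    """Wall time when the serial part runs alone and the rest splits evenly over `jobs`."""
    return serial_min + parallel_work_min / jobs


# Figures taken from the message above; the 0.5 share is an assumed rounding
# of "almost half".
observed_wall_min = 22.0
serial_share = 0.5
observed_jobs = 64

# Split the observed wall time into its serial and parallel parts.
serial_min = observed_wall_min * serial_share
# Total parallelizable work, expressed in single-CPU minutes.
parallel_work_min = (observed_wall_min - serial_min) * observed_jobs

for jobs in (1, 32, 64, 128):
    print(f"-j{jobs}: {wall_minutes(serial_min, parallel_work_min, jobs):.1f} min")
```

Under these assumptions, doubling the job count from 64 to 128 would only bring the build from 22 to 16.5 minutes, and no job count can get below the 11-minute serial part.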

