* 2.6.39: crash w/threadirqs option enabled
@ 2011-05-20  8:39 Justin Piszcz
  2011-05-20 12:48 ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Justin Piszcz @ 2011-05-20  8:39 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Piszcz

Hi,

I tried this in 2.6.39 and experienced crashes for the first time, when
using: threadirqs

user     pts/1                         Fri May 20 04:31   still logged in 
reboot   system boot  2.6.39           Fri May 20 04:28 - 04:36  (00:07) 
user     pts/2                         Thu May 19 20:01 - down   (08:24) 
reboot   system boot  2.6.39           Thu May 19 20:01 - 04:26  (08:25) 
user     pts/28                        Thu May 19 15:57 - crash  (04:03)

Not sure I can give any useful output as there is nothing in the logs, 
even with netconsole enabled, no useful output was produced.

I've removed the threadirqs option for now, will see if there are any
further stability issues.

Justin.


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20  8:39 2.6.39: crash w/threadirqs option enabled Justin Piszcz
@ 2011-05-20 12:48 ` Thomas Gleixner
  2011-05-20 12:51   ` Justin Piszcz
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2011-05-20 12:48 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, Alan Piszcz

On Fri, 20 May 2011, Justin Piszcz wrote:

> Hi,
> 
> I tried this in 2.6.39 and experienced crashes for the first time, when
> using: threadirqs
> 
> user     pts/1                         Fri May 20 04:31   still logged in
> reboot   system boot  2.6.39           Fri May 20 04:28 - 04:36  (00:07)
> user     pts/2                         Thu May 19 20:01 - down   (08:24)
> reboot   system boot  2.6.39           Thu May 19 20:01 - 04:26  (08:25)
> user     pts/28                        Thu May 19 15:57 - crash  (04:03)
> 
> Not sure I can give any useful output as there is nothing in the logs, even
> with netconsole enabled, no useful output was produced.

Hmm. Machine lacks serial console, right? Can you send me your .config
please?

Thanks,

	tglx


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 12:48 ` Thomas Gleixner
@ 2011-05-20 12:51   ` Justin Piszcz
  2011-05-20 13:30     ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Justin Piszcz @ 2011-05-20 12:51 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, Alan Piszcz



On Fri, 20 May 2011, Thomas Gleixner wrote:

> On Fri, 20 May 2011, Justin Piszcz wrote:
>
>> Hi,
>>
>> I tried this in 2.6.39 and experienced crashes for the first time, when
>> using: threadirqs
>>
>> user     pts/1                         Fri May 20 04:31   still logged in
>> reboot   system boot  2.6.39           Fri May 20 04:28 - 04:36  (00:07)
>> user     pts/2                         Thu May 19 20:01 - down   (08:24)
>> reboot   system boot  2.6.39           Thu May 19 20:01 - 04:26  (08:25)
>> user     pts/28                        Thu May 19 15:57 - crash  (04:03)
>>
>> Not sure I can give any useful output as there is nothing in the logs, even
>> with netconsole enabled, no useful output was produced.
>
> Hmm. Machine lacks serial console, right? Can you send me your .config
> please?
>
> Thanks,
>
> 	tglx
>

Hi,

Correct, no serial port or header, config:
http://home.comcast.net/~jpiszcz/20110520/config-2.6.39-3.txt

Justin.



* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 12:51   ` Justin Piszcz
@ 2011-05-20 13:30     ` Thomas Gleixner
  2011-05-20 13:49       ` Justin Piszcz
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2011-05-20 13:30 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, Alan Piszcz

On Fri, 20 May 2011, Justin Piszcz wrote:
> On Fri, 20 May 2011, Thomas Gleixner wrote:
> 
> Correct, no serial port or header, config:
> http://home.comcast.net/~jpiszcz/20110520/config-2.6.39-3.txt

Does it crash right away or just when doing something particular? Is
the box fully dead after the crash ?

Thanks,

	tglx


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 13:30     ` Thomas Gleixner
@ 2011-05-20 13:49       ` Justin Piszcz
  2011-05-20 15:17         ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Justin Piszcz @ 2011-05-20 13:49 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, Alan Piszcz



On Fri, 20 May 2011, Thomas Gleixner wrote:

> On Fri, 20 May 2011, Justin Piszcz wrote:
>> On Fri, 20 May 2011, Thomas Gleixner wrote:
>>
>> Correct, no serial port or header, config:
>> http://home.comcast.net/~jpiszcz/20110520/config-2.6.39-3.txt
>

Hello Thomas,

> Does it crash right away or just when doing something particular?
It crashed at 2100, this is when I run a few I/O intensive processes:
- backup (dump ext4 filesystem -> to a separate raid device)
- backup (dump ext4 on remote host -> to separate raid device)
- backup (dump xfs on remote host -> to separate raid device)

This looks like it is what caused it to crash.

> Is the box fully dead after the crash ?
The host was online and I went away for awhile, when I came back the system
had rebooted on its own (as I lost all of my X windows/etc).

Justin.



* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 13:49       ` Justin Piszcz
@ 2011-05-20 15:17         ` Thomas Gleixner
  2011-05-20 16:10           ` Justin Piszcz
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2011-05-20 15:17 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, Alan Piszcz

On Fri, 20 May 2011, Justin Piszcz wrote:
> On Fri, 20 May 2011, Thomas Gleixner wrote:
> > Does it crash right away or just when doing something particular?
> It crashed at 2100, this is when I run a few I/O intensive processes:
> - backup (dump ext4 filesystem -> to a separate raid device)
> - backup (dump ext4 on remote host -> to separate raid device)
> - backup (dump xfs on remote host -> to separate raid device)
> 
> This looks like it is what caused it to crash.

That narrows it down somewhat, but does not give us a clue at all :(
 
> > Is the box fully dead after the crash ?
> The host was online and I went away for awhile, when I came back the system
> had rebooted on its own (as I lost all of my X windows/etc).

Hmm. Did you have panic_timeout set ?

Thanks,

	tglx


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 15:17         ` Thomas Gleixner
@ 2011-05-20 16:10           ` Justin Piszcz
  2011-05-20 16:22             ` Thomas Gleixner
  2011-05-26 16:30             ` 2.6.39: crash w/threadirqs option enabled Justin Piszcz
  0 siblings, 2 replies; 17+ messages in thread
From: Justin Piszcz @ 2011-05-20 16:10 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, Alan Piszcz



On Fri, 20 May 2011, Thomas Gleixner wrote:

> On Fri, 20 May 2011, Justin Piszcz wrote:
>> On Fri, 20 May 2011, Thomas Gleixner wrote:
>>> Does it crash right away or just when doing something particular?
>> It crashed at 2100, this is when I run a few I/O intensive processes:
>> - backup (dump ext4 filesystem -> to a separate raid device)
>> - backup (dump ext4 on remote host -> to separate raid device)
>> - backup (dump xfs on remote host -> to separate raid device)
>>
>> This looks like it is what caused it to crash.
>
> That narrows it down somewhat, but does not give us a clue at all :(
>
>>> Is the box fully dead after the crash ?
>> The host was online and I went away for awhile, when I came back the system
>> had rebooted on its own (as I lost all of my X windows/etc).
>
> Hmm. Did you have panic_timeout set ?

Hi,

No, I do not use panic_timeout or any type of watchdog that would reboot
the system upon a lockup/crash.

Justin.



* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 16:10           ` Justin Piszcz
@ 2011-05-20 16:22             ` Thomas Gleixner
  2011-05-21  0:00               ` Uwaysi Bin Kareem
                                 ` (2 more replies)
  2011-05-26 16:30             ` 2.6.39: crash w/threadirqs option enabled Justin Piszcz
  1 sibling, 3 replies; 17+ messages in thread
From: Thomas Gleixner @ 2011-05-20 16:22 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra

On Fri, 20 May 2011, Justin Piszcz wrote:
> On Fri, 20 May 2011, Thomas Gleixner wrote:
> 
> > On Fri, 20 May 2011, Justin Piszcz wrote:
> > > On Fri, 20 May 2011, Thomas Gleixner wrote:
> > > > Does it crash right away or just when doing something particular?
> > > It crashed at 2100, this is when I run a few I/O intensive processes:
> > > - backup (dump ext4 filesystem -> to a separate raid device)
> > > - backup (dump ext4 on remote host -> to separate raid device)
> > > - backup (dump xfs on remote host -> to separate raid device)
> > > 
> > > This looks like it is what caused it to crash.
> > 
> > That narrows it down somewhat, but does not give us a clue at all :(
> > 
> > > > Is the box fully dead after the crash ?
> > > The host was online and I went away for awhile, when I came back the
> > > system
> > > had rebooted on its own (as I lost all of my X windows/etc).
> > 
> > Hmm. Did you have panic_timeout set ?
> 
> Hi,
> 
> No, I do not use panic_timeout or any type of watchdog that would reboot
> the system upon a lockup/crash.

Yuck, that means it ran into a triple fault. Nasty. I have no idea how
to debug that at the moment and I was not able to reproduce on one of
my test systems. Maybe I need to try harder.

Thanks,

	tglx






* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 16:22             ` Thomas Gleixner
@ 2011-05-21  0:00               ` Uwaysi Bin Kareem
  2011-06-10  8:17               ` Justin Piszcz
  2012-09-16 20:34               ` Low os-jitter operating system Uwaysi Bin Kareem
  2 siblings, 0 replies; 17+ messages in thread
From: Uwaysi Bin Kareem @ 2011-05-21  0:00 UTC (permalink / raw)
  To: linux-kernel

2.6.39 first observations: The framerate jitter in games that seem to  
depend on kernel for low-jitter, like doom 3, is almost completely gone  
now. (Tested with latest nvidia driver something 75 something)
I decided to try threadirqs from grub, but that did not work. Looks like  
the same error as I get when I tried the rt kernels, which was this one:

http://www.paradoxuncreated.com/tmp/rterror.jpg

Peace Be With You,
Uwaysi.


http://www.paradoxuncreated.com/tmp/.config39

Pentium(R) Dual-Core  CPU      E5200  @ 2.50GHz

uwaysi@Millennium:~$ lspci -v
00:00.0 Host bridge: nVidia Corporation C55 Host Bridge (rev a2)
	Flags: bus master, 66MHz, fast devsel, latency 0
	Capabilities: <access denied>

00:00.1 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:00.2 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:00.3 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: bus master, 66MHz, fast devsel, latency 0

00:00.4 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: bus master, 66MHz, fast devsel, latency 0

00:00.5 RAM memory: nVidia Corporation C55 Memory Controller (rev a2)
	Flags: bus master, 66MHz, fast devsel, latency 0

00:00.6 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:00.7 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.0 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.1 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.2 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.3 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.4 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.5 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:01.6 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:02.0 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:02.1 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: bus master, 66MHz, fast devsel, latency 0

00:02.2 RAM memory: nVidia Corporation C55 Memory Controller (rev a1)
	Flags: 66MHz, fast devsel

00:03.0 PCI bridge: nVidia Corporation C55 PCI Express bridge (rev a1)  
(prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=01, subordinate=05, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff
	Memory behind bridge: fa000000-feafffff
	Prefetchable memory behind bridge: 00000000d0000000-00000000dfffffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

00:06.0 PCI bridge: nVidia Corporation C55 PCI Express bridge (rev a1)  
(prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=06, subordinate=06, sec-latency=0
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

00:07.0 PCI bridge: nVidia Corporation C55 PCI Express bridge (rev a1)  
(prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

00:09.0 RAM memory: nVidia Corporation MCP51 Host Bridge (rev a2)
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0
	Capabilities: <access denied>

00:0a.0 ISA bridge: nVidia Corporation MCP51 LPC Bridge (rev a3)
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0
	I/O ports at 4f00 [size=128]

00:0a.1 SMBus: nVidia Corporation MCP51 SMBus (rev a3)
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: 66MHz, fast devsel, IRQ 11
	I/O ports at 5000 [size=64]
	I/O ports at 6000 [size=64]
	Capabilities: <access denied>
	Kernel driver in use: nForce2_smbus
	Kernel modules: i2c-nforce2

00:0a.2 RAM memory: nVidia Corporation MCP51 Memory Controller 0 (rev a3)
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: 66MHz, fast devsel

00:0b.0 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)  
(prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 20
	Memory at f9fff000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: ohci_hcd

00:0b.1 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)  
(prog-if 20 [EHCI])
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 21
	Memory at f9ffec00 (32-bit, non-prefetchable) [size=256]
	Capabilities: <access denied>
	Kernel driver in use: ehci_hcd

00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1) (prog-if 8a  
[Master SecP PriP])
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0
	[virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
	[virtual] Memory at 000003f0 (type 3, non-prefetchable) [size=1]
	[virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
	[virtual] Memory at 00000370 (type 3, non-prefetchable) [size=1]
	I/O ports at ffa0 [size=16]
	Capabilities: <access denied>
	Kernel driver in use: pata_amd
	Kernel modules: pata_amd

00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev  
a1) (prog-if 85 [Master SecO PriO])
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 23
	I/O ports at c800 [size=8]
	I/O ports at c480 [size=4]
	I/O ports at c400 [size=8]
	I/O ports at c080 [size=4]
	I/O ports at c000 [size=16]
	Memory at f9ffd000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: sata_nv
	Kernel modules: sata_nv

00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev  
a1) (prog-if 85 [Master SecO PriO])
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 22
	I/O ports at bc00 [size=8]
	I/O ports at b880 [size=4]
	I/O ports at b800 [size=8]
	I/O ports at b480 [size=4]
	I/O ports at b400 [size=16]
	Memory at f9ffc000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: sata_nv
	Kernel modules: sata_nv

00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2) (prog-if  
01 [Subtractive decode])
	Flags: bus master, 66MHz, fast devsel, latency 0
	Bus: primary=00, secondary=08, subordinate=08, sec-latency=32
	I/O behind bridge: 0000e000-0000efff
	Memory behind bridge: feb00000-febfffff
	Capabilities: <access denied>

00:10.1 Audio device: nVidia Corporation MCP51 High Definition Audio (rev  
a2)
	Subsystem: Micro-Star International Co., Ltd. Device 7380
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 22
	Memory at f9ff8000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: HDA Intel
	Kernel modules: snd-hda-intel

00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a3)
	Subsystem: Micro-Star International Co., Ltd. Device 380c
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 23
	Memory at f9ff7000 (32-bit, non-prefetchable) [size=4K]
	I/O ports at b080 [size=8]
	Capabilities: <access denied>
	Kernel driver in use: forcedeth
	Kernel modules: forcedeth

01:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for  
mainboards (rev a2) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=01, secondary=02, subordinate=05, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff
	Memory behind bridge: fa000000-feafffff
	Prefetchable memory behind bridge: 00000000d0000000-00000000dfffffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for  
mainboards (rev a2) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: 0000d000-0000dfff
	Memory behind bridge: fa000000-feafffff
	Prefetchable memory behind bridge: 00000000d0000000-00000000dfffffff
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for  
mainboards (rev a2) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=04, subordinate=04, sec-latency=0
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:03.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for  
mainboards (rev a2) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
	Capabilities: <access denied>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

03:00.0 VGA compatible controller: nVidia Corporation GT200 [GeForce GTX  
280] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: eVga.com. Corp. Device 1280
	Flags: bus master, fast devsel, latency 0, IRQ 19
	Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
	I/O ports at dc00 [size=128]
	[virtual] Expansion ROM at fea80000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nvidia, nvidiafb

08:09.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306/7/8 [Fire  
II(M)] IEEE 1394 OHCI Controller (rev c0) (prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Device 380d
	Flags: bus master, medium devsel, latency 32, IRQ 10
	Memory at febff800 (32-bit, non-prefetchable) [size=2K]
	I/O ports at ec00 [size=128]
	Capabilities: <access denied>


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 16:10           ` Justin Piszcz
  2011-05-20 16:22             ` Thomas Gleixner
@ 2011-05-26 16:30             ` Justin Piszcz
  1 sibling, 0 replies; 17+ messages in thread
From: Justin Piszcz @ 2011-05-26 16:30 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, Alan Piszcz



On Fri, 20 May 2011, Justin Piszcz wrote:

>
>
> On Fri, 20 May 2011, Thomas Gleixner wrote:
>
>> On Fri, 20 May 2011, Justin Piszcz wrote:
>>> On Fri, 20 May 2011, Thomas Gleixner wrote:
>>>> Does it crash right away or just when doing something particular?
>>> It crashed at 2100, this is when I run a few I/O intensive processes:
>>> - backup (dump ext4 filesystem -> to a separate raid device)
>>> - backup (dump ext4 on remote host -> to separate raid device)
>>> - backup (dump xfs on remote host -> to separate raid device)
>>> 
>>> This looks like it is what caused it to crash.
>> 
>> That narrows it down somewhat, but does not give us a clue at all :(
>> 
>>>> Is the box fully dead after the crash ?
>>> The host was online and I went away for awhile, when I came back the 
>>> system
>>> had rebooted on its own (as I lost all of my X windows/etc).
>> 
>> Hmm. Did you have panic_timeout set ?
>
> Hi,
>
> No, I do not use panic_timeout or any type of watchdog that would reboot
> the system upon a lockup/crash.
>
> Justin.
>
>

Hi,

I like to be as accurate as possible, since this occurred, I've removed 
threadirqs..

Please disregard this, I also updated the BIOS on the same day (BIOS 
update + kernel update), I'm re-running w/thread irqs enabled again, and 
I'll update you if there are any issues, thanks.

(I've set the bios to factory defaults -> tweaked), we'll see what 
happens.

Justin.


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-05-20 16:22             ` Thomas Gleixner
  2011-05-21  0:00               ` Uwaysi Bin Kareem
@ 2011-06-10  8:17               ` Justin Piszcz
  2011-06-10  8:24                 ` Justin Piszcz
  2011-06-10 12:52                 ` Thomas Gleixner
  2012-09-16 20:34               ` Low os-jitter operating system Uwaysi Bin Kareem
  2 siblings, 2 replies; 17+ messages in thread
From: Justin Piszcz @ 2011-06-10  8:17 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra



On Fri, 20 May 2011, Thomas Gleixner wrote:

> On Fri, 20 May 2011, Justin Piszcz wrote:
>> On Fri, 20 May 2011, Thomas Gleixner wrote:
>>
>>> On Fri, 20 May 2011, Justin Piszcz wrote:
>>>> On Fri, 20 May 2011, Thomas Gleixner wrote:
>>>>> Does it crash right away or just when doing something particular?
>>>> It crashed at 2100, this is when I run a few I/O intensive processes:
>>>> - backup (dump ext4 filesystem -> to a separate raid device)
>>>> - backup (dump ext4 on remote host -> to separate raid device)
>>>> - backup (dump xfs on remote host -> to separate raid device)
>>>>
>>>> This looks like it is what caused it to crash.
>>>
>>> That narrows it down somewhat, but does not give us a clue at all :(
>>>
>>>>> Is the box fully dead after the crash ?
>>>> The host was online and I went away for awhile, when I came back the
>>>> system
>>>> had rebooted on its own (as I lost all of my X windows/etc).
>>>
>>> Hmm. Did you have panic_timeout set ?
>>
>> Hi,
>>
>> No, I do not use panic_timeout or any type of watchdog that would reboot
>> the system upon a lockup/crash.
>
> Yuck, that means it ran into a triple fault. Nasty. I have no idea how
> to debug that at the moment and I was not able to reproduce on one of
> my test systems. Maybe I need to try harder.
>
> Thanks,
>
> 	tglx
>
>
>
>

Hi,

Crashed again and it rebooted too:

reboot   system boot  2.6.39             Thu Jun  9 23:58 - 04:05  (04:06) 
user1      pts/0        X                Thu Jun  9 19:25 - 19:30  (00:04)
user1      pts/10       X                Thu Jun  9 18:23 - crash  (05:35)

Any thoughts on what could be causing this?
Should I go back to 2.6.38?

Justin.



* Re: 2.6.39: crash w/threadirqs option enabled
  2011-06-10  8:17               ` Justin Piszcz
@ 2011-06-10  8:24                 ` Justin Piszcz
  2011-06-10 12:52                 ` Thomas Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: Justin Piszcz @ 2011-06-10  8:24 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra



On Fri, 10 Jun 2011, Justin Piszcz wrote:
>
> Hi,
>
> Crashed again and it rebooted too:
>
> reboot   system boot  2.6.39             Thu Jun  9 23:58 - 04:05  (04:06) 
> user1      pts/0        X                Thu Jun  9 19:25 - 19:30  (00:04)
> user1      pts/10       X                Thu Jun  9 18:23 - crash  (05:35)
>
> Any thoughts on what could be causing this?
> Should I go back to 2.6.38?
>

I have a lot of USB devices attached, perhaps that is what's causing the
crashes, I will disconnect some of them and see if there is another crash.

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 003: ID 2001:f103 D-Link Corp. DUB-H7 7-port USB 2.0 hub
Bus 001 Device 004: ID 2001:f103 D-Link Corp. DUB-H7 7-port USB 2.0 hub
Bus 001 Device 005: ID 0764:0501 Cyber Power System, Inc. CP1500 AVR UPS
Bus 001 Device 006: ID 413c:1002 Dell Computer Corp. Keyboard Hub
Bus 002 Device 003: ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)
Bus 001 Device 007: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port
Bus 001 Device 008: ID 0424:2502 Standard Microsystems Corp. 
Bus 001 Device 009: ID 054c:002c Sony Corp. USB Floppy Disk Drive
Bus 001 Device 010: ID 413c:2002 Dell Computer Corp. SK-8125 Keyboard
Bus 001 Device 011: ID 0461:4d15 Primax Electronics, Ltd Dell Optical Mouse
Bus 001 Device 012: ID 0424:2602 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 013: ID 093b:0027 Plextor Corp. 
Bus 001 Device 014: ID 0424:2228 Standard Microsystems Corp. 9-in-2 Card Reader

Justin.



* Re: 2.6.39: crash w/threadirqs option enabled
  2011-06-10  8:17               ` Justin Piszcz
  2011-06-10  8:24                 ` Justin Piszcz
@ 2011-06-10 12:52                 ` Thomas Gleixner
  2011-06-10 13:07                   ` Justin Piszcz
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2011-06-10 12:52 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra

On Fri, 10 Jun 2011, Justin Piszcz wrote:
> On Fri, 20 May 2011, Thomas Gleixner wrote:
> Crashed again and it rebooted too:
> 
> reboot   system boot  2.6.39             Thu Jun  9 23:58 - 04:05  (04:06)
> user1      pts/0        X                Thu Jun  9 19:25 - 19:30  (00:04)
> user1      pts/10       X                Thu Jun  9 18:23 - crash  (05:35)
> 
> Any thoughts on what could be causing this?
> Should I go back to 2.6.38?

If you remove the threadirqs option from the commandline it does not
happen, right?

Can you try the following patch ?

Thanks,

	tglx
---
commit fd8a7de177b6f56a0fc59ad211c197a7df06b1ad
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Jul 20 14:34:50 2010 +0200

    x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU
    
    After a newly plugged CPU sets the cpu_online bit it enables
    interrupts and goes idle. The cpu which brought up the new cpu waits
    for the cpu_online bit and when it observes it, it sets the cpu_active
    bit for this cpu. The cpu_active bit is the relevant one for the
    scheduler to consider the cpu as a viable target.
    
    With forced threaded interrupt handlers which imply forced threaded
    softirqs we observed the following race:
    
    cpu 0                         cpu 1
    
    bringup(cpu1);
                                  set_cpu_online(smp_processor_id(), true);
                                  local_irq_enable();
    while (!cpu_online(cpu1));
                                  timer_interrupt()
                                    -> wake_up(softirq_thread_cpu1);
                                         -> enqueue_on(softirq_thread_cpu1, cpu0);
    
                                                                            ^^^^
    
    cpu_notify(CPU_ONLINE, cpu1);
      -> sched_cpu_active(cpu1)
         -> set_cpu_active(cpu1, true);
    
    When an interrupt happens before the cpu_active bit is set by the cpu
    which brought up the newly onlined cpu, then the scheduler refuses to
    enqueue the woken thread which is bound to that newly onlined cpu on
    that newly onlined cpu due to the not yet set cpu_active bit and
    selects a fallback runqueue. Not really an expected and desirable
    behaviour.
    
    So far this has only been observed with forced hard/softirq threading,
    but in theory this could happen without forced threaded hard/softirqs
    as well. It's probably unobservable as it would take a massive
    interrupt storm on the newly onlined cpu which causes the softirq loop
    to wake up the softirq thread and an even longer delay of the cpu
    which waits for the cpu_online bit.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Peter Zijlstra <peterz@infradead.org>
    Cc: stable@kernel.org # 2.6.39

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 33a0c11..9fd3137 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -285,6 +285,19 @@ notrace static void __cpuinit start_secondary(void *unused)
 	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
 	x86_platform.nmi_init();
 
+	/*
+	 * Wait until the cpu which brought this one up marked it
+	 * online before enabling interrupts. If we don't do that then
+	 * we can end up waking up the softirq thread before this cpu
+	 * reached the active state, which makes the scheduler unhappy
+	 * and schedule the softirq thread on the wrong cpu. This is
+	 * only observable with forced threaded interrupts, but in
+	 * theory it could also happen w/o them. It's just way harder
+	 * to achieve.
+	 */
+	while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask))
+		cpu_relax();
+
 	/* enable local interrupts */
 	local_irq_enable();
 


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-06-10 12:52                 ` Thomas Gleixner
@ 2011-06-10 13:07                   ` Justin Piszcz
  2011-06-10 13:10                     ` Thomas Gleixner
  2011-06-12 12:04                     ` Justin Piszcz
  0 siblings, 2 replies; 17+ messages in thread
From: Justin Piszcz @ 2011-06-10 13:07 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra



On Fri, 10 Jun 2011, Thomas Gleixner wrote:

> On Fri, 10 Jun 2011, Justin Piszcz wrote:
>> On Fri, 20 May 2011, Thomas Gleixner wrote:
>> Crashed again and it rebooted too:
>>
>> reboot   system boot  2.6.39             Thu Jun  9 23:58 - 04:05  (04:06)
>> user1      pts/0        X                Thu Jun  9 19:25 - 19:30  (00:04)
>> user1      pts/10       X                Thu Jun  9 18:23 - crash  (05:35)
>>
>> Any thoughts on what could be causing this?
>> Should I go back to 2.6.38?
>
> If you remove the threadirqs option from the commandline it does not
> happen, right?
Yes, most often anyhow (still using that option)

>
> Can you try the following patch ?
Yup, patched:

# patch -p1 < ../patch-for-cpu 
patching file arch/x86/kernel/smpboot.c
#

New kernel running now. I also plugged my USB devices back in, so we'll see what happens.

Justin.



* Re: 2.6.39: crash w/threadirqs option enabled
  2011-06-10 13:07                   ` Justin Piszcz
@ 2011-06-10 13:10                     ` Thomas Gleixner
  2011-06-12 12:04                     ` Justin Piszcz
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2011-06-10 13:10 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra

On Fri, 10 Jun 2011, Justin Piszcz wrote:
> On Fri, 10 Jun 2011, Thomas Gleixner wrote:
> > If you remove the threadirqs option from the commandline it does not
> > happen, right?
> Yes, most often anyhow (still using that option)

-ENOPARSE


* Re: 2.6.39: crash w/threadirqs option enabled
  2011-06-10 13:07                   ` Justin Piszcz
  2011-06-10 13:10                     ` Thomas Gleixner
@ 2011-06-12 12:04                     ` Justin Piszcz
  1 sibling, 0 replies; 17+ messages in thread
From: Justin Piszcz @ 2011-06-12 12:04 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Alan Piszcz, Ingo Molnar, Peter Zijlstra



On Fri, 10 Jun 2011, Justin Piszcz wrote:

>
>

Hi,

It crashed again with the patch, so it must be something else. When it
happened, I was not able to get any output since I was in X.

Will leave X off, console/monitor on and disconnect those USB devices
I mentioned earlier and see if the problem persists.

The last time I was able to get a screenshot, the call trace in the
picture was at blk_peek_request and scsi_request_fn.
http://home.comcast.net/~jpiszcz/20110528/2639-ss1.jpg

I will disconnect the USB devices I mentioned earlier and see if the issues
persist.

Justin.



* Low os-jitter operating system.
  2011-05-20 16:22             ` Thomas Gleixner
  2011-05-21  0:00               ` Uwaysi Bin Kareem
  2011-06-10  8:17               ` Justin Piszcz
@ 2012-09-16 20:34               ` Uwaysi Bin Kareem
  2 siblings, 0 replies; 17+ messages in thread
From: Uwaysi Bin Kareem @ 2012-09-16 20:34 UTC (permalink / raw)
  To: linux-kernel, vlc-devel

Hiya, I sent this to the Ubuntu suggestions box. I am emailing a copy
here as well.
---


Mark Shuttleworth has talked about "ultrasmooth" gaming, desktop etc. You
can have that now already, or at least quite close. Games will be perfect.
Web animations and videos will depend on syncing code or target frame rate.

First of all, drop unnecessary layers from the kernel. Configure it for
minimal latency (max preemption; more preemption = less os-jitter = less
latency). The standard config is usually not set up this way, which leads
to unnecessary choppiness and dropped frames. That is not the way you
want it.

Use low-jitter applications. For instance, WebKit-based browsers have
lower jitter, resulting in smoother YouTube playback, among other things,
and smoother web animation in general. So you'd want to use Chromium as
the default browser.

Use intelligent priorities for tasks. If you have background tasks that
are insensitive to jitter and latency (non audio/video/net), give them
lower priority and only one CPU, so they are transparent to the user.
Automatically lowering the priority of a word processor, for instance,
would also fit the profile.

Consider system applications with lowest jitter.

This will also prepare the OS well for Wayland, which is OpenGL based.
Low jitter = smoothest OpenGL experience.

You should also consider promoting a refresh rate of 72 Hz as optimal.
Research shows that many prefer a 72 Hz refresh rate. I can personally
also confirm that this is a nice rate, quiet and peaceful; if one were to
describe it, a "minimal psychovisual noise" setting. More seems to only
add noise to the screen. Less is flickery. I further tuned this to
72.734 Hz, which you can try and confirm as a "minimal psychovisual noise"
profile.
One should also inform users that videos at 30 fps should be played at a
60 or 90 Hz refresh rate to avoid video jitter, or 25 fps at 75 Hz, etc.
Similarly for other formats. A media player that automatically switches
the screen mode to sync with the video would be optimal, with 72.734 Hz as
the standard mode for games that adapt to the refresh rate, and other
uses.
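
The frame-rate matching rule above amounts to picking a refresh rate that is a whole multiple of the video frame rate. A minimal sketch of that arithmetic follows; the helper name `matching_refresh_hz()` and its `min_hz` parameter are assumptions made for the illustration and do not come from any existing player or driver.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the smallest whole multiple of the video
 * frame rate (fps) that is at least min_hz, so that every video frame is
 * shown for the same number of screen refreshes.
 */
static int matching_refresh_hz(int fps, int min_hz)
{
	int multiple;

	assert(fps > 0 && min_hz > 0);

	/* Integer ceiling of min_hz / fps. */
	multiple = (min_hz + fps - 1) / fps;
	return fps * multiple;
}
```

With a floor of 60 Hz, 30 fps video maps to 60 Hz; with a floor of 72 Hz, 25 fps maps to 75 Hz and 30 fps maps to 90 Hz, which matches the pairs mentioned above. Whether a given display actually offers such a mode is a separate question that this sketch does not address.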

Peace Be With You.

PS: If you need a video to test for low jitter, I have one here. It
should play smoothly on a well-configured system at a 60 Hz refresh rate.
Unfortunately I doubt the clocks are synced, so there will still be a
small amount of jitter. It should not be much, though.

http://paradoxuncreated.com/Blog/wordpress/?page_id=70



Thread overview: 17+ messages
2011-05-20  8:39 2.6.39: crash w/threadirqs option enabled Justin Piszcz
2011-05-20 12:48 ` Thomas Gleixner
2011-05-20 12:51   ` Justin Piszcz
2011-05-20 13:30     ` Thomas Gleixner
2011-05-20 13:49       ` Justin Piszcz
2011-05-20 15:17         ` Thomas Gleixner
2011-05-20 16:10           ` Justin Piszcz
2011-05-20 16:22             ` Thomas Gleixner
2011-05-21  0:00               ` Uwaysi Bin Kareem
2011-06-10  8:17               ` Justin Piszcz
2011-06-10  8:24                 ` Justin Piszcz
2011-06-10 12:52                 ` Thomas Gleixner
2011-06-10 13:07                   ` Justin Piszcz
2011-06-10 13:10                     ` Thomas Gleixner
2011-06-12 12:04                     ` Justin Piszcz
2012-09-16 20:34               ` Low os-jitter operating system Uwaysi Bin Kareem
2011-05-26 16:30             ` 2.6.39: crash w/threadirqs option enabled Justin Piszcz
