linux-kernel.vger.kernel.org archive mirror
* 2.6.9 rc2 freezing
@ 2004-09-13 16:55 Zilvinas Valinskas
  2004-09-13 17:12 ` Jeff Garzik
  0 siblings, 1 reply; 19+ messages in thread
From: Zilvinas Valinskas @ 2004-09-13 16:55 UTC (permalink / raw)
  To: linux-kernel mailing list

[1.] One line summary of the problem: 
 Applied patch-2.6.9.bz2 on top of linux 2.6.8 tree, rebooted - 
 then suddenly the laptop froze.
 
[2.] Full description of the problem/report:
 It is the same .config used to compile 2.6.9 rc1 and 2.6.9 rc2 
 (http://www.gemtek.lt/~zilvinas/oops/ for kern.log and .config).
 The laptop booted as usual; I logged in to KDE and started up evolution -
 the mouse froze and the keyboard seemed dead, although sysrq-s, sysrq-u & 
 sysrq-b worked just fine. After reboot I found a lot of repeated
 messages like:

Sep 13 18:51:24 evo800 kernel: Warning: kfree_skb on hard IRQ 000000d8
Sep 13 18:51:24 evo800 kernel: bad: scheduling while atomic!
Sep 13 18:51:24 evo800 kernel:  [schedule+1208/1213] schedule+0x4b8/0x4bd
Sep 13 18:51:24 evo800 kernel:  [sys_time+22/80] sys_time+0x16/0x50
Sep 13 18:51:24 evo800 kernel:  [work_resched+5/22] work_resched+0x5/0x16

or 

Sep 13 18:51:24 evo800 kernel: Warning: kfree_skb on hard IRQ 000000d8
Sep 13 18:51:24 evo800 kernel: bad: scheduling while atomic!
Sep 13 18:51:24 evo800 kernel:  [schedule+1208/1213] schedule+0x4b8/0x4bd
Sep 13 18:51:24 evo800 kernel:  [schedule_timeout+96/179] schedule_timeout+0x60/0xb3
Sep 13 18:51:24 evo800 kernel:  [__get_free_pages+31/59] __get_free_pages+0x1f/0x3b
Sep 13 18:51:24 evo800 kernel:  [process_timeout+0/5] process_timeout+0x0/0x5
Sep 13 18:51:24 evo800 kernel:  [do_select+394/698] do_select+0x18a/0x2ba
Sep 13 18:51:24 evo800 kernel:  [__pollwait+0/192] __pollwait+0x0/0xc0
Sep 13 18:51:24 evo800 kernel:  [print_context_stack+35/93] print_context_stack+0x23/0x5d
Sep 13 18:51:24 evo800 kernel:  [sys_select+670/1176] sys_select+0x29e/0x498
Sep 13 18:51:24 evo800 kernel:  [sys_time+22/80] sys_time+0x16/0x50
Sep 13 18:51:24 evo800 kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
 

[3.] Keywords (i.e., modules, networking, kernel):
 Modules Loaded         nfs esp4 nfsd exportfs lockd sunrpc nsc_ircc
 ipt_state iptable_filter iptable_nat crypto_null microcode ehci_hcd
 ohci_hcd floppy irtty_sir sir_dev irda crc_ccitt 8250_pnp khazad
 twofish sha512 sha256 sha1 serpent md5 md4 des deflate zlib_deflate
 zlib_inflate cast6 cast5 blowfish arc4 aes_i586 xfrm_user
 ip_conntrack_irc ip_conntrack_ftp ip_conntrack ip_tables ide_cd cdrom
 8250 serial_core snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss
 snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi
 snd_seq_device snd soundcore yenta_socket radeon intel_agp agpgart

[4.] Kernel version (from /proc/version):
 see http://www.gemtek.lt/~zilvinas/oops/

[7.1.] Software (add the output of the ver_linux script here)
sh scripts/ver_linux
Linux swoop 2.6.9-rc1 #1 Wed Aug 25 10:52:32 EEST 2004 i686 GNU/Linux
 
 Gnu C                  3.3.4
 Gnu make               3.80
 binutils               2.15
 util-linux             2.12
 mount                  2.12
 module-init-tools      3.1-pre5
 e2fsprogs              1.35
 reiserfsprogs          3.6.18
 reiser4progs           line
 xfsprogs               2.6.20
 pcmcia-cs              3.2.5
 nfs-utils              1.0.6
 Linux C Library        2.3.2
 Dynamic linker (ldd)   2.3.2
 Procps                 3.2.3
 Net-tools              1.60
 Console-tools          0.2.3
 Sh-utils               5.2.1


[7.2.] Processor information (from /proc/cpuinfo):
   see http://www.gemtek.lt/~zilvinas/oops/
Thank you


* Re: 2.6.9 rc2 freezing
  2004-09-13 16:55 2.6.9 rc2 freezing Zilvinas Valinskas
@ 2004-09-13 17:12 ` Jeff Garzik
  2004-09-13 17:16   ` Zilvinas Valinskas
  2004-09-13 17:19   ` Zilvinas Valinskas
  0 siblings, 2 replies; 19+ messages in thread
From: Jeff Garzik @ 2004-09-13 17:12 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: linux-kernel mailing list

Zilvinas Valinskas wrote:
> [1.] One line summary of the problem: 
>  Applied patch-2.6.9.bz2 on top of linux 2.6.8 tree, rebooted - 
>  then suddenly the laptop froze.
>  
> [2.] Full description of the problem/report:
>  It is the same .config used to compile 2.6.9 rc1 and 2.6.9 rc2 
>  (http://www.gemtek.lt/~zilvinas/oops/ for kern.log and .config).
>  Laptop booted as usual, logged in KDE and started up evolution -
>  mouse froze, keyboard seemed dead - although sysrq-s, sysrq-u & 
>  sysrq-b worked just fine. After reboot I found a lot of messages
>  repeated like :
> 
> Sep 13 18:51:24 evo800 kernel: Warning: kfree_skb on hard IRQ 000000d8
> Sep 13 18:51:24 evo800 kernel: bad: scheduling while atomic!
> Sep 13 18:51:24 evo800 kernel:  [schedule+1208/1213] schedule+0x4b8/0x4bd
> Sep 13 18:51:24 evo800 kernel:  [sys_time+22/80] sys_time+0x16/0x50
> Sep 13 18:51:24 evo800 kernel:  [work_resched+5/22] work_resched+0x5/0x16
> 
> or 
> 
> Sep 13 18:51:24 evo800 kernel: Warning: kfree_skb on hard IRQ 000000d8
> Sep 13 18:51:24 evo800 kernel: bad: scheduling while atomic!
> Sep 13 18:51:24 evo800 kernel:  [schedule+1208/1213] schedule+0x4b8/0x4bd
> Sep 13 18:51:24 evo800 kernel:  [schedule_timeout+96/179] schedule_timeout+0x60/0xb3
> Sep 13 18:51:24 evo800 kernel:  [__get_free_pages+31/59] __get_free_pages+0x1f/0x3b
> Sep 13 18:51:24 evo800 kernel:  [process_timeout+0/5] process_timeout+0x0/0x5
> Sep 13 18:51:24 evo800 kernel:  [do_select+394/698] do_select+0x18a/0x2ba
> Sep 13 18:51:24 evo800 kernel:  [__pollwait+0/192] __pollwait+0x0/0xc0
> Sep 13 18:51:24 evo800 kernel:  [print_context_stack+35/93] print_context_stack+0x23/0x5d
> Sep 13 18:51:24 evo800 kernel:  [sys_select+670/1176] sys_select+0x29e/0x498
> Sep 13 18:51:24 evo800 kernel:  [sys_time+22/80] sys_time+0x16/0x50
> Sep 13 18:51:24 evo800 kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
>  
> 
> [3.] Keywords (i.e., modules, networking, kernel):
>  Modules Loaded         nfs esp4 nfsd exportfs lockd sunrpc nsc_ircc
>  ipt_state iptable_filter iptable_nat crypto_null microcode ehci_hcd
>  ohci_hcd floppy irtty_sir sir_dev irda crc_ccitt 8250_pnp khazad
>  twofish sha512 sha256 sha1 serpent md5 md4 des deflate zlib_deflate
>  zlib_inflate cast6 cast5 blowfish arc4 aes_i586 xfrm_user
>  ip_conntrack_irc ip_conntrack_ftp ip_conntrack ip_tables ide_cd cdrom
>  8250 serial_core snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss
>  snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi
>  snd_seq_device snd soundcore yenta_socket radeon intel_agp agpgart

I'm totally blind, because I don't see your network driver in that big 
list of modules.

Your network driver should probably be doing dev_kfree_skb_any() 
somewhere, but isn't.

	Jeff
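
[Editor's sketch] The suggestion above can be pictured with a small userspace model. It is only an illustration under stated assumptions: every `model_*` name is hypothetical, and the real kernel helpers (`kfree_skb()`, `dev_kfree_skb_irq()`, `dev_kfree_skb_any()`) are not reproduced here. The idea modelled is that a buffer must not be freed directly in hard-IRQ context; the "any" variant checks the context and defers the free when needed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network buffer; not the kernel's sk_buff. */
struct model_skb {
	struct model_skb *next;
	bool freed;
};

static bool model_in_hardirq;                    /* models "we are in a hard IRQ" */
static struct model_skb *model_completion_queue; /* models a deferred-free list */
static int model_warnings;                       /* counts "kfree_skb on hard IRQ" style warnings */

/* Models a direct free: it warns if called in hard-IRQ context. */
static void model_kfree_skb(struct model_skb *skb)
{
	if (model_in_hardirq)
		model_warnings++;
	skb->freed = true;
}

/* Models an IRQ-safe free: only queue the buffer, free it later. */
static void model_kfree_skb_irq(struct model_skb *skb)
{
	skb->next = model_completion_queue;
	model_completion_queue = skb;
}

/* Models the "any context" helper: pick the safe variant for the context. */
static void model_kfree_skb_any(struct model_skb *skb)
{
	if (model_in_hardirq)
		model_kfree_skb_irq(skb);
	else
		model_kfree_skb(skb);
}

/* Models later softirq processing: drain the queue outside hard-IRQ context.
 * Returns the number of buffers freed. */
static int model_run_softirq(void)
{
	int freed = 0;
	bool saved = model_in_hardirq;

	model_in_hardirq = false;
	while (model_completion_queue != NULL) {
		struct model_skb *skb = model_completion_queue;

		model_completion_queue = skb->next;
		skb->next = NULL;
		model_kfree_skb(skb);
		freed++;
	}
	model_in_hardirq = saved;
	return freed;
}
```

In this model, a driver path that calls `model_kfree_skb()` from its interrupt handler bumps `model_warnings`, while one that calls `model_kfree_skb_any()` does not.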





* Re: 2.6.9 rc2 freezing
  2004-09-13 17:12 ` Jeff Garzik
@ 2004-09-13 17:16   ` Zilvinas Valinskas
  2004-09-15  8:25     ` Erik Tews
  2004-09-13 17:19   ` Zilvinas Valinskas
  1 sibling, 1 reply; 19+ messages in thread
From: Zilvinas Valinskas @ 2004-09-13 17:16 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel mailing list

On Mon, Sep 13, 2004 at 01:12:13PM -0400, Jeff Garzik wrote:
> I'm totally blind, because I don't see your network driver in that big 
> list of modules.
> 
> Your network driver should probably be doing dev_kfree_skb_any() 
> somewhere, but isn't.
> 
> 	Jeff
> 
It is compiled in, see :

CONFIG_E100=y
CONFIG_E100_NAPI=y

Could it be IPsec related?


* Re: 2.6.9 rc2 freezing
  2004-09-13 17:12 ` Jeff Garzik
  2004-09-13 17:16   ` Zilvinas Valinskas
@ 2004-09-13 17:19   ` Zilvinas Valinskas
  1 sibling, 0 replies; 19+ messages in thread
From: Zilvinas Valinskas @ 2004-09-13 17:19 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel mailing list

On Mon, Sep 13, 2004 at 01:12:13PM -0400, Jeff Garzik wrote:
> I'm totally blind, because I don't see your network driver in that big 
> list of modules.
> 
> Your network driver should probably be doing dev_kfree_skb_any() 
> somewhere, but isn't.
> 
> 	Jeff

It is the e100 network driver, compiled in:
CONFIG_E100=y
CONFIG_E100_NAPI=y

See http://www.gemtek.lt/~zilvinas/oops/

There are full boot logs for 2.6.9 rc1 and rc2 there.

> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


* Re: 2.6.9 rc2 freezing
  2004-09-13 17:16   ` Zilvinas Valinskas
@ 2004-09-15  8:25     ` Erik Tews
  2004-09-15  9:58       ` Zilvinas Valinskas
  0 siblings, 1 reply; 19+ messages in thread
From: Erik Tews @ 2004-09-15  8:25 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: Jeff Garzik, linux-kernel mailing list

On Mon, 13.09.2004 at 19:16, Zilvinas Valinskas wrote:
> On Mon, Sep 13, 2004 at 01:12:13PM -0400, Jeff Garzik wrote:
> > I'm totally blind, because I don't see your network driver in that big 
> > list of modules.
> > 
> > Your network driver should probably be doing dev_kfree_skb_any() 
> > somewhere, but isn't.
> > 
> > 	Jeff
> > 
> It is compiled in, see :
> 
> CONFIG_E100=y
> CONFIG_E100_NAPI=y
> 
> Can it be IPsec related ?

I have a similar problem here. I am running 2.6.9-rc2 with the ACPI patch.
I have an e1000; IPsec is compiled in, the modules are loaded, and racoon
is started, but no tunnels are configured.

The system freezes when I type apt-get update, at the moment apt-get
tries to resolve or connect to the mirrors.

I did not see any messages, and sysrq was not compiled in, so I cannot
check whether it still works.



* Re: 2.6.9 rc2 freezing
  2004-09-15  8:25     ` Erik Tews
@ 2004-09-15  9:58       ` Zilvinas Valinskas
  2004-09-15 14:55         ` Ricky Beam
  0 siblings, 1 reply; 19+ messages in thread
From: Zilvinas Valinskas @ 2004-09-15  9:58 UTC (permalink / raw)
  To: Erik Tews; +Cc: Jeff Garzik, linux-kernel mailing list

On Wed, Sep 15, 2004 at 10:25:50AM +0200, Erik Tews wrote:
> On Mon, 13.09.2004 at 19:16, Zilvinas Valinskas wrote:
> > On Mon, Sep 13, 2004 at 01:12:13PM -0400, Jeff Garzik wrote:
> > > I'm totally blind, because I don't see your network driver in that big 
> > > list of modules.
> > > 
> > > Your network driver should probably be doing dev_kfree_skb_any() 
> > > somewhere, but isn't.
> > > 
> > > 	Jeff
> > > 
> > It is compiled in, see :
> > 
> > CONFIG_E100=y
> > CONFIG_E100_NAPI=y
> > 
> > Can it be IPsec related ?
> 
> I got a similar problem here, I am running 2.6.9-rc2 with acpi patch. I
> got an e1000, ipsec is compiled in, modules loaded, racoon started but
> no tunnels configured.
> 
> The system freezes when I type apt-get update, in the moment apt-get
> tries to connect all the mirrors or resolves them.
That was my first impression too. When I rebooted back into 2.6.9-rc1,
I went through /var/log/kern.log and found the messages I sent earlier. 

> 
> I did not see any messages, sysrq was not compiled in, so I cannot check
> if it still works.

In my case, I have DHCP enabled and racoon running. If I set up policies 
via this script:

#!/usr/sbin/setkey -f
flush;
spdflush;

spdadd 0.0.0.0 0.0.0.0[500] udp -P out none;
spdadd 0.0.0.0[500] 0.0.0.0 udp -P in none;

spdadd 192.168.3.3 192.168.3.2 any -P out ipsec
        esp/transport//require;

spdadd 192.168.3.2 192.168.3.3 any -P in ipsec
        esp/transport//require;

My laptop's IP address is 192.168.3.3, and when 192.168.3.2 connects,
my laptop freezes ... The last Linux kernel I used was 2.6.9-rc1-bk16, 
and it was OK. 2.6.9-rc2 freezes the laptop ...

Perhaps it is a mixture of PREEMPT=y and IPsec? Dunno ...


* Re: 2.6.9 rc2 freezing
  2004-09-15  9:58       ` Zilvinas Valinskas
@ 2004-09-15 14:55         ` Ricky Beam
  2004-09-15 15:48           ` Lee Revell
  0 siblings, 1 reply; 19+ messages in thread
From: Ricky Beam @ 2004-09-15 14:55 UTC (permalink / raw)
  To: Zilvinas Valinskas; +Cc: Erik Tews, Jeff Garzik, linux-kernel mailing list

On Wed, 15 Sep 2004, Zilvinas Valinskas wrote:
>Perhaps that is mixture of PREEMPT=y and ipsec ? dunno ...

No mixture necessary.  PREEMPT is uber-screwed up.  Try rebuilding your
kernel/modules with it disabled. (make clean first; the kernel deps don't
track CONFIG_PREEMPT correctly.)

--Ricky




* Re: 2.6.9 rc2 freezing
  2004-09-15 14:55         ` Ricky Beam
@ 2004-09-15 15:48           ` Lee Revell
  2004-09-15 15:58             ` Jeff Garzik
  0 siblings, 1 reply; 19+ messages in thread
From: Lee Revell @ 2004-09-15 15:48 UTC (permalink / raw)
  To: Ricky Beam
  Cc: Zilvinas Valinskas, Erik Tews, Jeff Garzik, linux-kernel mailing list

On Wed, 2004-09-15 at 10:55, Ricky Beam wrote:
> On Wed, 15 Sep 2004, Zilvinas Valinskas wrote:
> >Perhaps that is mixture of PREEMPT=y and ipsec ? dunno ...
> 
> No mixture necessary.  PREEMPT is uber-screwed up.  Try rebuilding your
> kernel/modules with it disabled. (make clean first; the kernel deps don't
> track CONFIG_PREEMPT correctly.)

Um, PREEMPT works just fine.  Anything that breaks on PREEMPT will also
break on SMP.  And the kernel deps do track CONFIG_PREEMPT correctly.

Maybe you are doing it wrong.

Lee



* Re: 2.6.9 rc2 freezing
  2004-09-15 15:48           ` Lee Revell
@ 2004-09-15 15:58             ` Jeff Garzik
  2004-09-15 16:06               ` Lee Revell
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Garzik @ 2004-09-15 15:58 UTC (permalink / raw)
  To: Lee Revell
  Cc: Ricky Beam, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

Lee Revell wrote:
> On Wed, 2004-09-15 at 10:55, Ricky Beam wrote:
> 
>>On Wed, 15 Sep 2004, Zilvinas Valinskas wrote:
>>
>>>Perhaps that is mixture of PREEMPT=y and ipsec ? dunno ...
>>
>>No mixture necessary.  PREEMPT is uber-screwed up.  Try rebuilding your
>>kernel/modules with it disabled. (make clean first; the kernel deps don't
>>track CONFIG_PREEMPT correctly.)
> 
> 
> Um, PREEMPT works just fine.  Anything that breaks on PREEMPT will also
> break on SMP.  And the kernel deps do track CONFIG_PREEMPT correctly.


PREEMPT is a hack.  I do not recommend using it on production servers.

	Jeff




* Re: 2.6.9 rc2 freezing
  2004-09-15 15:58             ` Jeff Garzik
@ 2004-09-15 16:06               ` Lee Revell
  2004-09-15 16:11                 ` Jeff Garzik
  2004-09-15 16:59                 ` Dave Jones
  0 siblings, 2 replies; 19+ messages in thread
From: Lee Revell @ 2004-09-15 16:06 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Ricky Beam, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

On Wed, 2004-09-15 at 11:58, Jeff Garzik wrote:
> Lee Revell wrote:
> > On Wed, 2004-09-15 at 10:55, Ricky Beam wrote:
> > 
> >>On Wed, 15 Sep 2004, Zilvinas Valinskas wrote:
> >>
> >>>Perhaps that is mixture of PREEMPT=y and ipsec ? dunno ...
> >>
> >>No mixture necessary.  PREEMPT is uber-screwed up.  Try rebuilding your
> >>kernel/modules with it disabled. (make clean first; the kernel deps don't
> >>track CONFIG_PREEMPT correctly.)
> > 
> > 
> > Um, PREEMPT works just fine.  Anything that breaks on PREEMPT will also
> > break on SMP.  And the kernel deps do track CONFIG_PREEMPT correctly.
> 
> 
> PREEMPT is a hack.  I do not recommend using it on production servers.
> 

Not every Linux machine is a server.  Just because you can't bang a
square peg through a round hole does not mean the peg is defective.

Anyway, if you are running anything on your server that breaks under
PREEMPT, it will break anyway as soon as you add another processor.

Lee



* Re: 2.6.9 rc2 freezing
  2004-09-15 16:06               ` Lee Revell
@ 2004-09-15 16:11                 ` Jeff Garzik
  2004-09-15 16:58                   ` Ricky Beam
  2004-09-15 16:59                 ` Dave Jones
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff Garzik @ 2004-09-15 16:11 UTC (permalink / raw)
  To: Lee Revell
  Cc: Ricky Beam, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

Lee Revell wrote:
> Anyway, if you are running anything on your server that breaks under
> PREEMPT, it will break anyway as soon as you add another processor.

Incorrect.  The spinlock behavior is very different.

That's why we had net stack problems in the past under preempt but not 
under SMP.

	Jeff
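
[Editor's sketch] One way to picture a difference in spinlock behaviour between build configurations is the userspace model below. It is an illustration only, assuming that a preemptible build counts lock sections in a per-CPU preemption counter that the scheduler checks, while a non-preemptible SMP build only takes the lock word. All `model_*` names are hypothetical and none of this is the kernel's actual code.

```c
#include <stdbool.h>

/* Hypothetical build options, standing in for SMP and PREEMPT settings. */
struct model_config {
	bool smp;
	bool preempt;
};

/* Hypothetical per-CPU state for this sketch. */
struct model_cpu {
	int preempt_count;   /* nonzero means "atomic" to the scheduler check */
	bool lock_held;      /* only touched when cfg->smp is set */
	int atomic_warnings; /* times the model's "scheduling while atomic" check fired */
};

static void model_spin_lock(const struct model_config *cfg, struct model_cpu *cpu)
{
	if (cfg->preempt)
		cpu->preempt_count++;  /* lock section disables preemption */
	if (cfg->smp)
		cpu->lock_held = true; /* a real SMP lock would spin here until free */
	/* With neither option set, the lock does nothing in this model. */
}

static void model_spin_unlock(const struct model_config *cfg, struct model_cpu *cpu)
{
	if (cfg->smp)
		cpu->lock_held = false;
	if (cfg->preempt)
		cpu->preempt_count--;
}

/* Models the sanity check on entry to the scheduler.
 * Returns true if it complained about scheduling while atomic. */
static bool model_schedule(struct model_cpu *cpu)
{
	if (cpu->preempt_count != 0) {
		cpu->atomic_warnings++;
		return true;
	}
	return false;
}
```

In this model, code that calls `model_schedule()` while holding a lock is flagged only when `preempt` is set; with `smp` alone the same mistake passes the check unnoticed.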




* Re: 2.6.9 rc2 freezing
  2004-09-15 16:11                 ` Jeff Garzik
@ 2004-09-15 16:58                   ` Ricky Beam
  2004-09-15 17:49                     ` Lee Revell
  0 siblings, 1 reply; 19+ messages in thread
From: Ricky Beam @ 2004-09-15 16:58 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Lee Revell, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

On Wed, 15 Sep 2004, Jeff Garzik wrote:
>Lee Revell wrote:
>> Anyway, if you are running anything on your server that breaks under
>> PREEMPT, it will break anyway as soon as you add another processor.
>
>Incorrect.  The spinlock behavior is very different.

Indeed.  Enable PREEMPT (my default for some time now) and the machine
will lock up after spewing pages of "scheduling while atomic" messages.
Disable PREEMPT and the machine is stable again:

[jfbeam:pts/2{2}]gir:~/[12:55pm]:uname -a
Linux gir 2.6.9-SMP-rc2+BK@1.1455 #71 SMP BK[20040914173940] Tue Sep 14 16:14:33 EDT 2004 i686 athlon i386 GNU/Linux
[jfbeam:pts/2{2}]gir:~/[12:55pm]:uptime
 12:55pm  up 19:54,  2 users,  load average: 0.01, 0.02, 0.00
[jfbeam:pts/2{2}]gir:~/[12:55pm]:grep ^proc /proc/cpuinfo
processor       : 0
processor       : 1

--Ricky




* Re: 2.6.9 rc2 freezing
  2004-09-15 16:06               ` Lee Revell
  2004-09-15 16:11                 ` Jeff Garzik
@ 2004-09-15 16:59                 ` Dave Jones
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Jones @ 2004-09-15 16:59 UTC (permalink / raw)
  To: Lee Revell
  Cc: Jeff Garzik, Ricky Beam, Zilvinas Valinskas, Erik Tews,
	linux-kernel mailing list

On Wed, Sep 15, 2004 at 12:06:48PM -0400, Lee Revell wrote:

 > Anyway, if you are running anything on your server that breaks under
 > PREEMPT, it will break anyway as soon as you add another processor.

Wrong. Code can be SMP safe but not preempt safe.
This is why we have get_cpu()/put_cpu(), and
preempt_disable()/preempt_enable() pairs around certain parts of code.

Anything using per-CPU data like MSRs for example needs explicit
protection against preemption.

		Dave
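
[Editor's sketch] The point about per-CPU data can be illustrated with a userspace model. It assumes a task may be moved to another CPU whenever preemption is enabled, and that a get/put pair pins it by raising a preemption counter. The `model_*` names are hypothetical; the real `get_cpu()`/`put_cpu()` and `preempt_disable()`/`preempt_enable()` are kernel interfaces and are not reproduced here.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Hypothetical task state: which CPU it runs on, and whether it is pinned. */
struct model_task {
	int cpu;
	int preempt_count; /* nonzero means preemption (and migration) is disabled */
};

/* Hypothetical per-CPU data, one slot per CPU. */
static int model_percpu_value[MODEL_NR_CPUS];

/* Models the scheduler moving a task; refused while preemption is disabled. */
static bool model_try_migrate(struct model_task *t, int new_cpu)
{
	if (t->preempt_count > 0)
		return false;
	t->cpu = new_cpu;
	return true;
}

/* Models get_cpu(): disable preemption, then report the current CPU. */
static int model_get_cpu(struct model_task *t)
{
	t->preempt_count++;
	return t->cpu;
}

/* Models put_cpu(): re-enable preemption. */
static void model_put_cpu(struct model_task *t)
{
	t->preempt_count--;
}

/* Safe update: the CPU index stays valid for the whole access. */
static void model_percpu_add(struct model_task *t, int delta)
{
	int cpu = model_get_cpu(t);

	model_percpu_value[cpu] += delta;
	model_put_cpu(t);
}
```

Reading `t->cpu` without the get/put pair leaves a window in which `model_try_migrate()` can succeed, so the saved index may refer to a CPU the task is no longer running on. An SMP-only build never opens that window in kernel code, which is why code can be SMP safe without being preempt safe.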



* Re: 2.6.9 rc2 freezing
  2004-09-15 16:58                   ` Ricky Beam
@ 2004-09-15 17:49                     ` Lee Revell
  2004-09-15 17:52                       ` Jeff Garzik
  0 siblings, 1 reply; 19+ messages in thread
From: Lee Revell @ 2004-09-15 17:49 UTC (permalink / raw)
  To: Ricky Beam
  Cc: Jeff Garzik, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

On Wed, 2004-09-15 at 12:58, Ricky Beam wrote:
> On Wed, 15 Sep 2004, Jeff Garzik wrote:
> >Lee Revell wrote:
> >> Anyway, if you are running anything on your server that breaks under
> >> PREEMPT, it will break anyway as soon as you add another processor.
> >
> >Incorrect.  The spinlock behavior is very different.
> 
> Indeed.  Enable PREEMPT (my default for some time now) and the machine
> will lockup after spewing pages of scheduling while atomic's.  Disable
> PREEMPT and the machine is stable again:
> 

Interesting.  Still, this looks like a specific bug that needs fixing,
it doesn't imply that preemption is a hack.  For many workloads
preemption is a necessity.

Lee 



* Re: 2.6.9 rc2 freezing
  2004-09-15 17:49                     ` Lee Revell
@ 2004-09-15 17:52                       ` Jeff Garzik
  2004-09-15 17:59                         ` Lee Revell
  2004-09-16  8:39                         ` Helge Hafting
  0 siblings, 2 replies; 19+ messages in thread
From: Jeff Garzik @ 2004-09-15 17:52 UTC (permalink / raw)
  To: Lee Revell
  Cc: Ricky Beam, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

Lee Revell wrote:
> Interesting.  Still, this looks like a specific bug that needs fixing,
> it doesn't imply that preemption is a hack.  For many workloads
> preemption is a necessity.


For any workload that you feel preemption is a necessity, that indicates 
a latency problem in the kernel that should be solved.

Preemption is a hack that hides broken drivers, IMHO.

I would rather directly address any latency problems that appear.

	Jeff




* Re: 2.6.9 rc2 freezing
  2004-09-15 17:52                       ` Jeff Garzik
@ 2004-09-15 17:59                         ` Lee Revell
  2004-09-16  8:39                         ` Helge Hafting
  1 sibling, 0 replies; 19+ messages in thread
From: Lee Revell @ 2004-09-15 17:59 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Ricky Beam, Zilvinas Valinskas, Erik Tews, linux-kernel mailing list

On Wed, 2004-09-15 at 13:52, Jeff Garzik wrote:
> Lee Revell wrote:
> > Interesting.  Still, this looks like a specific bug that needs fixing,
> > it doesn't imply that preemption is a hack.  For many workloads
> > preemption is a necessity.
> 
> 
> For any workload that you feel preemption is a necessity, that indicates 
> a latency problem in the kernel that should be solved.
> 
> Preemption is a hack that hides broken drivers, IMHO.
> 
> I would rather directly address any latency problems that appear.
> 

Please explain.  I was under the impression that there was a 1:1
correspondence between latency problems and long non-preemptible code
paths.  The latency problem is solved by making the code path
preemptible.

How else are you going to schedule in the high priority process quickly
if you don't preempt something?

Lee 



* Re: 2.6.9 rc2 freezing
  2004-09-15 17:52                       ` Jeff Garzik
  2004-09-15 17:59                         ` Lee Revell
@ 2004-09-16  8:39                         ` Helge Hafting
  2004-09-17  8:05                           ` Zilvinas Valinskas
  1 sibling, 1 reply; 19+ messages in thread
From: Helge Hafting @ 2004-09-16  8:39 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Lee Revell, Ricky Beam, Zilvinas Valinskas, Erik Tews,
	linux-kernel mailing list

Jeff Garzik wrote:

> Lee Revell wrote:
>
>> Interesting.  Still, this looks like a specific bug that needs fixing,
>> it doesn't imply that preemption is a hack.  For many workloads
>> preemption is a necessity.
>
>
>
> For any workload that you feel preemption is a necessity, that 
> indicates a latency problem in the kernel that should be solved.
>
> Preemption is a hack that hides broken drivers, IMHO.
>
> I would rather directly address any latency problems that appear.

Current preempt is broken, sure.  But having robust preempt
would allow code simplification.  Long loops outside critical
sections would be ok - no time or code spent testing for a need for
rescheduling because you'll be preempted when necessary anyway.

Or am I missing something?  Other than that current preempt isn't up to
this and might be hard to get there?

Helge Hafting
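
[Editor's sketch] The "testing for a need for rescheduling" mentioned above can be pictured as follows. This is a userspace model with hypothetical `model_*` names: a long loop in a non-preemptible kernel has to poll a flag and yield voluntarily, which is the extra code a robust preemptible kernel would make unnecessary. The `tick_every` parameter is an assumption of the sketch, used only to simulate a timer asking for a reschedule.

```c
#include <stdbool.h>

/* Hypothetical scheduler state for this sketch. */
struct model_sched {
	bool need_resched; /* set when something else wants the CPU */
	int yields;        /* how many times the loop gave the CPU away */
};

/* Models a voluntary reschedule point: yield only if asked to. */
static void model_cond_resched(struct model_sched *s)
{
	if (s->need_resched) {
		s->need_resched = false;
		s->yields++;
	}
}

/* A long loop with an explicit reschedule check on every iteration.
 * Every tick_every items the model pretends a timer tick requested
 * a reschedule. Returns the sum of the items. */
static long model_sum_items(struct model_sched *s, const int *items, int n,
			    int tick_every)
{
	long sum = 0;

	for (int i = 0; i < n; i++) {
		sum += items[i];
		if (tick_every > 0 && (i + 1) % tick_every == 0)
			s->need_resched = true; /* simulated timer tick */
		model_cond_resched(s);          /* the check preemption would remove */
	}
	return sum;
}
```

With working preemption the `model_cond_resched()` call could simply be dropped from the loop body, since the scheduler would interrupt the loop on its own.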




* Re: 2.6.9 rc2 freezing
  2004-09-16  8:39                         ` Helge Hafting
@ 2004-09-17  8:05                           ` Zilvinas Valinskas
  2004-09-17 13:21                             ` Zilvinas Valinskas
  0 siblings, 1 reply; 19+ messages in thread
From: Zilvinas Valinskas @ 2004-09-17  8:05 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Jeff Garzik, Lee Revell, Ricky Beam, Erik Tews,
	linux-kernel mailing list

On Thu, Sep 16, 2004 at 10:39:10AM +0200, Helge Hafting wrote:
> Jeff Garzik wrote:
> 
> >Lee Revell wrote:
> >
> >>Interesting.  Still, this looks like a specific bug that needs fixing,
> >>it doesn't imply that preemption is a hack.  For many workloads
> >>preemption is a necessity.
> >
> >
> >
> >For any workload that you feel preemption is a necessity, that 
> >indicates a latency problem in the kernel that should be solved.
> >
> >Preemption is a hack that hides broken drivers, IMHO.
> >
> >I would rather directly address any latency problems that appear.
> 
> Current preempt is broken, sure.  But having robust preempt
> would allow code simplification.  Long loops outside critical
> sections would be ok - no time or code spent testing for a need for
> rescheduling because you'll be preempted when necessary anyway.

Could be the case. This morning I turned off PREEMPT support in the
linux 2.6.9-rc2 kernel; it booted just fine, I ran apt-get update ... and
everything seemed OK. 

Then I set up IPsec policies and pinged the remote end; racoon tried to
negotiate with the remote end and ... the laptop froze again (this time
without PREEMPT).

At the time I was in X and couldn't capture the OOPS; after reboot
/var/log/kern.log is empty ... :(

So it doesn't seem to be PREEMPT related, I now think.
> 
> Or am I missing something?  Other than that current preempt isn't up to
> this and might be hard to get there?
> 
> Helge Hafting
> 
> 

* Re: 2.6.9 rc2 freezing
  2004-09-17  8:05                           ` Zilvinas Valinskas
@ 2004-09-17 13:21                             ` Zilvinas Valinskas
  0 siblings, 0 replies; 19+ messages in thread
From: Zilvinas Valinskas @ 2004-09-17 13:21 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Jeff Garzik, Lee Revell, Ricky Beam, Erik Tews,
	linux-kernel mailing list

On Fri, Sep 17, 2004 at 11:05:04AM +0300, Zilvinas Valinskas wrote:
> On Thu, Sep 16, 2004 at 10:39:10AM +0200, Helge Hafting wrote:
> > Jeff Garzik wrote:
> > 
> > >Lee Revell wrote:
> > >
> > >>Interesting.  Still, this looks like a specific bug that needs fixing,
> > >>it doesn't imply that preemption is a hack.  For many workloads
> > >>preemption is a necessity.
> > >
> > >
> > >
> > >For any workload that you feel preemption is a necessity, that 
> > >indicates a latency problem in the kernel that should be solved.
> > >
> > >Preemption is a hack that hides broken drivers, IMHO.
> > >
> > >I would rather directly address any latency problems that appear.
> > 
> > Current preempt is broken, sure.  But having robust preempt
> > would allow code simplification.  Long loops outside critical
> > sections would be ok - no time or code spent testing for a need for
> > rescheduling because you'll be preempted when necessary anyway.
> 
> Could be the case. This morning I've turned off PREEMPT support in
> linux 2.6.9-rc2 kernel, booted just fine, ran apt-get update ... it
> seemed everything is ok. 
> 
> Then setup IPsec policies, ping remote end, racoon has tried to negotiate 
> with a remote end and ... laptop freezes again (this time without
> PREEMPT).
> 
> At a time I was in X, couldn't capture the OOPS, after reboot
> /var/log/kern.log is empty ... :(

Here is backtrace (with PREEMPT turned off) :

bad: scheduling while atomic!
[<c030cd3e>] schedule+0x446/0x44b
[<c010595b>] do_IRQ+0xdd/0x14b
[<c0103d36>] work_resched+0x5/0x16

This backtrace is repeated 4 times.

bad: scheduling while atomic!
[<c030cd3e>] schedule+0x446/0x44b
[<c0112f82>] sys_sched_yield+0x45/0x57
[<c014ceaa>] coredump_wait+0x32/0x97
[<c014cfd7>] do_coredump+0xc8/0x189
[<c0256b44>] complement_pos+0x1e/0x16e
[<c011cb13>] __dequeue_signal+0xc2/0x154
[<c011cbc8>] dequeue_signal+0x23/0x75
[<c011e12e>] get_signal_to_deliver+0x1d4/0x2c0
[<c0103b04>] do_signal+0x8e/0x10d
[<c010595b>] do_IRQ+0xdd/0x14b
[<c0103e7c>] common_interrupt+0x18/0x20
[<c030cb76>] schedule+0x27e/0x44b
[<c010595b>] do_IRQ+0xdd/0x14b
[<c0110f98>] do_page_fault+0x0/0x544
[<c0103bb8>] do_notify_resume+0x35/0x39
[<c0103d5a>] work_notifysig+0x13/0x15
Kernel panic - not syncing: Aiee, killing interrupt handler!

Any ideas ?


> 
> Doesn't seem it is PREEMPT related I think now.
> > 
> > Or am I missing something?  Other than that current preempt isn't up to
> > this and might be hard to get there?
> > 
> > Helge Hafting
> > 
> > 

Thread overview: 19+ messages
2004-09-13 16:55 2.6.9 rc2 freezing Zilvinas Valinskas
2004-09-13 17:12 ` Jeff Garzik
2004-09-13 17:16   ` Zilvinas Valinskas
2004-09-15  8:25     ` Erik Tews
2004-09-15  9:58       ` Zilvinas Valinskas
2004-09-15 14:55         ` Ricky Beam
2004-09-15 15:48           ` Lee Revell
2004-09-15 15:58             ` Jeff Garzik
2004-09-15 16:06               ` Lee Revell
2004-09-15 16:11                 ` Jeff Garzik
2004-09-15 16:58                   ` Ricky Beam
2004-09-15 17:49                     ` Lee Revell
2004-09-15 17:52                       ` Jeff Garzik
2004-09-15 17:59                         ` Lee Revell
2004-09-16  8:39                         ` Helge Hafting
2004-09-17  8:05                           ` Zilvinas Valinskas
2004-09-17 13:21                             ` Zilvinas Valinskas
2004-09-15 16:59                 ` Dave Jones
2004-09-13 17:19   ` Zilvinas Valinskas
