* Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
@ 2005-02-15 14:56 Ralf Hildebrandt
2005-02-16 15:33 ` Jan Kara
0 siblings, 1 reply; 10+ messages in thread
From: Ralf Hildebrandt @ 2005-02-15 14:56 UTC (permalink / raw)
To: linux-kernel
Today our mailserver froze after just one day of uptime. I was able to
capture the Oops on the screen using my digital camera:
http://www.stahl.bau.tu-bs.de/~hildeb/bugreport/
Keywords: EIP is at journal_commit_transaction, process kjournald
# mount
/dev/cciss/c0d0p6 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/cciss/c0d0p5 on /boot type ext3 (rw)
/dev/shm on /var/amavis type tmpfs (rw,noatime,size=200m,mode=770,uid=104,gid=108)
--
Ralf Hildebrandt (i.A. des IT-Zentrum) Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF send no mail to spamtrap@charite.de
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-15 14:56 Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction) Ralf Hildebrandt
@ 2005-02-16 15:33 ` Jan Kara
2005-02-16 20:04 ` Ralf Hildebrandt
0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2005-02-16 15:33 UTC (permalink / raw)
To: linux-kernel
Hello,
> Today our mailserver froze after just one day of uptime. I was able to
> capture the Oops on the screen using my digital camera:
>
> http://www.stahl.bau.tu-bs.de/~hildeb/bugreport/
>
> Keywords: EIP is at journal_commit_transaction, process kjournald
I guess the system is SMP... Sadly, a few lines at the beginning of the
report are missing (probably scrolled off the screen), but it looks
similar to several other oopses I've seen reported recently. Is this
the first time you hit this bug?
> # mount
> /dev/cciss/c0d0p6 on / type ext3 (rw,errors=remount-ro)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> tmpfs on /dev/shm type tmpfs (rw)
> /dev/cciss/c0d0p5 on /boot type ext3 (rw)
> /dev/shm on /var/amavis type tmpfs (rw,noatime,size=200m,mode=770,uid=104,gid=108)
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-16 15:33 ` Jan Kara
@ 2005-02-16 20:04 ` Ralf Hildebrandt
2005-02-16 21:54 ` Dale Blount
0 siblings, 1 reply; 10+ messages in thread
From: Ralf Hildebrandt @ 2005-02-16 20:04 UTC (permalink / raw)
To: linux-kernel
* Jan Kara <jack@suse.cz>:
> I guess the system is SMP...
Indeed it is. Dual Xeon with SMP.
> Sadly a few lines in the beginning of the
> report are missing (probably scrolled off the screen)
Yes, this sucks. I rebooted with vesafb active, so now I do have 50 lines :)
> but it seems similar to several other oopses I've seen reported
> recently. Is this the first time you hit this bug?
It's actually the second time. The first time it hit the SAME box but
with kernel-2.6.10 (vanilla) after 30 days of uptime. Nobody had a
camera at hand, so I couldn't take a photo.
Any suggestions? I'm open to anything. One difference between 2.6.10
and 2.6.10-ac12 was that 2.6.10 had no in-kernel IRQ balancing, while
in 2.6.10-ac12 I activated it.
--
Ralf Hildebrandt (i.A. des IT-Zentrum) Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF send no mail to spamtrap@charite.de
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-16 20:04 ` Ralf Hildebrandt
@ 2005-02-16 21:54 ` Dale Blount
2005-02-16 22:00 ` Ralf Hildebrandt
2005-02-16 22:55 ` Andrew Morton
0 siblings, 2 replies; 10+ messages in thread
From: Dale Blount @ 2005-02-16 21:54 UTC (permalink / raw)
To: Ralf Hildebrandt; +Cc: linux-kernel
On Wed, 2005-02-16 at 21:04 +0100, Ralf Hildebrandt wrote:
> * Jan Kara <jack@suse.cz>:
>
> > I guess the system is SMP...
>
> Indeed it is. Dual Xeon with SMP.
>
This looks very similar (at least to me) to an OOPS I posted with 2.6.9
on 12/03/2004.
http://marc.theaimsgroup.com/?l=linux-kernel&m=110210705504716&w=2
My system is also a dual Xeon using SMP and Hyperthreading
(/proc/cpuinfo shows 4 cpus).
Mine, like Ralf's, is also a mail server running postfix using ext3 for
the spool directory.
> > but it seems similar to several other oopses I've seen reported
> > recently. Is this the first time you hit this bug?
>
> It's actually the second time. The first time it hit the SAME box but
> with kernel-2.6.10 (vanilla) after 30 days of uptime. Nobody had a
> camera at hand, so I couldn't take a photo.
>
I've actually hit this bug (assuming it's the same) with 2.6.10 also. I
had to power cycle remotely and unfortunately didn't have the serial
console logging enabled when it happened with 2.6.10. I upgraded from
2.4.23 to 2.6.8.1 and crashed within a week, and continued to crash at
least monthly after that. It had been running 2.4.23 for 200+ days with
no problems.
Hope this helps trace it back.
Dale
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-16 21:54 ` Dale Blount
@ 2005-02-16 22:00 ` Ralf Hildebrandt
2005-02-16 22:55 ` Andrew Morton
1 sibling, 0 replies; 10+ messages in thread
From: Ralf Hildebrandt @ 2005-02-16 22:00 UTC (permalink / raw)
To: linux-kernel
* Dale Blount <linux-kernel@dale.us>:
> This looks very similar (at least to me) to an OOPS I posted with 2.6.9
> on 12/03/2004.
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110210705504716&w=2
Could be.
> My system is also a dual Xeon using SMP and Hyperthreading
> (/proc/cpuinfo shows 4 cpus).
Same system here.
> Mine, like Ralf's, is also a mail server running postfix using ext3 for
> the spool directory.
Same here.
> I've actually hit this bug (assuming it's the same) with 2.6.10 also. I
> had to power cycle remotely and unfortunately didn't have the serial
> console logging enabled when it happened with 2.6.10. I upgraded from
> 2.4.23 to 2.6.8.1 and crashed within a week, and continued to crash at
> least monthly after that. It had been running 2.4.23 for 200+ days with
> no problems.
>
> Hope this helps trace it back.
Me too
--
Ralf Hildebrandt (i.A. des IT-Zentrum) Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF send no mail to spamtrap@charite.de
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-16 21:54 ` Dale Blount
2005-02-16 22:00 ` Ralf Hildebrandt
@ 2005-02-16 22:55 ` Andrew Morton
2005-02-17 10:58 ` Ralf Hildebrandt
1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2005-02-16 22:55 UTC (permalink / raw)
To: Dale Blount; +Cc: Ralf.Hildebrandt, linux-kernel
Dale Blount <linux-kernel@dale.us> wrote:
>
> This looks very similar (at least to me) to an OOPS I posted with 2.6.9
> on 12/03/2004.
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110210705504716&w=2
There have been a handful of reports - there's surely a race in there.
Unfortunately I've yet to see a report from which we can identify the
offending line in the very large journal_commit_transaction() function.
The best way to do that is to ensure that the kernel was built with
CONFIG_DEBUG_INFO, note the offending EIP value, then do
# gdb vmlinux
(gdb) l *0xc0<whatever>
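[Editor's note] The two steps above (note the EIP, then ask gdb for the matching source line) can be scripted. This is only a minimal sketch: the function names `eip_from_oops` and `resolve_eip` are made up for illustration, the "EIP:" line format is assumed to be the bracketed `[<address>]` form printed by i386 kernels of this era, and `vmlinux` is assumed to be the unstripped image from a tree built with CONFIG_DEBUG_INFO.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any kernel tooling.

# Read Oops text on stdin and print the first faulting address found on a
# line that starts with "EIP:" and contains "[<8 hex digits>]".
eip_from_oops() {
    sed -n 's/^EIP:.*\[<\([0-9a-fA-F]\{8\}\)>].*/0x\1/p' | head -n 1
}

# Ask gdb (non-interactively) for the source lines around an address.
#   $1 = path to the unstripped vmlinux, $2 = address such as 0xc0123456
resolve_eip() {
    gdb -batch -ex "list *$2" "$1"
}
```

Typical use would be `resolve_eip ./vmlinux "$(eip_from_oops < oops.txt)"`, where `oops.txt` is a hand-typed or serial-console copy of the Oops. The address only makes sense against the exact vmlinux that was running when the Oops occurred.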
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-16 22:55 ` Andrew Morton
@ 2005-02-17 10:58 ` Ralf Hildebrandt
2005-02-17 13:21 ` Ralf Hildebrandt
0 siblings, 1 reply; 10+ messages in thread
From: Ralf Hildebrandt @ 2005-02-17 10:58 UTC (permalink / raw)
To: Dale Blount, linux-kernel
* Andrew Morton <akpm@osdl.org>:
> There have been a handful of reports - there's surely a race in there.
>
> Unfortunately I've yet to see a report from which we can identify the
> offending line in the very large journal_commit_transaction() function.
:(
>
> The best way to do that is to ensure that the kernel was built with
> CONFIG_DEBUG_INFO, note the offending EIP value, then do
>
> # gdb vmlinux
> (gdb) l *0xc0<whatever>
I'm rebuilding the ac12 kernel which crashed on me after just one day
and will reboot it today.
--
Ralf Hildebrandt (i.A. des IT-Zentrum) Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF send no mail to spamtrap@charite.de
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-17 10:58 ` Ralf Hildebrandt
@ 2005-02-17 13:21 ` Ralf Hildebrandt
2005-02-17 15:51 ` Randy.Dunlap
0 siblings, 1 reply; 10+ messages in thread
From: Ralf Hildebrandt @ 2005-02-17 13:21 UTC (permalink / raw)
To: Dale Blount, linux-kernel
* Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>:
> > The best way to do that is to ensure that the kernel was built with
> > CONFIG_DEBUG_INFO, note the offending EIP value, then do
> >
> > # gdb vmlinux
> > (gdb) l *0xc0<whatever>
>
> I'm rebuilding the ac12 kernel which crashed on me after just one day
> and will reboot it today.
Is it normal that the kernel with debugging enabled is not larger than
the normal kernel?
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-17 13:21 ` Ralf Hildebrandt
@ 2005-02-17 15:51 ` Randy.Dunlap
2005-02-17 16:00 ` Ralf Hildebrandt
0 siblings, 1 reply; 10+ messages in thread
From: Randy.Dunlap @ 2005-02-17 15:51 UTC (permalink / raw)
To: Ralf Hildebrandt; +Cc: Dale Blount, linux-kernel
Ralf Hildebrandt wrote:
> * Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>:
>
>
>>>The best way to do that is to ensure that the kernel was built with
>>>CONFIG_DEBUG_INFO, note the offending EIP value, then do
>>>
>>># gdb vmlinux
>>>(gdb) l *0xc0<whatever>
>>
>>I'm rebuilding the ac12 kernel which crashed on me after just one day
>>and will reboot it today.
>
>
> Is it normal that the kernel with debugging enabled is not larger than
> the normal kernel?
No, it should be much larger. Recheck the .config file
for CONFIG_DEBUG_INFO=y. Maybe you need to do 'make clean'
first.
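[Editor's note] A rough way to run the check suggested above is sketched below. The function names are hypothetical. It assumes the debug information ends up in the uncompressed `vmlinux` at the top of the build tree; the compressed boot image is normally stripped, which may be why an installed kernel image does not grow even when the option is set.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether a build carries debug info.

# Succeeds if the given .config file enables CONFIG_DEBUG_INFO.
#   $1 = path to the .config file
config_has_debug_info() {
    grep -q '^CONFIG_DEBUG_INFO=y$' "$1"
}

# Succeeds if the given ELF file contains a .debug_info section.
#   $1 = path to the uncompressed vmlinux in the build tree
vmlinux_has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}
```

For example, `config_has_debug_info .config && vmlinux_has_debug_info vmlinux && echo "debug info present"` run from the top of the build tree. If the first check passes but the second fails, a stale build is a likely cause, which is where the 'make clean' suggestion comes in.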
--
~Randy
* Re: Oops in 2.6.10-ac12 in kjournald (journal_commit_transaction)
2005-02-17 15:51 ` Randy.Dunlap
@ 2005-02-17 16:00 ` Ralf Hildebrandt
0 siblings, 0 replies; 10+ messages in thread
From: Ralf Hildebrandt @ 2005-02-17 16:00 UTC (permalink / raw)
To: Dale Blount, linux-kernel
* Randy.Dunlap <rddunlap@osdl.org>:
> >Is it normal that the kernel with debugging enabled is not larger than
> >the normal kernel?
>
> No, it should be much larger. Recheck the .config file
> for CONFIG_DEBUG_INFO=y. Maybe you need to do 'make clean'
> first.
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_INFO=y
# CONFIG_FRAME_POINTER is not set
CONFIG_EARLY_PRINTK=y
I built that using "make-kpkg"
make-kpkg clean
CONCURRENCY_LEVEL=4 MAKEFLAGS="CC=gcc-3.4" make-kpkg --revision=20050217 kernel_image
--
Ralf Hildebrandt (i.A. des IT-Zentrum) Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF send no mail to spamtrap@charite.de