linux-kernel.vger.kernel.org archive mirror
* Re: dbench has intermittent hang on 2.6.0-test1-ac2
@ 2003-07-30 11:46 rwhron
From: rwhron @ 2003-07-30 11:46 UTC (permalink / raw)
  To: linux-kernel

Summary:
dbench has been intermittently failing to complete on uniprocessor.
I run dbench 10 times.  In 1 of the 10 runs, one dbench child
never gets started.  That child is in sys_pause.
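
A repeated-run harness along these lines can count runs that fail to
finish.  This is a minimal sketch, not the script used for the results
reported here.  The function name run_repeated and the per-run time
limit are assumptions, and it relies on timeout(1) from GNU coreutils,
which exits with status 124 when the limit expires.

```shell
# Run a command several times, each under a time limit, and count the
# runs that hit the limit.
# Usage: run_repeated RUNS LIMIT_SECONDS COMMAND [ARGS...]
run_repeated() {
    runs=$1
    limit=$2
    shift 2
    hung=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # timeout(1) exits with status 124 when the limit expires
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            hung=$((hung + 1))
            echo "run $i: timed out after ${limit}s"
        fi
        i=$((i + 1))
    done
    echo "$hung of $runs runs timed out"
}
```

For example, "run_repeated 10 600 ./dbench 32" would do ten runs with a
hypothetical 10 minute limit each.  timeout sends SIGTERM when the
limit expires, so a hung run is not left in place for sysrq-t; this
only counts hangs.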

2.6.0-test1 and 2.6.0-test1-mm2 did not hang.

2.6.0-test1-ac1, 2.6.0-test1-ac2, 2.6.0-test2, and
2.6.0-test2-mm1 have hung.  So the cause seems to be a patch
that Alan may have picked up first.

The hang has occurred on ext2, ext3, reiserfs, and xfs,
so filesystem type seems unrelated.

pkill -9 dbench will let the processes continue.

dbench version 2.0.

<sysrq-t> from 2.6.0-test2-mm1 before pkill is at:
http://home.earthlink.net/~rwhron/kernel/minicom.cap

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


* Re: dbench has intermittent hang on 2.6.0-test1-ac2
@ 2003-07-28 23:13 rwhron
From: rwhron @ 2003-07-28 23:13 UTC (permalink / raw)
  To: linux-kernel

dbench 32 hung on 2.6.0-test2.  /proc/PID/wchan shows the
dbench child in sys_pause; /proc/PPID/wchan shows the
parent dbench in sys_wait4.
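
The two wchan reads can be done in one step.  A minimal sketch,
assuming the parent PID is taken from the PPid line of
/proc/PID/status; the function name wchan_pair and the optional
PROC_ROOT argument are hypothetical.

```shell
# Print the kernel wait channel of a process and of its parent.
# Usage: wchan_pair PID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
wchan_pair() {
    pid=$1
    root=${2:-/proc}
    # The parent PID is on the PPid: line of /proc/PID/status
    ppid=$(awk '/^PPid:/ { print $2 }' "$root/$pid/status")
    for p in "$pid" "$ppid"; do
        # wchan has no trailing newline, so printf supplies one
        printf '%s %s\n' "$p" "$(cat "$root/$p/wchan" 2>/dev/null)"
    done
}
```

Called with the PID of the stuck child, this prints one "PID wchan"
line for the child and one for its parent.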

kill -CONT on the two dbench PIDs changes the child's
wchan to __pdflush, but the processes appear neither to
continue nor to exit.  After waiting a couple of
minutes, I ran kill on both PIDs and dbench exited.

This was on an ext2 filesystem.  The previous hangs were
on ext3 and reiserfs.

sysrq t after "kill -CONT" is at:
http://home.earthlink.net/~rwhron/kernel/sysrq.txt

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


* Re: dbench has intermittent hang on 2.6.0-test1-ac2
@ 2003-07-25  3:54 rwhron
From: rwhron @ 2003-07-25  3:54 UTC (permalink / raw)
  To: linux-kernel

> dbench 64 hung during a run using 2.6.0-test1-ac2 on ext3.

> I saw the same behavior on a dbench 32 run with 2.6.0-test1-ac1
> on reiserfs.

dbench did not hang in 50 runs on 2.6.0-test1 or 50 runs
on 2.6.0-test1-mm2 on the same machine with various filesystems.
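
As a rough check on how much 50 clean runs say: the original report
saw 2 hangs in 28 runs on the -ac kernels.  If each run hung
independently with that same probability (an assumption; the real
failure may not be independent between runs), the chance of 50 clean
runs in a row can be estimated as below.  The function name
clean_run_chance is hypothetical.

```shell
# Chance of zero hangs in N runs, if each run hangs independently with
# probability HANGS / OBSERVED_RUNS.
# Usage: clean_run_chance HANGS OBSERVED_RUNS N
clean_run_chance() {
    awk -v h="$1" -v t="$2" -v n="$3" 'BEGIN {
        p = h / t                              # estimated per-run hang rate
        printf "%.3f\n", exp(n * log(1 - p))   # (1 - p)^n
    }'
}
```

"clean_run_chance 2 28 50" gives about 0.025, so 50 clean runs would be
unlikely on a kernel that hung at the rate seen on the -ac kernels.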

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


* dbench has intermittent hang on 2.6.0-test1-ac2
@ 2003-07-25  3:43 rwhron
From: rwhron @ 2003-07-25  3:43 UTC (permalink / raw)
  To: linux-kernel

dbench 64 hung during a run using 2.6.0-test1-ac2 on ext3. One
of the dbench processes never created the clients/clientsXX
directory.

The parent dbench-2.0 process continues to update the throughput
measurement and the MB/sec slowly drops.

I saw the same behavior on a dbench 32 run with 2.6.0-test1-ac1
on reiserfs.

ps -ef|grep dbenc[h]
root     12266 11460  0 21:24 pts/0    00:00:00 ./dbench 64
root     12320 12266  0 21:24 pts/0    00:00:00 ./dbench 64

It isn't highly reproducible.  Of 28 different dbench runs
on 2.6.0-test1-ac[12], only 2 have done this.

Uniprocessor x86 running RedHat 7.3 + patches.

Sysrq T for the dbench processes shows:

dbench        R C010F024 4089854812 12266  11460 12320               (NOTLB)
c909df60 00000082 bffffa58 c010f024 d5439360 d5439360 00000001 fffffe00
       00000000 c0118dda c909c000 00000001 00000000 d5439360 c0113e10 00000000
       00000000 c909dfc4 c010836b c909dfc4 00000000 d5439360 c0113e10 d54394b4
Call Trace:
 [<c010f024>] restore_i387+0x54/0x80
 [<c0118dda>] sys_wait4+0x1ea/0x220
 [<c0113e10>] default_wake_function+0x0/0x20
 [<c010836b>] sys_sigreturn+0x8b/0xc0
 [<c0113e10>] default_wake_function+0x0/0x20
 [<c0108e27>] syscall_call+0x7/0xb

dbench        S C46F3FC4 4028283312 12320  12266                     (NOTLB)
c46f3fb8 00000086 00000000 c46f3fc4 d4546060 0000000b 00000774 00000040
       c46f2000 c0120f14 c0108e27 0000000b 00000000 40013000 00000774 00000040
       bffffb28 0000001d 0000007b 0000007b 0000001d 400c6837 00000073 00000246
Call Trace:
 [<c0120f14>] sys_pause+0x14/0x20
 [<c0108e27>] syscall_call+0x7/0xb
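
To pull only the stanzas for one command out of a full sysrq-t capture,
a filter like the following may help.  It is a sketch that assumes the
layout shown in the excerpt above: each stanza starts with an
unindented "command state-letter ..." line and ends at a blank line or
at the next task line.  The function name sysrq_tasks is hypothetical.

```shell
# Print the sysrq-t stanzas for tasks with the given command name.
# Reads the capture on stdin.
# Usage: sysrq_tasks COMMAND < capture.txt
sysrq_tasks() {
    awk -v name="$1" '
        # Task header: unindented line whose second field is a state letter
        /^[^ ]/ && $2 ~ /^[RSDTZ]$/ { show = ($1 == name) }
        # A blank line ends the current stanza
        NF == 0 { show = 0 }
        show
    '
}
```

For example, "sysrq_tasks dbench < sysrq-t.txt" would print only the
dbench entries from a saved capture.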


strace -p 12320		# child
pause(


kill 12320
kill -INT 12320

The state changes from S to T.

ps axu|grep dbenc[h]
root     12266  0.0  0.1  1364  424 pts/0    S    21:24   0:00 ./dbench 64
root     12320  0.0  0.0  1360  356 pts/0    T    21:24   0:00 ./dbench 64

cat /proc/12320/wchan
finish_stop

kill -CONT 12320	# child and parent exit
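
The state letter can also be read straight from /proc between signals.
A minimal sketch, assuming the State: line of /proc/PID/status; the
function name proc_state and the optional PROC_ROOT argument are
hypothetical.

```shell
# Print the one-letter state (R, S, T, ...) of a process.
# Usage: proc_state PID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
proc_state() {
    # The line looks like "State:  T (stopped)"; field 2 is the letter
    awk '/^State:/ { print $2 }' "${2:-/proc}/$1/status"
}
```

Running "proc_state 12320" before and after each kill in the sequence
above would show the S to T change without the ps and grep step.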

All of <sysrq t> output at:
http://home.earthlink.net/~rwhron/kernel/sysrq-t.txt

config at
http://home.earthlink.net/~rwhron/kernel/config/config-2.6.0-test1-ac2

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


