All of lore.kernel.org
 help / color / mirror / Atom feed
* hung task in iozone test on nfs client
@ 2011-02-04 21:36 Jim Rees
  2011-02-05  0:24 ` Trond Myklebust
  2011-02-10 14:54 ` Jim Rees
  0 siblings, 2 replies; 4+ messages in thread
From: Jim Rees @ 2011-02-04 21:36 UTC (permalink / raw)
  To: linux-nfs; +Cc: peter honeyman

I have a report here of iozone hanging when run on nfs4 client against an
EMC server.  We have reproduced this problem with a wide range of client
kernel versions, from 2.6.33.3-85.fc13.x86_64 up to
2.6.38-0.rc3.git2.1.pnfs_wave3_20110203.fc15.x86_64, and on both 4.0 and
4.1.  It seems to happen only with heavy multi-threaded iozone testing with
big files.  The iozone is something like this:

iozone -r 2m -s 256m -w -W -c -t 12 -i 0 -o

The call trace is usually something like this:

[<ffffffff810c1314>] ? sync_page+0x0/0x45
[<ffffffff814297bc>] io_schedule+0x6e/0xb0
[<ffffffff810c1355>] sync_page+0x41/0x45
[<ffffffff81429cf8>] __wait_on_bit+0x43/0x76
[<ffffffff810c14ae>] wait_on_page_bit+0x6d/0x74
[<ffffffff8106484b>] ? wake_bit_function+0x0/0x2e
[<ffffffff810c94c0>] ? pagevec_lookup_tag+0x20/0x29
[<ffffffff810c1751>] filemap_fdatawait_range+0x9f/0x173
[<ffffffff810c18ce>] filemap_write_and_wait_range+0x3e/0x51
[<ffffffff8111fa53>] vfs_fsync_range+0x5a/0xad
[<ffffffff8111faf9>] generic_write_sync+0x53/0x55
[<ffffffff810c1d4b>] generic_file_aio_write+0x86/0xa2
[<ffffffffa0321bf8>] nfs_file_write+0xed/0x169 [nfs]
[<ffffffff811017c5>] do_sync_write+0xbf/0xfc
[<ffffffff810f4dc9>] ? __slab_free+0x28/0x22e
[<ffffffff81204b7d>] ? might_fault+0x1c/0x1e
[<ffffffff811be22b>] ? security_file_permission+0x11/0x13
[<ffffffff81101d23>] vfs_write+0xa9/0x106
[<ffffffff81101e36>] sys_write+0x45/0x69
[<ffffffff81009b02>] system_call_fastpath+0x16/0x1b

I have a pcap file here but it's 8GB.  I am trying to distill it to the
important parts.

Those of you who are familiar with the page cache, is there any obvious
deadlock here that jumps out at you?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: hung task in iozone test on nfs client
  2011-02-04 21:36 hung task in iozone test on nfs client Jim Rees
@ 2011-02-05  0:24 ` Trond Myklebust
  2011-02-07 20:41   ` Jim Rees
  2011-02-10 14:54 ` Jim Rees
  1 sibling, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2011-02-05  0:24 UTC (permalink / raw)
  To: Jim Rees; +Cc: linux-nfs, peter honeyman

On Fri, 2011-02-04 at 16:36 -0500, Jim Rees wrote: 
> I have a report here of iozone hanging when run on nfs4 client against an
> EMC server.  We have reproduced this problem with a wide range of client
> kernel versions, from 2.6.33.3-85.fc13.x86_64 up to
> 2.6.38-0.rc3.git2.1.pnfs_wave3_20110203.fc15.x86_64, and on both 4.0 and
> 4.1.  It seems to happen only with heavy multi-threaded iozone testing with
> big files.  The iozone is something like this:
> 
> iozone -r 2m -s 256m -w -W -c -t 12 -i 0 -o
> 
> The call trace is usually something like this:
> 
> [<ffffffff810c1314>] ? sync_page+0x0/0x45
> [<ffffffff814297bc>] io_schedule+0x6e/0xb0
> [<ffffffff810c1355>] sync_page+0x41/0x45
> [<ffffffff81429cf8>] __wait_on_bit+0x43/0x76
> [<ffffffff810c14ae>] wait_on_page_bit+0x6d/0x74
> [<ffffffff8106484b>] ? wake_bit_function+0x0/0x2e
> [<ffffffff810c94c0>] ? pagevec_lookup_tag+0x20/0x29
> [<ffffffff810c1751>] filemap_fdatawait_range+0x9f/0x173
> [<ffffffff810c18ce>] filemap_write_and_wait_range+0x3e/0x51
> [<ffffffff8111fa53>] vfs_fsync_range+0x5a/0xad
> [<ffffffff8111faf9>] generic_write_sync+0x53/0x55
> [<ffffffff810c1d4b>] generic_file_aio_write+0x86/0xa2
> [<ffffffffa0321bf8>] nfs_file_write+0xed/0x169 [nfs]
> [<ffffffff811017c5>] do_sync_write+0xbf/0xfc
> [<ffffffff810f4dc9>] ? __slab_free+0x28/0x22e
> [<ffffffff81204b7d>] ? might_fault+0x1c/0x1e
> [<ffffffff811be22b>] ? security_file_permission+0x11/0x13
> [<ffffffff81101d23>] vfs_write+0xa9/0x106
> [<ffffffff81101e36>] sys_write+0x45/0x69
> [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
> 
> I have a pcap file here but it's 8GB.  I am trying to distill it to the
> important parts.
> 
> Those of you who are familiar with the page cache, is there any obvious
> deadlock here that jumps out at you?

The above just tells you that something is waiting for the PG_writeback
lock (IOW: it is waiting for a writeback of the page to the server to
complete). It doesn't actually tell you why that page writeback is
failing to complete.

Can you send us the output of 'dmesg' after you do

   echo 0 >/proc/sys/sunrpc/rpc_debug

as root? The 'echo' command needs to be done during the hang.

Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: hung task in iozone test on nfs client
  2011-02-05  0:24 ` Trond Myklebust
@ 2011-02-07 20:41   ` Jim Rees
  0 siblings, 0 replies; 4+ messages in thread
From: Jim Rees @ 2011-02-07 20:41 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs, peter honeyman

[-- Attachment #1: Type: text/plain, Size: 2303 bytes --]

Trond Myklebust wrote:

  On Fri, 2011-02-04 at 16:36 -0500, Jim Rees wrote: 
  > I have a report here of iozone hanging when run on nfs4 client against an
  > EMC server.  We have reproduced this problem with a wide range of client
  > kernel versions, from 2.6.33.3-85.fc13.x86_64 up to
  > 2.6.38-0.rc3.git2.1.pnfs_wave3_20110203.fc15.x86_64, and on both 4.0 and
  > 4.1.  It seems to happen only with heavy multi-threaded iozone testing with
  > big files.  The iozone is something like this:
  > 
  > iozone -r 2m -s 256m -w -W -c -t 12 -i 0 -o
  > 
  > The call trace is usually something like this:
  > 
  > [<ffffffff810c1314>] ? sync_page+0x0/0x45
  > [<ffffffff814297bc>] io_schedule+0x6e/0xb0
  > [<ffffffff810c1355>] sync_page+0x41/0x45
  > [<ffffffff81429cf8>] __wait_on_bit+0x43/0x76
  > [<ffffffff810c14ae>] wait_on_page_bit+0x6d/0x74
  > [<ffffffff8106484b>] ? wake_bit_function+0x0/0x2e
  > [<ffffffff810c94c0>] ? pagevec_lookup_tag+0x20/0x29
  > [<ffffffff810c1751>] filemap_fdatawait_range+0x9f/0x173
  > [<ffffffff810c18ce>] filemap_write_and_wait_range+0x3e/0x51
  > [<ffffffff8111fa53>] vfs_fsync_range+0x5a/0xad
  > [<ffffffff8111faf9>] generic_write_sync+0x53/0x55
  > [<ffffffff810c1d4b>] generic_file_aio_write+0x86/0xa2
  > [<ffffffffa0321bf8>] nfs_file_write+0xed/0x169 [nfs]
  > [<ffffffff811017c5>] do_sync_write+0xbf/0xfc
  > [<ffffffff810f4dc9>] ? __slab_free+0x28/0x22e
  > [<ffffffff81204b7d>] ? might_fault+0x1c/0x1e
  > [<ffffffff811be22b>] ? security_file_permission+0x11/0x13
  > [<ffffffff81101d23>] vfs_write+0xa9/0x106
  > [<ffffffff81101e36>] sys_write+0x45/0x69
  > [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
  > 
  > I have a pcap file here but it's 8GB.  I am trying to distill it to the
  > important parts.
  > 
  > Those of you who are familiar with the page cache, is there any obvious
  > deadlock here that jumps out at you?
  
  The above just tells you that something is waiting for the PG_writeback
  lock (IOW: it is waiting for a writeback of the page to the server to
  complete). It doesn't actually tell you why that page writeback is
  failing to complete.
  
  Can you send us the output of 'dmesg' after you do
  
     echo 0 >/proc/sys/sunrpc/rpc_debug
  
  as root? The 'echo' command needs to be done during the hang.

Attached.

[-- Attachment #2: hang-rpcdebug-dmesg.txt --]
[-- Type: text/plain, Size: 9742 bytes --]

r38-13-78: [ 1999.083961] -pid- flgs status -client- --rqstp- -timeout ---ops--
r38-13-78: [ 1999.083987] 34411 0001      0 ffff880247b2e800 ffff8802550e6930    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084004] 34412 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084021] 34413 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084036] 34414 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084050] 34415 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084064] 34416 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084078] 34417 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084092] 34418 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084105] 34419 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084119] 34420 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084137] 34421 0001      0 ffff880247b2e800 ffff8802550e6dc8    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084153] 34422 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084167] 34423 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084181] 34424 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084195] 34425 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084209] 34426 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084223] 34427 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084236] 34428 0001      0 ffff880247b2e800 ffff8802550e7260    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084250] 34429 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084262] 34430 0001      0 ffff880247b2e800 ffff8802550e6498    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084276] 34431 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084288] 34432 0001      0 ffff880247b2e800 ffff8802550e6620    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084301] 34433 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084316] 34434 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084330] 34435 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084344] 34436 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084357] 34437 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084369] 34438 0001    -11 ffff880247b2e800 ffff8802550e6188    60000 ffffffffa03c3810 nfsv4 WRITE a:call_transmit_status q:xprt_pending
r38-13-78: [ 1999.084402] 34439 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084419] 34440 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084433] 34441 0001      0 ffff880247b2e800 ffff8802550e6c40    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084449] 34442 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084465] 34443 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084481] 34444 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084498] 34445 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084514] 34446 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084529] 34447 0001      0 ffff880247b2e800 ffff8802550e73e8    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084545] 34448 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084561] 34449 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084577] 34450 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084592] 34451 0001      0 ffff880247b2e800 ffff8802550e70d8    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084608] 34452 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084625] 34453 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084642] 34454 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084658] 34455 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084674] 34456 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084690] 34457 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084706] 34458 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084722] 34459 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084738] 34460 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084754] 34461 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084770] 34462 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084786] 34463 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084802] 34464 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084817] 34465 0001    -11 ffff880247b2e800 ffff8802550e6f50        0 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_sending
r38-13-78: [ 1999.084833] 34466 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084848] 34467 0001      0 ffff880247b2e800 ffff8802550e7570    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084864] 34468 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084878] 34469 0001    -11 ffff880247b2e800 ffff8802550e6310        0 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_sending
r38-13-78: [ 1999.084894] 34470 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-13-78: [ 1999.084910] 34471 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084925] 34472 0001      0 ffff880247b2e800 ffff8802550e76f8    60000 ffffffffa03c3810 nfsv4 WRITE a:call_status q:xprt_pending
r38-13-78: [ 1999.084943] 34473 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084959] 34474 0001      0 ffff880247b2e800   (null)        0 ffffffffa03c3810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-13-78: [ 1999.084974] 18306 0280      0 ffff880253a4a000 ffff8802550e6ab8    60000 ffffffffa03cdb80 nfsv4 SEQUENCE a:call_status q:xprt_pending

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: hung task in iozone test on nfs client
  2011-02-04 21:36 hung task in iozone test on nfs client Jim Rees
  2011-02-05  0:24 ` Trond Myklebust
@ 2011-02-10 14:54 ` Jim Rees
  1 sibling, 0 replies; 4+ messages in thread
From: Jim Rees @ 2011-02-10 14:54 UTC (permalink / raw)
  To: linux-nfs; +Cc: peter honeyman

Jim Rees wrote:

  I have a report here of iozone hanging when run on nfs4 client against an
  EMC server.  We have reproduced this problem with a wide range of client
  kernel versions, from 2.6.33.3-85.fc13.x86_64 up to
  2.6.38-0.rc3.git2.1.pnfs_wave3_20110203.fc15.x86_64, and on both 4.0 and
  4.1.  It seems to happen only with heavy multi-threaded iozone testing with
  big files.

I have a bit more info on this now.  Version of iozone is v3.300, and this
is the command:

/opt/iozone/bin/iozone -r 2m -s 256m -w -W -c -t 12 -i 0 -+m /root/iozone_sw.4x3 -o -+n
cat /root/iozone_sw.4x3
r38-13-78 /ufs10/sw_1 /opt/iozone/bin/iozone
r38-14-79 /ufs10/sw_2 /opt/iozone/bin/iozone   
r38-15-80 /ufs10/sw_3 /opt/iozone/bin/iozone
r38-16-81 /ufs10/sw_4 /opt/iozone/bin/iozone       
r38-13-78 /ufs10/sw_5 /opt/iozone/bin/iozone
r38-14-79 /ufs10/sw_6 /opt/iozone/bin/iozone
r38-15-80 /ufs10/sw_7 /opt/iozone/bin/iozone
r38-16-81 /ufs10/sw_8 /opt/iozone/bin/iozone
r38-13-78 /ufs10/sw_9 /opt/iozone/bin/iozone
r38-14-79 /ufs10/sw_10 /opt/iozone/bin/iozone
r38-15-80 /ufs10/sw_11 /opt/iozone/bin/iozone
r38-16-81 /ufs10/sw_12 /opt/iozone/bin/iozone

The hang happens reliably, but might take anywhere from one to eight runs
before it happens.  There is no shortage of memory, the clients have about
1GB free out of 8GB.  Here's the stack trace and rpc debug:

Jan 31 15:57:17 r38-15-80 kernel: INFO: task iozone:3587 blocked for more than 120 seconds.
Jan 31 15:57:17 r38-15-80 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 31 15:57:17 r38-15-80 kernel: iozone        D 0000000000000000     0  3587      1 0x00000000
Jan 31 15:57:17 r38-15-80 kernel: ffff880245d53ae8 0000000000000082 0000000000000000 ffff880242560908
Jan 31 15:57:17 r38-15-80 kernel: ffff880242560908 ffff8801e9dd29b0 0000000000000000 ffff880245d53fd8
Jan 31 15:57:17 r38-15-80 kernel: ffff880245d53fd8 000000000000f9b0 00000000000157c0 ffff880245c73268
Jan 31 15:57:17 r38-15-80 kernel: Call Trace:
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c1314>] ? sync_page+0x0/0x45
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff814297bc>] io_schedule+0x6e/0xb0
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c1355>] sync_page+0x41/0x45
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff81429cf8>] __wait_on_bit+0x43/0x76
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c14ae>] wait_on_page_bit+0x6d/0x74
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff8106484b>] ? wake_bit_function+0x0/0x2e
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c94c0>] ? pagevec_lookup_tag+0x20/0x29
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c1751>] filemap_fdatawait_range+0x9f/0x173
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c18ce>] filemap_write_and_wait_range+0x3e/0x51        
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff8111fa53>] vfs_fsync_range+0x5a/0xad           
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff8111faf9>] generic_write_sync+0x53/0x55
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810c1d4b>] generic_file_aio_write+0x86/0xa2
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffffa0321bf8>] nfs_file_write+0xed/0x169 [nfs]
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff811017c5>] do_sync_write+0xbf/0xfc
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff810f4dc9>] ? __slab_free+0x28/0x22e
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff81204b7d>] ? might_fault+0x1c/0x1e
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff811be22b>] ? security_file_permission+0x11/0x13
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff81101d23>] vfs_write+0xa9/0x106
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff81101e36>] sys_write+0x45/0x69
Jan 31 15:57:17 r38-15-80 kernel: [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
Jan 31 15:59:17 r38-15-80 kernel: INFO: task iozone:3587 blocked for more than 120 seconds.
Jan 31 15:59:17 r38-15-80 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

r38-14-79: [ 1999.076511] -pid- flgs status -client- --rqstp- -timeout ---ops--
r38-14-79: [ 1999.076540] 34414 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076554] 34415 0001      0 ffff8802456c4a00 ffff880251e1d260    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.076570] 34416 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076585] 34417 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076601] 34418 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076615] 34419 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076629] 34420 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076643] 34421 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076658] 34422 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076672] 34423 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076686] 34424 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076699] 34425 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076713] 34426 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076727] 34427 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076739] 34428 0001      0 ffff8802456c4a00 ffff880251e1d0d8    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.076753] 34429 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076766] 34430 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076782] 34431 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-14-79: [ 1999.076796] 34432 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076810] 34433 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076824] 34434 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076837] 34435 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-14-79: [ 1999.076851] 34436 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076865] 34437 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076878] 34438 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076892] 34439 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076906] 34440 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076920] 34441 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076934] 34442 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076948] 34443 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076960] 34444 0001      0 ffff8802456c4a00 ffff880251e1c000    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.076971] 34445 0001      0 ffff8802456c4a00 ffff880251e1cc40    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.076985] 34446 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.076999] 34447 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077018] 34448 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077032] 34449 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077045] 34450 0001    -11 ffff8802456c4a00 ffff880251e1c930    60000 ffffffffa03c9810 nfsv4 WRITE a:call_transmit_status q:xprt_pending
r38-14-79: [ 1999.077059] 34451 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077073] 34452 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-14-79: [ 1999.077086] 34453 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:NFS client
r38-14-79: [ 1999.077098] 34454 0001      0 ffff8802456c4a00 ffff880251e1d570    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077110] 34455 0001      0 ffff8802456c4a00 ffff880251e1d6f8    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077122] 34456 0001      0 ffff8802456c4a00 ffff880251e1c310    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077135] 34457 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077147] 34458 0001    -11 ffff8802456c4a00 ffff880251e1c188        0 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_sending
r38-14-79: [ 1999.077161] 34459 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077173] 34460 0001      0 ffff8802456c4a00 ffff880251e1cdc8    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077184] 34461 0001      0 ffff8802456c4a00 ffff880251e1cf50    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077198] 34462 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077212] 34463 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077223] 34464 0001      0 ffff8802456c4a00 ffff880251e1cab8    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077237] 34465 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077283] 34466 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077307] 34467 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077332] 34468 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077356] 34469 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077380] 34470 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077404] 34471 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077426] 34472 0001    -11 ffff8802456c4a00 ffff880251e1d3e8        0 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_sending
r38-14-79: [ 1999.077473] 34473 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077495] 34474 0001    -11 ffff8802456c4a00 ffff880251e1c498        0 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_sending
r38-14-79: [ 1999.077519] 34475 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077541] 34476 0001      0 ffff8802456c4a00 ffff880251e1c7a8    60000 ffffffffa03c9810 nfsv4 WRITE a:call_status q:xprt_pending
r38-14-79: [ 1999.077587] 34477 0001      0 ffff8802456c4a00   (null)        0 ffffffffa03c9810 nfsv4 WRITE a:rpc_prepare_task q:ForeChannel Slot table
r38-14-79: [ 1999.077633] 17107 0280      0 ffff880243a8f800 ffff880251e1c620    60000 ffffffffa03d3b80 nfsv4 SEQUENCE a:call_status q:xprt_pending

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-02-10 14:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-04 21:36 hung task in iozone test on nfs client Jim Rees
2011-02-05  0:24 ` Trond Myklebust
2011-02-07 20:41   ` Jim Rees
2011-02-10 14:54 ` Jim Rees

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.