linux-btrfs.vger.kernel.org archive mirror
* [2.6.37] btrfs-transac hanging in prepare_to_wait
@ 2011-02-23 10:40 Christian Schmidt
  2011-03-13 17:52 ` Christian Schmidt
  0 siblings, 1 reply; 2+ messages in thread
From: Christian Schmidt @ 2011-02-23 10:40 UTC
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 858 bytes --]

Hi,

After a few weeks of testing and preparation I commissioned a new NFS
server with btrfs for the main storage. I have run into two situations
where the file system locked up and I had to hard reboot the machine
(sysrq-b). I end up with btrfs-transac in state D, waiting for the
pending transaction to complete, if I interpret the code correctly. On
top of that, all eight nfsds are in state D, waiting to start several
different transactions.
I have attached the sysrq-t output, taken after I had killed all the
processes I could and before rebooting.
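
As a rough illustration of how the symptom described above (tasks stuck
in state D) can be spotted without a full sysrq-t dump, here is a
minimal sketch that scans /proc for processes in uninterruptible sleep.
It is not part of the original report; the function names parse_stat
and d_state_tasks are made up for this example, and it only relies on
the standard "pid (comm) state ..." layout of /proc/<pid>/stat.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    return int(pid_str), comm, tail.split()[0]


def d_state_tasks(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in sorted(os.listdir(proc)):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or its stat line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

This only shows which tasks are blocked, not where; the kernel stack
traces from sysrq-t are still needed to see what they are waiting on.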

It only seems to happen under somewhat heavier I/O load, in this case
one process md5summing large files (a few TB in total) while another
process tries to write to the NFS share. I never saw it while, for
example, copying single files onto the file system or reading multiple
files.

I'd be grateful for any hints and recommendations.

Christian


[-- Attachment #2: dmesg.bz2 --]
[-- Type: application/x-bzip2, Size: 15548 bytes --]


* Re: btrfs-transac hanging in prepare_to_wait
  2011-02-23 10:40 [2.6.37] btrfs-transac hanging in prepare_to_wait Christian Schmidt
@ 2011-03-13 17:52 ` Christian Schmidt
  0 siblings, 0 replies; 2+ messages in thread
From: Christian Schmidt @ 2011-03-13 17:52 UTC
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1847 bytes --]

Hi,

In the meantime I tested 2.6.38-rc6: no hangs there, but extreme
slowness (copying at ~2MB/s) and periods of zero activity (up to 3
minutes) for programs trying to write to the btrfs. Since I saw very
high CPU utilization in the raid6 (md) code, I suspect a problem there.

However, because that behavior wasn't acceptable either, I patched a
vanilla 2.6.37.3 kernel with the latest btrfs-unstable. The performance
was back, but after ~16 hours the lockup occurred and the btrfs is
inaccessible again. The usage scenario at that point was 4 threads
writing to the btrfs via NFS at ~2MB/s each.

This time, btrfs-transac itself went into D state, as did all the nfsds
and a "touch" I started to verify the btrfs lockup. I have attached the
dmesg output of sysrq-t.

Does anyone have any ideas on how to debug this, e.g. timeout
detection, in-memory data structure dumps, etc.?

Regards,
Christian


[-- Attachment #2: dmesg.201103131826.bz2 --]
[-- Type: application/x-bzip2, Size: 27475 bytes --]
