concurrent direct IO write in xfs

* concurrent direct IO write in xfs
@ 2012-01-16  0:01 Zheng Da
  2012-01-16 17:48 ` Christoph Hellwig
  2012-01-16 23:25 ` Dave Chinner
  0 siblings, 2 replies; 19+ messages in thread
From: Zheng Da @ 2012-01-16  0:01 UTC (permalink / raw)
  To: xfs

[-- Attachment #1.1: Type: text/plain, Size: 2285 bytes --]

Hello,

I surprisedly found that writing data to a file (no appending) with direct
IO and with multiple threads has the same performance as a single thread.
Actually, it seems there is only one core is working at a time. In my case,
each time I write a page to a file and the offset is always aligned to the
page size, so there is no overlapping between writes.

According to lockstat, the lock that causes the most waiting time is
xfs_inode.i_lock.
               &(&ip->i_lock)->mr_lock-W:         31568          36170
      0.24       20048.25     7589157.99         130154        3146848
      0.00         217.70     1238310.72
               &(&ip->i_lock)->mr_lock-R:         11251          11886
      0.24       20043.01     2895595.18          46671         526309
      0.00          63.80      264097.96
               -------------------------
                 &(&ip->i_lock)->mr_lock          36170
 [<ffffffffa03be122>] xfs_ilock+0xb2/0x110 [xfs]
                 &(&ip->i_lock)->mr_lock          11886
 [<ffffffffa03be15a>] xfs_ilock+0xea/0x110 [xfs]
               -------------------------
                 &(&ip->i_lock)->mr_lock          38555
 [<ffffffffa03be122>] xfs_ilock+0xb2/0x110 [xfs]
                 &(&ip->i_lock)->mr_lock           9501
 [<ffffffffa03be15a>] xfs_ilock+0xea/0x110 [xfs]

And systemtap shows me that xfs_inode.i_lock is locked exclusively in the
following functions.
0xffffffff81289235 : xfs_file_aio_write_checks+0x45/0x1d0 [kernel]
 0xffffffff812829f4 : __xfs_get_blocks+0x94/0x4a0 [kernel]
 0xffffffff81288b6a : xfs_aio_write_newsize_update+0x3a/0x90 [kernel]
 0xffffffff8129590a : xfs_log_dirty_inode+0x7a/0xe0 [kernel]
xfs_log_dirty_inode is only invoked 3 times when I write 4G data to the
file, so we can completely ignore it. But I'm not sure which of them is the
major cause of the bad write performance or whether they are the cause of
the bad performance. But it seems none of them are the main operations in
direct io write.

It seems to me that the lock might not be necessary for my case. It'll be
nice if I can disable the lock. Or is there any suggestion of achieving
better write performance with multiple threads in XFS?
I tried ext4 and it doesn't perform better than XFS. Does the problem exist
in all FS?

Thanks,
Da

[-- Attachment #1.2: Type: text/html, Size: 2824 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 19+ messages in thread