* XFS: Abysmal write performance because of excessive seeking (allocation groups to blame?)
@ 2012-04-05 18:10 Stefan Ring
From: Stefan Ring @ 2012-04-05 18:10 UTC
  To: xfs

Encouraged by reading about the recent improvements to XFS, I decided
to give it another try on a new server machine. I am happy to report
that compared to my previous tests a few years ago, performance has
progressed from unusably slow to barely acceptable. It still lags
behind ext4, but that is a noticeable (and notable) improvement indeed
;).

The filesystem operations I care about most are those that
involve thousands of small files across lots of directories, like
large trees of source code. For my test, I created a tarball of a
finished IcedTea6 build, about 2.5 GB in size. It contains roughly
200,000 files in 20,000 directories. The test I want to report about
here was extracting this tarball onto an XFS filesystem. I tested
other actions as well, but they didn't reveal anything too noticeable.

So the test consists of nothing but un-tarring the archive, followed
by a "sync" to make sure that the time-to-disk is measured. Prior to
running it, I had populated the filesystem in the following way:

I created two directory hierarchies, each containing the unpacked
tarball 20 times, which I rsynced simultaneously to the target
filesystem. When this was done, I deleted one half of them, creating
some free space fragmentation, which I hoped would mimic real-world
conditions to some degree.
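
For anyone who wants to reproduce the priming step, here is a rough
sketch of it in Python. It is not the procedure I actually ran: it
substitutes shutil.copytree in two threads for the two simultaneous
rsync runs, and it assumes "one half" means one of the two
hierarchies. The function name, the directory names "a" and "b", and
the "copyNN" naming are made up for illustration.

```python
import shutil
import threading
from pathlib import Path


def prime_filesystem(source_tree: Path, target: Path, copies: int = 20) -> None:
    """Write two hierarchies of `copies` trees each into `target`
    concurrently, then delete one hierarchy to fragment the free space."""

    def populate(hierarchy: Path) -> None:
        # copytree creates missing parent directories itself.
        for i in range(copies):
            shutil.copytree(source_tree, hierarchy / f"copy{i:02d}")

    hierarchies = [target / "a", target / "b"]  # hypothetical names
    workers = [threading.Thread(target=populate, args=(h,)) for h in hierarchies]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # Assumption: "one half" is taken to mean one of the two hierarchies.
    shutil.rmtree(hierarchies[1])
```

Because both writers run at the same time, their allocations should
end up interleaved on disk, so deleting one hierarchy leaves holes
between the surviving files.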

So now to the test itself -- the tar "x" command returned quite fast
(on the order of only a few seconds), but the following sync took
ages. I created a diagram using seekwatcher, and it reveals that the
disk head jumps about wildly between four zones which are written to
in almost perfectly linear fashion.
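
To make the split between the two phases explicit, the measurement
can be sketched like this. It is a minimal illustration, not the
exact commands I used: it uses Python's tarfile module in place of
the tar command, and timed_extract is a made-up name.

```python
import os
import tarfile
import time


def timed_extract(tarball: str, dest: str) -> tuple[float, float]:
    """Return (extract_seconds, sync_seconds) for unpacking `tarball` into `dest`."""
    t0 = time.monotonic()
    with tarfile.open(tarball) as archive:
        archive.extractall(dest)  # with enough RAM this mostly dirties the page cache
    t1 = time.monotonic()
    os.sync()  # system-wide sync; other dirty data on the machine is flushed too
    t2 = time.monotonic()
    return t1 - t0, t2 - t1
```

The first number corresponds to the time until tar returns, the
second to the time spent in the following sync. Since os.sync() is
system-wide, the machine should be otherwise idle while measuring.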

When I reran the test with only a single allocation group, behavior
was much better (about twice as fast).

OTOH, when I repeatedly extracted the same tarball in a loop without
syncing in between, it slowed down steadily in the
single-allocation-group (ag=1) case to the point of being unacceptably
slow. The same behavior did not occur with four allocation groups
(ag=4).

I am aware that no filesystem can be optimal, but given that the
entire write set -- all 2.5 GB of it -- is "known" to the file system,
that is, in memory, wouldn't it be possible to write it out to disk in
a somewhat more reasonable fashion?
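
To illustrate why the ordering matters, here is a toy model in
Python. It only counts the distance between consecutive write
offsets. It says nothing about how XFS actually allocates or orders
its writes, and it ignores the RAID controller entirely. The zone
count, zone size and block counts are arbitrary illustrative numbers.

```python
def head_travel(offsets):
    """Sum of absolute distances between consecutive write offsets."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


def zone_offsets(zones, blocks_per_zone, zone_size, interleaved):
    """Offsets for writing `blocks_per_zone` consecutive blocks into each zone."""
    if interleaved:
        # Round-robin: one block per zone in turn, each zone filled linearly.
        return [z * zone_size + b
                for b in range(blocks_per_zone)
                for z in range(zones)]
    # Zone by zone: finish one zone completely before starting the next.
    return [z * zone_size + b
            for z in range(zones)
            for b in range(blocks_per_zone)]


# Illustrative numbers only: 4 zones, 1000 blocks each, zones 1,000,000 blocks apart.
interleaved = head_travel(zone_offsets(4, 1000, 1_000_000, interleaved=True))
sequential = head_travel(zone_offsets(4, 1000, 1_000_000, interleaved=False))
```

Both orderings write exactly the same blocks, and each zone is still
filled linearly, but in this model the interleaved order travels far
further than the zone-by-zone order.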

This is the seekwatcher graph:
http://dl.dropbox.com/u/5338701/dev/xfs/xfs-ag4.png

And for comparison, the same on ext4, on the same partition primed in
the same way (parallel rsyncs mentioned above):
http://dl.dropbox.com/u/5338701/dev/xfs/ext4.png

As can be seen from the time scale in the bottom part, the ext4
version performed about 5 times as fast because of a much more
disk-friendly write pattern.

I ran the tests with a current RHEL 6.2 kernel and also with a 3.3-rc2
kernel. Both of them exhibited the same behavior. The disk hardware
used was a Smart Array P400 controller with 6x 10k rpm 300GB SAS disks
in RAID 6. The server has plenty of RAM (64 GB).

