* Performance decrease over time
@ 2013-08-01 20:21 Markus Trippelsdorf
2013-08-02 2:25 ` Dave Chinner
0 siblings, 1 reply; 5+ messages in thread
From: Markus Trippelsdorf @ 2013-08-01 20:21 UTC (permalink / raw)
To: xfs
Yesterday I noticed that the nightly rsync run that backs up my root
fs took over 8 minutes to complete. Half a year ago, when the backup disk
was freshly formatted, it took only 2 minutes. (The size of my root fs stayed
constant during this time.)
So I decided to reformat the drive, but first took some measurements.
The drive in question also contains my film and music collection,
several git trees and is used to compile projects quite often.
Model Family: Seagate Barracuda Green (AF)
Device Model: ST1500DL003-9VT16L
/dev/sdb on /var type xfs (rw,relatime,attr2,inode64,logbsize=256k,noquota)
/dev/sdb xfs 1.4T 702G 695G 51% /var
# xfs_db -c frag -r /dev/sdb
actual 1540833, ideal 1529956, fragmentation factor 0.71%
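The fragmentation factor xfs_db reports is just the share of actual extents in
excess of the ideal count, so the figure above can be checked by hand (the two
extent counts are copied from the xfs_db output; nothing else is assumed):

```shell
# xfs_db's "fragmentation factor": fraction of actual extents that are
# excess over the ideal (fully contiguous) layout, as a percentage.
actual=1540833
ideal=1529956
awk -v a="$actual" -v i="$ideal" \
    'BEGIN { printf "fragmentation factor %.2f%%\n", (a - i) / a * 100 }'
# -> fragmentation factor 0.71%
```

Note this measures file fragmentation only; as discussed below, it says nothing
about how fragmented the free space is.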
# iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
Iozone: Performance Test of File I/O
Version $Revision: 3.408 $
Compiled for 64 bit mode.
Build: linux-AMD64
...
Run began: Thu Aug 1 12:55:09 2013
O_DIRECT feature enabled
Auto Mode
File size set to 102400 KB
Record Size 4 KB
Record Size 64 KB
Record Size 512 KB
Command line used: iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                     random   random
      KB  reclen    write  rewrite     read   reread    read    write
  102400       4     8083     9218     3817     3786     515      789
  102400      64    56905    48177    17239    26347    7381    15643
  102400     512   113689    86344    84583    83192   37136    63275
After fresh format and restore from another backup, performance is much
better again:
# iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
                                                     random   random
      KB  reclen    write  rewrite     read   reread    read    write
  102400       4    13923    18760    19461    27305     761      652
  102400      64    95822    95724    82331    90763   10455    11944
  102400     512    93343    95386    94504    95073   43282    69179
A couple of questions. Is it normal that throughput decreases this much in
half a year on a heavily used disk that is only half full? What can be
done (as a user) to mitigate this effect?
--
Markus
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Performance decrease over time
2013-08-01 20:21 Performance decrease over time Markus Trippelsdorf
@ 2013-08-02 2:25 ` Dave Chinner
2013-08-02 8:14 ` Stan Hoeppner
0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2013-08-02 2:25 UTC (permalink / raw)
To: Markus Trippelsdorf; +Cc: xfs
On Thu, Aug 01, 2013 at 10:21:08PM +0200, Markus Trippelsdorf wrote:
> Yesterday I noticed that the nightly rsync run that backs up my root
> fs took over 8 minutes to complete. Half a year ago, when the backup disk
> was freshly formatted, it took only 2 minutes. (The size of my root fs stayed
> constant during this time.)
>
> So I decided to reformat the drive, but first took some measurements.
> The drive in question also contains my film and music collection,
> several git trees and is used to compile projects quite often.
So, lots of static files mixed in with lots of temporary files and
small changing files. And heavy usage. Sounds like a pretty normal
case for slowly fragmenting free space as data of different temporal
locality slowly intermingles....
> Model Family: Seagate Barracuda Green (AF)
> Device Model: ST1500DL003-9VT16L
So, slow rotation speed, and an average seek time of 13ms? Given the
track-to-track seek times of 1ms, that means worst case seek times
are going to be in the order of 25ms. IOWs, you're using close to
the slowest disk on the market, and so seeks are going to have an
abnormally high impact on performance.
Oh, and the disk has a 64MB cache on board, so the test file size
you are using of 100MB will mostly fit in the cache....
> /dev/sdb on /var type xfs (rw,relatime,attr2,inode64,logbsize=256k,noquota)
> /dev/sdb xfs 1.4T 702G 695G 51% /var
>
> # xfs_db -c frag -r /dev/sdb
> actual 1540833, ideal 1529956, fragmentation factor 0.71%
>
> # iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
> Iozone: Performance Test of File I/O
> Version $Revision: 3.408 $
> Compiled for 64 bit mode.
> Build: linux-AMD64
> ...
> Run began: Thu Aug 1 12:55:09 2013
>
> O_DIRECT feature enabled
> Auto Mode
> File size set to 102400 KB
> Record Size 4 KB
> Record Size 64 KB
> Record Size 512 KB
> Command line used: iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
> Output is in Kbytes/sec
> Time Resolution = 0.000001 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
> random random bkwd record stride
> KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
> 102400 4 8083 9218 3817 3786 515 789
4k single threaded direct IO can do 8MB/s on a spinning disk? I
think you are hitting the disk cache with these tests, and so they
aren't really representative of application performance at all.
All these numbers reflect is how contiguous the files are on disk.
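A back-of-envelope check supports this. Using the 13ms average seek cited
above, and assuming a 5900rpm spindle speed for this Barracuda Green model
(an assumption; take the rpm from the datasheet), a purely mechanical 4k
random read rate would be bounded by:

```shell
# Upper bound on uncached 4k random IOPS: every read pays an average
# seek plus half a rotation. 13 ms seek is cited above; 5900 rpm is
# an assumed figure for this drive model.
awk 'BEGIN {
    seek_ms = 13
    rot_ms  = 60000 / 5900 / 2        # half a rotation ~= 5.1 ms
    iops    = 1000 / (seek_ms + rot_ms)
    printf "~%.0f IOPS -> ~%.0f KB/s at 4k\n", iops, iops * 4
}'
# -> ~55 IOPS -> ~221 KB/s at 4k
```

Even the 515 KB/s random read figure above is more than double that ceiling,
which is consistent with the 100MB test file being partly served from the
drive's 64MB cache.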
> 102400 64 56905 48177 17239 26347 7381 15643
> 102400 512 113689 86344 84583 83192 37136 63275
>
> After fresh format and restore from another backup, performance is much
> better again:
>
> # iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
> random random bkwd record stride
> KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
> 102400 4 13923 18760 19461 27305 761 652
> 102400 64 95822 95724 82331 90763 10455 11944
> 102400 512 93343 95386 94504 95073 43282 69179
>
> A couple of questions. Is it normal that throughput decreases this much in
> half a year on a heavily used disk that is only half full?
The process you went through will have completely defragmented your
filesystem, and so now IOZone will be operating on completely
contiguous files and hence getting more disk cache hits.
So really, the numbers only reflect a difference in layout of the
files being tested. And using small direct IO means that the
filesystem will tend to fill small free spaces close to the
inode first, and so will fragment the file based on the locality of
fragmented free space to the owner inode. In the case of the new
filesystem, there is only large, contiguous free space near the
inode....
So, what you are seeing is typical for a heavily used filesystem,
and it's probably more significant for you because of the type of
drive you are using....
> What can be
> done (as a user) to mitigate this effect?
Buy faster disks ;)
Seriously, all filesystems age and get significantly slower as they
get used. XFS is not really designed for single spindles - its
algorithms are designed to spread data out over the entire device
and so be able to make use of many, many spindles that make up the
device. The behaviour it has works extremely well for this sort of
large scale scenario, but it's close to the worst case aging
behaviour for a single, very slow spindle like you are using. Hence
once the filesystem is over the "we have pristine, contiguous
freespace" hump on your hardware, it's all downhill and there's not
much you can do about it....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Performance decrease over time
2013-08-02 2:25 ` Dave Chinner
@ 2013-08-02 8:14 ` Stan Hoeppner
2013-08-02 22:30 ` Dave Chinner
0 siblings, 1 reply; 5+ messages in thread
From: Stan Hoeppner @ 2013-08-02 8:14 UTC (permalink / raw)
To: Dave Chinner; +Cc: Markus Trippelsdorf, xfs
On 8/1/2013 9:25 PM, Dave Chinner wrote:
...
> So really, the numbers only reflect a difference in layout of the
> files being tested. And using small direct IO means that the
> filesystem will tend to fill small free spaces close to the
> inode first, and so will fragment the file based on the locality of
> fragmented free space to the owner inode. In the case of the new
> filesystem, there is only large, contiguous free space near the
> inode....
...
>> What can be
>> done (as a user) to mitigate this effect?
>
> Buy faster disks ;)
>
> Seriously, all filesystems age and get significantly slower as they
> get used. XFS is not really designed for single spindles - its
> algorithms are designed to spread data out over the entire device
> and so be able to make use of many, many spindles that make up the
> device. The behaviour it has works extremely well for this sort of
> large scale scenario, but it's close to the worst case aging
> behaviour for a single, very slow spindle like you are using. Hence
> once the filesystem is over the "we have pristine, contiguous
> freespace" hump on your hardware, it's all downhill and there's not
> much you can do about it....
Wouldn't the inode32 allocator yield somewhat better results with this
direct IO workload? With Markus' single slow spindle? It shouldn't
fragment free space quite as badly in the first place, nor suffer from
trying to use many small fragments surrounding the inode as in the case
above.
Whether or not inode32 would be beneficial to his real workload(s) I
don't know. I tend to think it might make at least a small positive
difference. However, given that XFS is trying to get away from inode32
altogether I can see why you wouldn't mention it, even if it might yield
some improvement in this case.
--
Stan
* Re: Performance decrease over time
2013-08-02 8:14 ` Stan Hoeppner
@ 2013-08-02 22:30 ` Dave Chinner
2013-08-02 23:00 ` aurfalien
0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2013-08-02 22:30 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: Markus Trippelsdorf, xfs
On Fri, Aug 02, 2013 at 03:14:04AM -0500, Stan Hoeppner wrote:
> On 8/1/2013 9:25 PM, Dave Chinner wrote:
> ...
>
> > So really, the numbers only reflect a difference in layout of the
> > files being tested. And using small direct IO means that the
> > filesystem will tend to fill small free spaces close to the
> > inode first, and so will fragment the file based on the locality of
> > fragmented free space to the owner inode. In the case of the new
> > filesystem, there is only large, contiguous free space near the
> > inode....
> ...
> >> What can be
> >> done (as a user) to mitigate this effect?
> >
> > Buy faster disks ;)
> >
> > Seriously, all filesystems age and get significantly slower as they
> > get used. XFS is not really designed for single spindles - its
> > algorithms are designed to spread data out over the entire device
> > and so be able to make use of many, many spindles that make up the
> > device. The behaviour it has works extremely well for this sort of
> > large scale scenario, but it's close to the worst case aging
> > behaviour for a single, very slow spindle like you are using. Hence
> > once the filesystem is over the "we have pristine, contiguous
> > freespace" hump on your hardware, it's all downhill and there's not
> > much you can do about it....
>
> Wouldn't the inode32 allocator yield somewhat better results with this
> direct IO workload?
What direct IO workload? Oh, you mean the IOZone test?
What's the point of trying to optimise IOzone throughput? It matters
nothing to Markus - he's just using it to demonstrate the point that
free space is not as contiguous as it once was...
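(For anyone who wants to see that directly rather than infer it from IOzone,
xfs_db can print free space statistics. The device name below is the example
disk from this thread; the command is read-only, though the numbers may be
slightly stale if run against a mounted filesystem.)

```shell
# Summarise free space extents: many small extents and a small average
# extent size indicate fragmented free space.
xfs_db -r -c "freesp -s" /dev/sdb
```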
As it is, inode32 will do nothing to speed up performance on a
single spindle - it spreads all files out across the entire disk, so
locality between the inode and the data is guaranteed to be worse
than an aged inode64 filesystem. inode32 intentionally spreads data
across the disk without caring about access locality so the average
seek from inode read to data read is half the spindle. That's why
inode64 is so much faster than inode32 on general workloads - the
seek between inode and data is closer to the track-to-track seek
time than the average seek time.
> With Markus' single slow spindle? It shouldn't
> fragment free space quite as badly in the first place, nor suffer from
> trying to use many small fragments surrounding the inode as in the case
> above.
inode32 fragments free space just as badly as inode64, if not worse,
because it is guaranteed to intermingle data of different temporal
stability in the same localities, rather than clustering different
datasets around individual directory inodes...
> Whether or not inode32 would be beneficial to his real workload(s) I
> don't know. I tend to think it might make at least a small positive
> difference. However, given that XFS is trying to get away from inode32
> altogether I can see why you wouldn't mention it, even if it might yield
> some improvement in this case.
I didn't mention it because, as a baseline for the data Markus is
storing (source trees, doing compilations, etc.), inode32 starts off
much slower than inode64 and degrades just as much or more over
time....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Performance decrease over time
2013-08-02 22:30 ` Dave Chinner
@ 2013-08-02 23:00 ` aurfalien
0 siblings, 0 replies; 5+ messages in thread
From: aurfalien @ 2013-08-02 23:00 UTC (permalink / raw)
To: Dave Chinner; +Cc: Markus Trippelsdorf, Stan Hoeppner, xfs
On Aug 2, 2013, at 3:30 PM, Dave Chinner wrote:
> On Fri, Aug 02, 2013 at 03:14:04AM -0500, Stan Hoeppner wrote:
>> On 8/1/2013 9:25 PM, Dave Chinner wrote:
>> ...
>>
>>> So really, the numbers only reflect a difference in layout of the
>>> files being tested. And using small direct IO means that the
>>> filesystem will tend to fill small free spaces close to the
>>> inode first, and so will fragment the file based on the locality of
>>> fragmented free space to the owner inode. In the case of the new
>>> filesystem, there is only large, contiguous free space near the
>>> inode....
>> ...
>>>> What can be
>>>> done (as a user) to mitigate this effect?
>>>
>>> Buy faster disks ;)
>>>
>>> Seriously, all filesystems age and get significantly slower as they
>>> get used. XFS is not really designed for single spindles - its
>>> algorithms are designed to spread data out over the entire device
>>> and so be able to make use of many, many spindles that make up the
>>> device. The behaviour it has works extremely well for this sort of
>>> large scale scenario, but it's close to the worst case aging
>>> behaviour for a single, very slow spindle like you are using. Hence
>>> once the filesystem is over the "we have pristine, contiguous
>>> freespace" hump on your hardware, it's all downhill and there's not
>>> much you can do about it....
>>
>> Wouldn't the inode32 allocator yield somewhat better results with this
>> direct IO workload?
>
> What direct IO workload? Oh, you mean the IOZone test?
>
> What's the point of trying to optimise IOzone throughput? It matters
> nothing to Markus - he's just using it to demonstrate a point that
> free space is not as contiguous as it once was...
>
> As it is, inode32 will do nothing to speed up performance on a
> single spindle - it spreads all files out across the entire disk, so
> locality between the inode and the data is guaranteed to be worse
> than an aged inode64 filesystem. inode32 intentionally spreads data
> across the disk without caring about access locality so the average
> seek from inode read to data read is half the spindle. That's why
> inode64 is so much faster than inode32 on general workloads - the
> seek between inode and data is closer to the track-to-track seek
> time than the average seek time.
Totally concur 100%.
In fact, I've either retired our 32-bit apps or moved them to local storage, as we run most everything off a NAS-type setup using XFS.
The benefits of inode64 in our environment were just too great to sideline, especially with the older SATA 2 disks we use.
For slower disks, I'd say inode64 is a must. But I'm talking about several disks in a RAID config, as I don't do single-disk XFS.
- aurf