* Performance decrease over time
@ 2013-08-01 20:21 Markus Trippelsdorf
  2013-08-02  2:25 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Markus Trippelsdorf @ 2013-08-01 20:21 UTC (permalink / raw)
  To: xfs

Yesterday I noticed that the nightly rsync run that backs up my root
fs took over 8 minutes to complete. Half a year ago, when the backup disk
was freshly formatted, it only took 2 minutes. (The size of my root fs stayed
constant during this time.)

So I decided to reformat the drive, but first took some measurements.
The drive in question also contains my film and music collection,
several git trees and is used to compile projects quite often.

Model Family:     Seagate Barracuda Green (AF)
Device Model:     ST1500DL003-9VT16L

/dev/sdb on /var type xfs (rw,relatime,attr2,inode64,logbsize=256k,noquota)
/dev/sdb       xfs       1.4T  702G  695G  51% /var

 # xfs_db -c frag -r /dev/sdb
actual 1540833, ideal 1529956, fragmentation factor 0.71%
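
For reference, the fragmentation factor reported here appears to be
(actual - ideal) / actual, expressed as a percentage. A quick sanity
check of the numbers above (an illustrative awk one-liner, not part of
the original xfs_db output):

 # (1540833 - 1529956) / 1540833 * 100  ~=  0.71
 awk 'BEGIN { printf "%.2f%%\n", (1540833 - 1529956) / 1540833 * 100 }'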

# iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
        Iozone: Performance Test of File I/O
                Version $Revision: 3.408 $
                Compiled for 64 bit mode.
                Build: linux-AMD64 
...
        Run began: Thu Aug  1 12:55:09 2013

        O_DIRECT feature enabled
        Auto Mode
        File size set to 102400 KB
        Record Size 4 KB
        Record Size 64 KB
        Record Size 512 KB
        Command line used: iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
          102400       4    8083    9218     3817     3786     515     789                                                          
          102400      64   56905   48177    17239    26347    7381   15643                                                          
          102400     512  113689   86344    84583    83192   37136   63275                                                          

After fresh format and restore from another backup, performance is much
better again:

# iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
          102400       4   13923   18760    19461    27305     761     652                                                          
          102400      64   95822   95724    82331    90763   10455   11944                                                          
          102400     512   93343   95386    94504    95073   43282   69179 

Couple of questions. Is it normal that throughput decreases this much in
half a year on a heavily used disk that is only half full? What can be
done (as a user) to mitigate this effect? 

-- 
Markus



* Re: Performance decrease over time
  2013-08-01 20:21 Performance decrease over time Markus Trippelsdorf
@ 2013-08-02  2:25 ` Dave Chinner
  2013-08-02  8:14   ` Stan Hoeppner
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2013-08-02  2:25 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: xfs

On Thu, Aug 01, 2013 at 10:21:08PM +0200, Markus Trippelsdorf wrote:
> Yesterday I noticed that the nightly rsync run that backs up my root
> fs took over 8 minutes to complete. Half a year ago, when the backup disk
> was freshly formatted, it only took 2 minutes. (The size of my root fs stayed
> constant during this time.)
> 
> So I decided to reformat the drive, but first took some measurements.
> The drive in question also contains my film and music collection,
> several git trees and is used to compile projects quite often.

So, lots of static files mixed in with lots of temporary files and
small changing files. And heavy usage. Sounds like a pretty normal
case for slowly fragmenting free space as data of different temporal
locality intermingles over time....

> Model Family:     Seagate Barracuda Green (AF)
> Device Model:     ST1500DL003-9VT16L

So, slow rotation speed, and an average seek time of 13ms? Given the
track-to-track seek times of 1ms, that means worst case seek times
are going to be in the order of 25ms. IOWs, you're using close to
the slowest disk on the market, and so seeks are going to have an
abnormally high impact on performance.

Oh, and the disk has a 64MB cache on board, so the 100MB test file
you are using will mostly fit in the cache....
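
As a rough back-of-envelope check (assuming the drive's nominal 5900 RPM
spindle, i.e. ~5.1ms average rotational latency on top of the ~13ms
average seek - these are estimates, not measurements from the thread):

 # cost of one uncached 4k random read, and the resulting throughput
 awk 'BEGIN {
     ms_per_io = 13 + 5.1             # avg seek + avg rotational latency
     iops      = 1000 / ms_per_io     # roughly 55 IOPS
     printf "~%.0f IOPS, ~%.0f KB/s at 4k\n", iops, iops * 4
 }'

Figures well above that for 4k random IO suggest at least some requests
are being satisfied from the drive's cache rather than the platters.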

> /dev/sdb on /var type xfs (rw,relatime,attr2,inode64,logbsize=256k,noquota)
> /dev/sdb       xfs       1.4T  702G  695G  51% /var
> 
>  # xfs_db -c frag -r /dev/sdb
> actual 1540833, ideal 1529956, fragmentation factor 0.71%
> 
> # iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
>         Iozone: Performance Test of File I/O
>                 Version $Revision: 3.408 $
>                 Compiled for 64 bit mode.
>                 Build: linux-AMD64 
> ...
>         Run began: Thu Aug  1 12:55:09 2013
> 
>         O_DIRECT feature enabled
>         Auto Mode
>         File size set to 102400 KB
>         Record Size 4 KB
>         Record Size 64 KB
>         Record Size 512 KB
>         Command line used: iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
>         Output is in Kbytes/sec
>         Time Resolution = 0.000001 seconds.
>         Processor cache size set to 1024 Kbytes.
>         Processor cache line size set to 32 bytes.
>         File stride size set to 17 * record size.
>                                                             random  random    bkwd   record   stride                                   
>               KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
>           102400       4    8083    9218     3817     3786     515     789                                                          

4k single threaded direct IO can do 8MB/s on a spinning disk? I
think you are hitting the disk cache with these tests, and so they
aren't really representative of application performance at all.
All these numbers reflect is how contiguous the files are on disk.
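
One way to see that directly (a generic example, not something from the
original thread; the path is just a placeholder) is to look at the
extent layout of one of the test files:

 # extent count and placement for a file; many small scattered extents
 # mean many seeks even for nominally sequential IO
 xfs_bmap -v /var/path/to/iozone.tmp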

>           102400      64   56905   48177    17239    26347    7381   15643                                                          
>           102400     512  113689   86344    84583    83192   37136   63275                                                          
> 
> After fresh format and restore from another backup, performance is much
> better again:
> 
> # iozone -I -a -s 100M -r 4k -r 64k -r 512k -i 0 -i 1 -i 2
>                                                             random  random    bkwd   record   stride                                   
>               KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
>           102400       4   13923   18760    19461    27305     761     652                                                          
>           102400      64   95822   95724    82331    90763   10455   11944                                                          
>           102400     512   93343   95386    94504    95073   43282   69179 
> 
> Couple of questions. Is it normal that throughput decreases this much in
> half a year on a heavily used disk that is only half full?

The process you went through will have completely defragmented your
filesystem, and so now IOZone will be operating on completely
contiguous files and hence getting more disk cache hits.

So really, the numbers only reflect a difference in layout of the
files being tested. And using small direct IO means that the
filesystem will tend to fill small free spaces close to the
inode first, and so will fragment the file based on the locality of
fragmented free space to the owner inode. In the case of the new
filesystem, there is only large, contiguous free space near the
inode....
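
The state of free space itself can be inspected in the same spirit
(again a generic command, not output from the thread): xfs_db's freesp
command prints a histogram of free extent sizes, and on an aged
filesystem the large contiguous runs give way to many small fragments.

 # read-only; on a mounted filesystem the numbers may lag slightly
 xfs_db -r -c freesp /dev/sdb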

So, what you are seeing is typical for a heavily used filesystem,
and it's probably more significant for you because of the type of
drive you are using....

> What can be
> done (as a user) to mitigate this effect? 

Buy faster disks ;)

Seriously, all filesystems age and get significantly slower as they
get used. XFS is not really designed for single spindles - its
algorithms are designed to spread data out over the entire device
and so be able to make use of many, many spindles that make up the
device. The behaviour it has works extremely well for this sort of
large scale scenario, but it's close to the worst case aging
behaviour for a single, very slow spindle like you are using.  Hence
once the filesystem is over the "we have pristine, contiguous
freespace" hump on your hardware, it's all downhill and there's not
much you can do about it....
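
The "spread data out" behaviour above is visible in the filesystem
geometry (a generic command, output omitted):

 xfs_info /var   # the agcount=N field is the number of allocation
                 # groups that allocations are distributed across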

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: Performance decrease over time
  2013-08-02  2:25 ` Dave Chinner
@ 2013-08-02  8:14   ` Stan Hoeppner
  2013-08-02 22:30     ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Stan Hoeppner @ 2013-08-02  8:14 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Markus Trippelsdorf, xfs

On 8/1/2013 9:25 PM, Dave Chinner wrote:
...

> So really, the numbers only reflect a difference in layout of the
> files being tested. And using small direct IO means that the
> filesystem will tend to fill small free spaces close to the
> inode first, and so will fragment the file based on the locality of
> fragmented free space to the owner inode. In the case of the new
> filesystem, there is only large, contiguous free space near the
> inode....
...
>> What can be
>> done (as a user) to mitigate this effect? 
> 
> Buy faster disks ;)
> 
> Seriously, all filesystems age and get significantly slower as they
> get used. XFS is not really designed for single spindles - its
> algorithms are designed to spread data out over the entire device
> and so be able to make use of many, many spindles that make up the
> device. The behaviour it has works extremely well for this sort of
> large scale scenario, but it's close to the worst case aging
> behaviour for a single, very slow spindle like you are using.  Hence
> once the filesystem is over the "we have pristine, contiguous
> freespace" hump on your hardware, it's all downhill and there's not
> much you can do about it....

Wouldn't the inode32 allocator yield somewhat better results with this
direct IO workload?  With Markus' single slow spindle?  It shouldn't
fragment free space quite as badly in the first place, nor suffer from
trying to use many small fragments surrounding the inode as in the case
above.

Whether or not inode32 would be beneficial to his real workload(s) I
don't know.  I tend to think it might make at least a small positive
difference.  However, given that XFS is trying to get away from inode32
altogether I can see why you wouldn't mention it, even if it might yield
some improvement in this case.

-- 
Stan



* Re: Performance decrease over time
  2013-08-02  8:14   ` Stan Hoeppner
@ 2013-08-02 22:30     ` Dave Chinner
  2013-08-02 23:00       ` aurfalien
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2013-08-02 22:30 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: Markus Trippelsdorf, xfs

On Fri, Aug 02, 2013 at 03:14:04AM -0500, Stan Hoeppner wrote:
> On 8/1/2013 9:25 PM, Dave Chinner wrote:
> ...
> 
> > So really, the numbers only reflect a difference in layout of the
> > files being tested. And using small direct IO means that the
> > filesystem will tend to fill small free spaces close to the
> > inode first, and so will fragment the file based on the locality of
> > fragmented free space to the owner inode. In the case of the new
> > filesystem, there is only large, contiguous free space near the
> > inode....
> ...
> >> What can be
> >> done (as a user) to mitigate this effect? 
> > 
> > Buy faster disks ;)
> > 
> > Seriously, all filesystems age and get significantly slower as they
> > get used. XFS is not really designed for single spindles - its
> > algorithms are designed to spread data out over the entire device
> > and so be able to make use of many, many spindles that make up the
> > device. The behaviour it has works extremely well for this sort of
> > large scale scenario, but it's close to the worst case aging
> > behaviour for a single, very slow spindle like you are using.  Hence
> > once the filesystem is over the "we have pristine, contiguous
> > freespace" hump on your hardware, it's all downhill and there's not
> > much you can do about it....
> 
> Wouldn't the inode32 allocator yield somewhat better results with this
> direct IO workload?

What direct IO workload? Oh, you mean the IOZone test? 

What's the point of trying to optimise IOzone throughput? It matters
nothing to Markus - he's just using it to demonstrate the point that
free space is not as contiguous as it once was...

As it is, inode32 will do nothing to speed up performance on a
single spindle - it spreads all files out across the entire disk, so
locality between the inode and the data is guaranteed to be worse
than an aged inode64 filesystem. inode32 intentionally spreads data
across the disk without caring about access locality so the average
seek from inode read to data read is half the spindle. That's why
inode64 is so much faster than inode32 on general workloads - the
seek between inode and data is closer to the track-to-track seek
time than the average seek time.
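
For readers following along, the two allocators being compared are
selected at mount time, roughly like this (which one is the default
depends on kernel version; recent kernels default to inode64):

 mount -o inode64 /dev/sdb /var   # data allocated near its owner inode
 mount -o inode32 /dev/sdb /var   # 32-bit inode numbers; data spread
                                  # across all AGs regardless of locality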

> With Markus' single slow spindle?  It shouldn't
> fragment free space quite as badly in the first place, nor suffer from
> trying to use many small fragments surrounding the inode as in the case
> above.

inode32 fragments free space just as badly as inode64, if not worse,
because it is guaranteed to intermingle data of different temporal
stability in the same localities, rather than clustering different
datasets around individual directory inodes...

> Whether or not inode32 would be beneficial to his real workload(s) I
> don't know.  I tend to think it might make at least a small positive
> difference.  However, given that XFS is trying to get away from inode32
> altogether I can see why you wouldn't mention it, even if it might yield
> some improvement in this case.

I didn't mention it because, as a baseline for the data Markus is
storing (source trees, compilations, etc.), inode32 starts off much
slower than inode64 and degrades just as much or more over time....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: Performance decrease over time
  2013-08-02 22:30     ` Dave Chinner
@ 2013-08-02 23:00       ` aurfalien
  0 siblings, 0 replies; 5+ messages in thread
From: aurfalien @ 2013-08-02 23:00 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Markus Trippelsdorf, Stan Hoeppner, xfs


On Aug 2, 2013, at 3:30 PM, Dave Chinner wrote:

> On Fri, Aug 02, 2013 at 03:14:04AM -0500, Stan Hoeppner wrote:
>> On 8/1/2013 9:25 PM, Dave Chinner wrote:
>> ...
>> 
>>> So really, the numbers only reflect a difference in layout of the
>>> files being tested. And using small direct IO means that the
>>> filesystem will tend to fill small free spaces close to the
>>> inode first, and so will fragment the file based on the locality of
>>> fragmented free space to the owner inode. In the case of the new
>>> filesystem, there is only large, contiguous free space near the
>>> inode....
>> ...
>>>> What can be
>>>> done (as a user) to mitigate this effect? 
>>> 
>>> Buy faster disks ;)
>>> 
>>> Seriously, all filesystems age and get significantly slower as they
>>> get used. XFS is not really designed for single spindles - its
>>> algorithms are designed to spread data out over the entire device
>>> and so be able to make use of many, many spindles that make up the
>>> device. The behaviour it has works extremely well for this sort of
>>> large scale scenario, but it's close to the worst case aging
>>> behaviour for a single, very slow spindle like you are using.  Hence
>>> once the filesystem is over the "we have pristine, contiguous
>>> freespace" hump on your hardware, it's all downhill and there's not
>>> much you can do about it....
>> 
>> Wouldn't the inode32 allocator yield somewhat better results with this
>> direct IO workload?
> 
> What direct IO workload? Oh, you mean the IOZone test? 
> 
> What's the point of trying to optimise IOzone throughput? It matters
> nothing to Markus - he's just using it to demonstrate the point that
> free space is not as contiguous as it once was...
> 
> As it is, inode32 will do nothing to speed up performance on a
> single spindle - it spreads all files out across the entire disk, so
> locality between the inode and the data is guaranteed to be worse
> than an aged inode64 filesystem. inode32 intentionally spreads data
> across the disk without caring about access locality so the average
> seek from inode read to data read is half the spindle. That's why
> inode64 is so much faster than inode32 on general workloads - the
> seek between inode and data is closer to the track-to-track seek
> time than the average seek time.

Totally concur 100%.

In fact, I've either obsoleted our 32-bit apps or made them local, since we run most everything off a NAS-type setup using XFS.

The benefits of inode64 in our environment were just too significant to sideline, especially with our older SATA 2 disks.

In fact, for slower disks I'd say inode64 is a must. But I'm talking about several disks in a RAID config, as I don't do single-disk XFS.

- aurf

