Thank you both for the informative replies, much appreciated.

I ended up doing the tedious job of shuffling data around to get a more reasonable distance from ENOSPC, as there was no way I could free up the needed contiguous free space with my dataset.


Thanks,

André

On 20 June 2012 08:07, Dave Chinner <david@fromorbit.com> wrote:
On Tue, Jun 19, 2012 at 07:36:24AM -0500, Geoffrey Wehrman wrote:
> On Tue, Jun 19, 2012 at 02:05:34PM +0200, André Øien Langvand wrote:
> | Hi,
> |
> | I know there are quite a few posts regarding similar issues around, but I
> | can't seem to find a solution or at least an answer to why this is
> | happening in my case, so I thought I'd try the mailing list and I hope
> | that's okay.
> |
> | We have 2 file servers with identical hardware and identical configuration
> | (Dell R610's, H800 controllers, MD1200 DAS, RAID-5) set up with rsync to
> | mirror the contents. The content is music in several formats (from PCM WAV
> | to 64kbit AAC previews), which means file sizes of about 1-40 MB. Both
> | systems running SLES 11 SP1. Same kernel (2.6.32.59-0.3-default), same
> | xfsprogs version (xfsprogs-3.1.1-0.1.36).
> |
> | My example partition on the source now has 9.9G (of 9.1T) available space
> | and still doesn't give out-of-space errors. On the destination, however, it
> | won't allow me to use any of the remaining 51G. This is obviously a problem
> | when trying to do mirroring.
> |
> | Both file systems have been mounted with the inode64 option since first
> | mount, there are plenty of inodes available, and I've also verified that
> | there are no sparse files (find -type f -printf "%S\t%p\n" 2>/dev/null |
> | gawk '{if ($1 < 1.0) print $1 $2}'), just in case.
> |
> | I have tried repairing (xfs_repair), defragmenting (xfs_fsr) and altering
> | imaxpct without any luck. Rsync is run like this: # ionice -c3 rsync -rv
> | --size-only --progress --delete-before --inplace.
> |
> |
> | More detailed information on source file system:
> |
> | # df -k | grep sdg1
> | /dev/sdg1            9762777052 9752457156  10319896 100% /content/raid31
> |
> | # df -i | grep sdg1
> | /dev/sdg1            7471884 2311914 5159970   31% /content/raid31
> |
> | # xfs_info /dev/sdg1
> | meta-data=/dev/sdg1              isize=2048   agcount=10, agsize=268435424 blks
> |          =                       sectsz=512   attr=2
> | data     =                       bsize=4096   blocks=2441215991, imaxpct=5
> |          =                       sunit=16     swidth=80 blks
> | naming   =version 2              bsize=4096   ascii-ci=0
> | log      =internal               bsize=4096   blocks=521728, version=2
> |          =                       sectsz=512   sunit=16 blks, lazy-count=1
> | realtime =none                   extsz=4096   blocks=0, rtextents=0
> |
> | # xfs_db -r "-c freesp -s" /dev/sdg1
> |    from      to extents  blocks    pct
> |       1       1   69981   69981   2.99
> |       2       3  246574  559149  23.86
> |       4       7  315038 1707929  72.88
> |       8      15     561    6374   0.27
> | total free extents 632154
> | total free blocks 2343433
> | average free extent size 3.70706
> |
> |
> |
> | More detailed information on destination file system:
> |
> | # df -k | grep sdj1
> | /dev/sdj1            9762777052 9710148076  52628976 100% /content/sg08/vd08
> |
> | # df -i | grep sdj1
> | /dev/sdj1            28622264 2307776 26314488    9% /content/sg08/vd08
> |
> | # xfs_info /dev/sdj1
> | meta-data=/dev/sdj1              isize=2048   agcount=10, agsize=268435424 blks
> |          =                       sectsz=512   attr=2
> | data     =                       bsize=4096   blocks=2441215991, imaxpct=5
> |          =                       sunit=16     swidth=80 blks
> | naming   =version 2              bsize=4096   ascii-ci=0
> | log      =internal               bsize=4096   blocks=521728, version=2
> |          =                       sectsz=512   sunit=16 blks, lazy-count=1
> | realtime =none                   extsz=4096   blocks=0, rtextents=0
> |
> | # xfs_db -r "-c freesp -s" /dev/sdj1
> |    from      to extents  blocks    pct
> |       1       1   81761   81761   0.62
> |       2       3  530258 1147719   8.73
> |       4       7  675864 3551039  27.01
> |       8      15  743089 8363043  63.62
> |      16      31     102    1972   0.02
> | total free extents 2031074
> | total free blocks 13145534
> | average free extent size 6.47221
> |
> |
> | I would be grateful if anyone could shed some light on why this is
> | happening or maybe even provide a solution.
>
> You are using 2 KiB inodes, so an inode cluster (64 inodes) requires
> 128 KiB of contiguous space on disk.  The freesp output above shows that
> the largest possible contiguous free space chunk available is 31 * 4 KiB
> or 4 KiB short of 128 KiB.  You don't have enough contiguous space to
> create a new inode cluster, and your existing inodes are likely all
> used.  This can be verified using xfs_db:
>       xfs_db -r -c "sb" -c "p ifree" /dev/sdj1
>
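
Spelling that arithmetic out with the isize=2048 and bsize=4096 values from
the xfs_info output above (just a quick sketch of the numbers, nothing here
is specific to these systems beyond those two values):

      echo $((64 * 2048 / 1024))   # one inode cluster: 64 inodes * 2 KiB = 128 KiB
      echo $((31 * 4096 / 1024))   # largest extent possible in the freesp "16-31" row = 124 KiB
      # if the "p ifree" command above prints 0 (or close to it),
      # the existing inode clusters are effectively all used as well
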
> xfs_fsr does not defragment free space; it only makes the problem worse.
> A possible solution:
>   1.  mount the filesystem with the ikeep mount option
>   2.  delete a few large files to free up some contiguous space

large -contiguous- files. It's likely any files written recently
will be as fragmented as the free space....

>   3.  create a few thousand files to "preallocate" inodes
>   4.  delete the newly created files

That will work for a while, but it's really just a temporary
workaround until those "preallocated" inodes are exhausted. Normally
to recover from this situation you need to free 15-20% of the disk
space to allow sufficiently large contiguous free space extents to
reform naturally and allow the allocator to work at full efficiency
again....
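
On a 9.1T filesystem that is on the order of 1.4-1.8T kept free, e.g. roughly:

      echo $((9100 * 15 / 100)) $((9100 * 20 / 100))   # ~1365 to ~1820 GB of free space
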

> The ikeep mount option will prevent the space for inodes from being
> reused for other purposes.

The problem with using ikeep is that the remaining empty inode
chunks prevent free space from defragmenting itself fully as you
remove files from the filesystem.
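
If you do go the ikeep route, the steps Geoffrey describes might look roughly
like this on the destination (the "prealloc" directory name and the file count
are only illustrative, and this assumes you can unmount the filesystem briefly
to add the mount option):

      umount /content/sg08/vd08
      mount -o inode64,ikeep /dev/sdj1 /content/sg08/vd08
      # first delete a few large, contiguous files to free up space, then:
      mkdir /content/sg08/vd08/prealloc
      for i in $(seq 1 5000); do touch /content/sg08/vd08/prealloc/f$i; done
      rm -r /content/sg08/vd08/prealloc   # with ikeep, the freed inode chunks stay allocated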

Realistically, I think the problem is that you are running your
filesystems at near ENOSPC for extended periods of time. That is
guaranteed to fragment free space and any files that are written
when the filesystem is in this condition. As Geoffrey has said -
xfs_fsr will not fix your problems - only changing the way you use
your storage will prevent the problem from occurring again.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com