* fallocate bug?
@ 2012-05-07 12:44 Zhu Han
  2012-05-07 23:59 ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Zhu Han @ 2012-05-07 12:44 UTC (permalink / raw)
  To: xfs


It seems that XFS on CentOS 6.x occupies much more storage space than expected
if fallocate is used on a file. Here are the steps to reproduce it:

By the way, is it normal for the file to be moved around after the
preallocated region is filled with data?

$ uname -r
2.6.32-220.7.1.el6.x86_64

$ fallocate -n --offset 0 -l 1G file    # next, write a little more data than the preallocated size

$ xfs_bmap -p -vv file
file:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET              TOTAL FLAGS
   0: [0..2097151]:    2593408088..2595505239 21 (29420144..31517295) 2097152 10000

$ dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=fsync

$ xfs_bmap -p -vv file
file:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET              TOTAL FLAGS
   0: [0..4194303]:    2709184016..2713378319 22 (23101408..27295711) 4194304 00000

$ du -h --apparent-size file
1.1G    file

$ du -h file
2.0G

best regards,
韩竹(Zhu Han)

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: fallocate bug?
  2012-05-07 12:44 fallocate bug? Zhu Han
@ 2012-05-07 23:59 ` Dave Chinner
  2012-05-08  3:24   ` Zhu Han
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2012-05-07 23:59 UTC (permalink / raw)
  To: Zhu Han; +Cc: xfs

On Mon, May 07, 2012 at 08:44:17PM +0800, Zhu Han wrote:
> Seems like xfs of CentOS 6.X occupies much more storage space than desired
> if fallocate is used against the file. Here is the step to reproduce it:

Your test case is not doing what you think it is doing.

> By the way, is it normal when the file is moved around after the
> preallocated region is filled with data?
> 
> $ uname -r
> 2.6.32-220.7.1.el6.x86_64
> 
> $fallocate -n --offset 0 -l 1G file    ---->Write a little more data than
> the preallocated size
> 
> $ xfs_bmap -p -vv file
> file:
>  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> TOTAL FLAGS
>    0: [0..2097151]:    2593408088..2595505239 21 (29420144..31517295)
> 2097152 10000
> 
> $ dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=fsync

That does a truncate first, removing all the preallocated space. Use
conv=notrunc to avoid this. Hence the space allocated by this
new write is different to the space allocated by the above
preallocation. The file has not been moved, the filesystem just did
what you asked it to do.
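
The truncation can be checked without XFS or a large file. The following is
a minimal sketch, assuming GNU coreutils; the helper name size_after_dd and
the temporary file are illustrative and not part of the original report. It
overwrites the first 4 KiB of a 1 MiB file, once with conv=fsync alone and
once with notrunc added, and prints the resulting file size each time.

```shell
#!/bin/sh
# Sketch: show that dd truncates an existing output file unless
# conv=notrunc is given. Assumes GNU dd, stat and mktemp.

# size_after_dd CONV: create a 1 MiB file, overwrite its first 4 KiB with
# dd using conv=CONV, then print the resulting file size in bytes.
size_after_dd() {
    f=$(mktemp) || return 1
    dd if=/dev/zero of="$f" bs=1M count=1 2>/dev/null
    dd if=/dev/zero of="$f" bs=4k count=1 conv="$1" 2>/dev/null
    stat -c %s "$f"
    rm -f "$f"
}

echo "conv=fsync:         $(size_after_dd fsync) bytes"
echo "conv=notrunc,fsync: $(size_after_dd notrunc,fsync) bytes"
```

Without notrunc the file shrinks to the 4096 bytes just written, because dd
opens its output with O_TRUNC; with notrunc it stays at 1048576 bytes.
Applied to the reproduction above, the write step would become
"dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=notrunc,fsync".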

> 
> $ xfs_bmap -p -vv file
> file:
>  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> TOTAL FLAGS
>    0: [0..4194303]:    2709184016..2713378319 22 (23101408..27295711)
> 4194304 00000

And so now you've triggered the speculative delayed allocation
beyond EOF, which is normal behaviour. Hence there are currently
unused blocks beyond EOF which will get removed either when the next
close(fd) occurs on the file or the inode is removed from the cache.
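
To put numbers on this: xfs_bmap reports extents in 512-byte blocks, so the
4194304 blocks shown above are 4194304 * 512 = 2147483648 bytes (2 GiB),
while dd wrote 1026 MiB = 1075838976 bytes. The remaining 1022 MiB is space
allocated beyond EOF. The same comparison can be made for any file with a
rough sketch like the one below, which assumes GNU stat; the function names
are made up for illustration.

```shell
#!/bin/sh
# Sketch, assuming GNU stat: %s is the file size in bytes, %b the number of
# allocated blocks, and %B the size in bytes of each of those blocks.

# Print the apparent size (what ls -l and du --apparent-size report).
apparent_bytes() {
    stat -c %s "$1"
}

# Print the space actually allocated to the file (what plain du reports).
allocated_bytes() {
    stat -c '%b %B' "$1" | {
        read -r blocks blksize
        echo $(( blocks * blksize ))
    }
}

# Demo on a throwaway file. On XFS the second number may exceed the first
# while speculative preallocation is still attached to the inode.
demo=$(mktemp)
dd if=/dev/zero of="$demo" bs=1M count=1 conv=notrunc,fsync 2>/dev/null
echo "apparent:  $(apparent_bytes "$demo")"
echo "allocated: $(allocated_bytes "$demo")"
rm -f "$demo"
```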

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: fallocate bug?
  2012-05-07 23:59 ` Dave Chinner
@ 2012-05-08  3:24   ` Zhu Han
  2012-05-08  4:40     ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Zhu Han @ 2012-05-08  3:24 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@fromorbit.com> wrote:

> On Mon, May 07, 2012 at 08:44:17PM +0800, Zhu Han wrote:
> > Seems like xfs of CentOS 6.X occupies much more storage space than desired
> > if fallocate is used against the file. Here is the step to reproduce it:
>
> You test case is not doing what you think it is doing.
>

Thanks for pointing it out.



>
> > By the way, is it normal when the file is moved around after the
> > preallocated region is filled with data?
> >
> > $ uname -r
> > 2.6.32-220.7.1.el6.x86_64
> >
> > $fallocate -n --offset 0 -l 1G file    ---->Write a little more data than
> > the preallocated size
> >
> > $ xfs_bmap -p -vv file
> > file:
> >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> > TOTAL FLAGS
> >    0: [0..2097151]:    2593408088..2595505239 21 (29420144..31517295)
> > 2097152 10000
> >
> > $ dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=fsync
>
> That does a truncate first, removing all the preallocated space. Use
> conv=notrunc to avoid this. Hence the space allocated by this
> new write is different to the space allocated by the above
> preallocation. The file has not been moved, the filesystem just did
> what you asked it to do.
>
> >
> > $ xfs_bmap -p -vv file
> > file:
> >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> > TOTAL FLAGS
> >    0: [0..4194303]:    2709184016..2713378319 22 (23101408..27295711)
> > 4194304 00000
>
> And so now you've triggered the speculative delayed allocation
> beyond EOF, which is normal behaviour. Hence there are currently
> unused blocks beyond EOF which will get removed either when the next
> close(fd) occurs on the file or the inode is removed from the cache.
>

close(fd) should be called before dd exits, so why are the extra blocks
beyond EOF not freed?

The only way I found to remove the extra blocks is to truncate the file to its
real size.
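
The workaround mentioned here can be sketched as follows, assuming GNU stat
and truncate. The helper name trim_to_size is made up for illustration, and
whether a same-size truncate releases blocks beyond EOF depends on the
filesystem and kernel; this thread only reports that it did so on this
system.

```shell
#!/bin/sh
# trim_to_size FILE: truncate FILE to its current size. The data and the
# reported length are left unchanged; only space past EOF could be affected.
trim_to_size() {
    size=$(stat -c %s "$1") || return 1
    truncate -s "$size" "$1"
}
```

A call such as "trim_to_size file && du -h file" would then show whether the
allocated size dropped back towards the apparent size.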


>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>


* Re: fallocate bug?
  2012-05-08  3:24   ` Zhu Han
@ 2012-05-08  4:40     ` Dave Chinner
  2012-05-08  5:10       ` Zhu Han
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2012-05-08  4:40 UTC (permalink / raw)
  To: Zhu Han; +Cc: xfs

On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> 
> > On Mon, May 07, 2012 at 08:44:17PM +0800, Zhu Han wrote:
> > > Seems like xfs of CentOS 6.X occupies much more storage space than desired
> > > if fallocate is used against the file. Here is the step to reproduce it:
> >
> > You test case is not doing what you think it is doing.
> 
> Thanks for pointing it out.
> 
> > > By the way, is it normal when the file is moved around after the
> > > preallocated region is filled with data?
> > >
> > > $ uname -r
> > > 2.6.32-220.7.1.el6.x86_64
> > >
> > > $fallocate -n --offset 0 -l 1G file    ---->Write a little more data than
> > > the preallocated size
> > >
> > > $ xfs_bmap -p -vv file
> > > file:
> > >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> > > TOTAL FLAGS
> > >    0: [0..2097151]:    2593408088..2595505239 21 (29420144..31517295)
> > > 2097152 10000
> > >
> > > $ dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=fsync
> >
> > That does a truncate first, removing all the preallocated space. Use
> > conv=notrunc to avoid this. Hence the space allocated by this
> > new write is different to the space allocated by the above
> > preallocation. The file has not been moved, the filesystem just did
> > what you asked it to do.
> >
> > >
> > > $ xfs_bmap -p -vv file
> > > file:
> > >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> > > TOTAL FLAGS
> > >    0: [0..4194303]:    2709184016..2713378319 22 (23101408..27295711)
> > > 4194304 00000
> >
> > And so now you've triggered the speculative delayed allocation
> > beyond EOF, which is normal behaviour. Hence there are currently
> > unused blocks beyond EOF which will get removed either when the next
> > close(fd) occurs on the file or the inode is removed from the cache.
> >
> 
> Close(fd) should be invoked before dd quits. But why the extra blocks
> beyond EOF are not freed?

The removal is conditional on how many times the fd has been closed
with dirty data on the inode.

> The only way I found to remove the extra blocks is truncate the file to its
> real size.

If the close() didn't remove them, they will be removed when the
inode ages out of the cache. Why do you even care about them?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: fallocate bug?
  2012-05-08  4:40     ` Dave Chinner
@ 2012-05-08  5:10       ` Zhu Han
  2012-05-08  5:47         ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Zhu Han @ 2012-05-08  5:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


On Tue, May 8, 2012 at 12:40 PM, Dave Chinner <david@fromorbit.com> wrote:

> On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> > On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> >
> > > On Mon, May 07, 2012 at 08:44:17PM +0800, Zhu Han wrote:
> > > > Seems like xfs of CentOS 6.X occupies much more storage space than desired
> > > > if fallocate is used against the file. Here is the step to reproduce it:
> > >
> > > You test case is not doing what you think it is doing.
> >
> > Thanks for pointing it out.
> >
> > > > By the way, is it normal when the file is moved around after the
> > > > preallocated region is filled with data?
> > > >
> > > > $ uname -r
> > > > 2.6.32-220.7.1.el6.x86_64
> > > >
> > > > $fallocate -n --offset 0 -l 1G file    ---->Write a little more data than
> > > > the preallocated size
> > > >
> > > > $ xfs_bmap -p -vv file
> > > > file:
> > > >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> > > > TOTAL FLAGS
> > > >    0: [0..2097151]:    2593408088..2595505239 21 (29420144..31517295)
> > > > 2097152 10000
> > > >
> > > > $ dd if=/dev/zero of=/tmp/file bs=1M count=1026 conv=fsync
> > >
> > > That does a truncate first, removing all the preallocated space. Use
> > > conv=notrunc to avoid this. Hence the space allocated by this
> > > new write is different to the space allocated by the above
> > > preallocation. The file has not been moved, the filesystem just did
> > > what you asked it to do.
> > >
> > > >
> > > > $ xfs_bmap -p -vv file
> > > > file:
> > > >  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET
> > > > TOTAL FLAGS
> > > >    0: [0..4194303]:    2709184016..2713378319 22 (23101408..27295711)
> > > > 4194304 00000
> > >
> > > And so now you've triggered the speculative delayed allocation
> > > beyond EOF, which is normal behaviour. Hence there are currently
> > > unused blocks beyond EOF which will get removed either when the next
> > > close(fd) occurs on the file or the inode is removed from the cache.
> > >
> >
> > Close(fd) should be invoked before dd quits. But why the extra blocks
> > beyond EOF are not freed?
>
> The removal is conditional on how many times the fd has been closed
> with dirty data on the inode.
>
> > The only way I found to remove the extra blocks is truncate the file to its
> > real size.
>
> If the close() didn't remove them, they will be removed when the
> inode ages out of the cache. Why do you even care about them?
>

Our distributed system relies on the real length of files to account for
space usage. This behavior makes the accounting inaccurate.


>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>


* Re: fallocate bug?
  2012-05-08  5:10       ` Zhu Han
@ 2012-05-08  5:47         ` Dave Chinner
  2012-05-08 15:25           ` Zhu Han
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2012-05-08  5:47 UTC (permalink / raw)
  To: Zhu Han; +Cc: xfs

On Tue, May 08, 2012 at 01:10:55PM +0800, Zhu Han wrote:
> On Tue, May 8, 2012 at 12:40 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> > > On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > > And so now you've triggered the speculative delayed allocation
> > > > beyond EOF, which is normal behaviour. Hence there are currently
> > > > unused blocks beyond EOF which will get removed either when the next
> > > > close(fd) occurs on the file or the inode is removed from the cache.
> > > >
> > >
> > > Close(fd) should be invoked before dd quits. But why the extra blocks
> > > beyond EOF are not freed?
> >
> > The removal is conditional on how many times the fd has been closed
> > with dirty data on the inode.
> >
> > > The only way I found to remove the extra blocks is truncate the file to its
> > > real size.
> >
> > If the close() didn't remove them, they will be removed when the
> > inode ages out of the cache. Why do you even care about them?
> 
> Our distributed system depends on the real length of files to account the
> space usage.

That's ..... naive. It's never been valid to assume that the file
size is an accurate reflection of space usage, especially as it will
*always* be wrong for sparse files. In the same light, you also
cannot assume that it is an accurate reflection for non-sparse files
because we can do both explicit and speculative allocation beyond
EOF which only du will show. Not to mention that metadata is not
accounted in the file length, and that can consume a significant
amount of space, too.
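
The gap between the two numbers is easy to see over a whole directory tree.
What follows is only a sketch, assuming GNU find and an awk implementation;
the function name usage_summary is hypothetical. It sums file lengths and
allocated blocks separately, so the two totals can be compared.

```shell
#!/bin/sh
# usage_summary DIR: print "<apparent bytes> <allocated bytes>" summed over
# all regular files below DIR. Assumes GNU find, whose -printf gives %s as
# the size in bytes and %b as the allocation in 512-byte blocks.
usage_summary() {
    find "$1" -type f -printf '%s %b\n' |
        awk '{ apparent += $1; allocated += $2 * 512 }
             END { printf "%.0f %.0f\n", apparent, allocated }'
}

# Demo: a sparse file whose length says 1 GiB while very little is allocated.
dir=$(mktemp -d)
truncate -s 1G "$dir/sparse"
usage_summary "$dir"
rm -rf "$dir"
```

For the sparse demo file the first total is 1073741824 and the second is
typically at or near zero. With allocation beyond EOF the relationship goes
the other way, and neither total includes filesystem metadata.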

> This behavior make the account inaccurate.

The block usage reported by XFS is both accurate and correct. The
file size reported by XFS is both accurate and correct. Your
"account inaccuracy" comes from assuming that they are the same. Perhaps you
should be using quotas for accurate space usage accounting?

Anyway, if you really want to stop speculative delayed allocation
beyond EOF, then use the allocsize mount option to control it.
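
As an illustration only, an /etc/fstab entry using this option might look
like the fragment below. The device /dev/sdb1 and the mount point /data are
hypothetical, and 64k is an arbitrary example value; allocsize takes a
power-of-two size and fixes the end-of-file preallocation used for buffered
writes at that size.

```
# /etc/fstab fragment: hypothetical device and mount point, example value
/dev/sdb1  /data  xfs  defaults,allocsize=64k  0  0
```

The equivalent one-off command would be
"mount -o allocsize=64k /dev/sdb1 /data", with the same hypothetical names.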

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: fallocate bug?
  2012-05-08  5:47         ` Dave Chinner
@ 2012-05-08 15:25           ` Zhu Han
  2012-05-08 22:31             ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Zhu Han @ 2012-05-08 15:25 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


On Tue, May 8, 2012 at 1:47 PM, Dave Chinner <david@fromorbit.com> wrote:

> On Tue, May 08, 2012 at 01:10:55PM +0800, Zhu Han wrote:
> > On Tue, May 8, 2012 at 12:40 PM, Dave Chinner <david@fromorbit.com> wrote:
> > > On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> > > > On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > > > And so now you've triggered the speculative delayed allocation
> > > > > beyond EOF, which is normal behaviour. Hence there are currently
> > > > > unused blocks beyond EOF which will get removed either when the next
> > > > > close(fd) occurs on the file or the inode is removed from the cache.
> > > > >
> > > >
> > > > Close(fd) should be invoked before dd quits. But why the extra blocks
> > > > beyond EOF are not freed?
> > >
> > > The removal is conditional on how many times the fd has been closed
> > > with dirty data on the inode.
> > >
> > > > The only way I found to remove the extra blocks is truncate the file to its
> > > > real size.
> > >
> > > If the close() didn't remove them, they will be removed when the
> > > inode ages out of the cache. Why do you even care about them?
> >
> > Our distributed system depends on the real length of files to account the
> > space usage.
>
> That's ..... naive. It's never been valid to assume that the file
> size is an accurate reflection of space usage, especially as it will
> *always* be wrong for sparse files. In the same light, you also
> cannot assume that it is an accurate reflection for non-sparse files
> because we can do both explicit and speculative allocation beyond
> EOF which only du will show. Not to mention that metadata is not
> accounted in the file length, and that can consume a significant
> amount of space, too.
>
> > This behavior make the account inaccurate.
>
> The block usage reported by XFS is both accurate and correct. The
> file size reported by XFS is both accurate and correct. You're
> "account inaccuracy" is assuming that they are the same. Perhaps you
> should be using quotas for accurate space usage accounting?
>
> Anyway, if you really want to stop speculative delayed allocation
> beyond EOF, then use the allocsize mount option to control it.
>


Thanks for the help.

I can control the size of the preallocation so that no data is written beyond
the preallocated block range, and so no speculative allocation is triggered.
Besides that, our system can periodically sync the accurate space usage of the
mount point.

Can you give any hints about the most lightweight way to get the
accurate block usage of the whole file system?


>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>


* Re: fallocate bug?
  2012-05-08 15:25           ` Zhu Han
@ 2012-05-08 22:31             ` Dave Chinner
  2012-05-09  1:43               ` Zhu Han
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2012-05-08 22:31 UTC (permalink / raw)
  To: Zhu Han; +Cc: xfs

On Tue, May 08, 2012 at 11:25:05PM +0800, Zhu Han wrote:
> On Tue, May 8, 2012 at 1:47 PM, Dave Chinner <david@fromorbit.com> wrote:
> 
> > On Tue, May 08, 2012 at 01:10:55PM +0800, Zhu Han wrote:
> > > On Tue, May 8, 2012 at 12:40 PM, Dave Chinner <david@fromorbit.com> wrote:
> > > > On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> > > > > On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > > > > And so now you've triggered the speculative delayed allocation
> > > > > > beyond EOF, which is normal behaviour. Hence there are currently
> > > > > > unused blocks beyond EOF which will get removed either when the next
> > > > > > close(fd) occurs on the file or the inode is removed from the cache.
> > > > > >
> > > > >
> > > > > Close(fd) should be invoked before dd quits. But why the extra blocks
> > > > > beyond EOF are not freed?
> > > >
> > > > The removal is conditional on how many times the fd has been closed
> > > > with dirty data on the inode.
> > > >
> > > > > The only way I found to remove the extra blocks is truncate the file to its
> > > > > real size.
> > > >
> > > > If the close() didn't remove them, they will be removed when the
> > > > inode ages out of the cache. Why do you even care about them?
> > >
> > > Our distributed system depends on the real length of files to account the
> > > space usage.
> >
> > That's ..... naive. It's never been valid to assume that the file
> > size is an accurate reflection of space usage, especially as it will
> > *always* be wrong for sparse files. In the same light, you also
> > cannot assume that it is an accurate reflection for non-sparse files
> > because we can do both explicit and speculative allocation beyond
> > EOF which only du will show. Not to mention that metadata is not
> > accounted in the file length, and that can consume a significant
> > amount of space, too.
> >
> > > This behavior make the account inaccurate.
> >
> > The block usage reported by XFS is both accurate and correct. The
> > file size reported by XFS is both accurate and correct. You're
> > "account inaccuracy" is assuming that they are the same. Perhaps you
> > should be using quotas for accurate space usage accounting?
> >
> > Anyway, if you really want to stop speculative delayed allocation
> > beyond EOF, then use the allocsize mount option to control it.
> >
> 
> 
> Thanks for help.
> 
> I can control the size of pre-allocation, so no data are written beyond the
> pre-allocated block range, so no speculative allocation is triggered.
> Besides it, our system can sync the accurate space usage of mount point
> periodically.
> 
> Can you give any hints about the most lightweight approach to get the
> accurate block usage of whole file system?

If you are just after the whole filesystem, then statfs(2) will give
you blocks used and free. If you are after a finer breakdown, then
quotas are probably what you want - they can be used for accounting
separately to the space limiting enforcement. Hence you get
accurate, up-to-date per user, group or project space accounting
without actually limiting space usage at all.
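
From the shell, the statfs(2) numbers are available through GNU stat's -f
mode. The following is a minimal sketch under that assumption; the function
name fs_used_bytes is made up for illustration. It derives the used space of
the filesystem containing a path from the total and free block counts.

```shell
#!/bin/sh
# fs_used_bytes PATH: print the bytes in use on the filesystem holding PATH.
# Assumes GNU stat: with -f, %S is the fundamental block size, %b the total
# number of data blocks and %f the number of free blocks.
fs_used_bytes() {
    stat -f -c '%S %b %f' "$1" | {
        read -r bsize total free
        echo $(( bsize * (total - free) ))
    }
}

fs_used_bytes /
```

This is a single system call per filesystem rather than a walk over every
file, which is why it suits a periodic sync of mount point usage.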

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: fallocate bug?
  2012-05-08 22:31             ` Dave Chinner
@ 2012-05-09  1:43               ` Zhu Han
  0 siblings, 0 replies; 9+ messages in thread
From: Zhu Han @ 2012-05-09  1:43 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


Thank you so much for your kind help.

best regards,
韩竹(Zhu Han)



