* file preallocation without unwritten flag being set
@ 2009-05-12 23:02 p v
  2009-05-13  0:04 ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: p v @ 2009-05-12 23:02 UTC (permalink / raw)
  To: xfs



Hello,

I need to create large files fast without initializing them - in the past I used these steps -

mkfs -t xfs -f -d unwritten=0 /dev/sda1
mount -t xfs -o noatime /dev/sda1 /hay
touch /hay/foo
xfs_io /hay/foo
xfs_io> resvsp 0 1024g
xfs_io> quit
ls -i /hay/foo
131 /hay/foo
umount /hay
xfs_db -x /dev/sda1
xfs_db> inode 131
xfs_db> write core.size 1099511627776
core.size = 1099511627776
xfs_db> q

But unwritten=0 is now rejected as an unrecognized option (was it deprecated?), so I tried to clear the unwritten extent flag directly -

xfs_db> a u.bmbt.ptrs[1]
xfs_db> write recs[1].extentflag 0
recs[1].extentflag = 1
xfs_db> 

It just won't change to 0 - is there any way to do this? Or is there a straightforward way to preallocate a large file and set its file size without the unwritten flags being turned on?

Thanks

Peter Vajgel



      

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: file preallocation without unwritten flag being set
  2009-05-12 23:02 file preallocation without unwritten flag being set p v
@ 2009-05-13  0:04 ` Eric Sandeen
  2009-05-13  4:34   ` p v
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-05-13  0:04 UTC (permalink / raw)
  To: p v; +Cc: xfs

p v wrote:
> 
> Hello,
> 
> I need to create large files fast without initializing them - in the
> past I used these steps -
> 
> mkfs -t xfs -f -d unwritten=0 /dev/sda1
> mount -t xfs -o noatime /dev/sda1 /hay
> touch /hay/foo
> xfs_io /hay/foo
> xfs_io> resvsp 0 1024g
> xfs_io> quit
> ls -i /hay/foo
> 131 /hay/foo
> umount /hay
> xfs_db -x /dev/sda1
> xfs_db> inode 131
> xfs_db> write core.size 1099511627776
> core.size = 1099511627776
> xfs_db> q

Is there a reason that you don't want the unwritten flag set?  (You know
that not using the unwritten extents feature exposes garbage from the
disk in this case?)

There may well be a legit reason but I just want to make sure you're
doing what you think you're doing :)

Thanks,
-Eric



* Re: file preallocation without unwritten flag being set
  2009-05-13  0:04 ` Eric Sandeen
@ 2009-05-13  4:34   ` p v
  2009-05-13  5:08     ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: p v @ 2009-05-13  4:34 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs




I want to avoid any metadata modifications while doing O_DIRECT reads (the fs is mounted with noatime). Right now I am doing it mostly for testing - I am seeing a performance degradation going from raw to xfs on a 10TB filesystem - probably due to my application but I am trying to narrow it down so I am starting with running randomio benchmark on raw - then 10TB file, then 10 1TB files, then 100 100GB files, ...

But in general certain applications can definitely take care of the preallocated space themselves (db, FB haystack, ...). What they require is minimal fragmentation, so they would prefer to preallocate the space (fill the whole fs with contiguous files) and then maintain app-specific metadata inside the files (such as valid offsets of initialized data, ...). What I would really like is the vxfs equivalent of these setext options -

setext -r <reservation> -f chgsize

On top of that, I would really love to have the vxfs equivalent of the "nomtime" mount option. Then with O_DIRECT I would have raw-like performance.

With the unwritten mkfs option I could get the setext semantics. So what's the trick (before I dive into the xfs layout)? I am guessing that there is no equivalent for nomtime option?

Thanks

Peter Vajgel






* Re: file preallocation without unwritten flag being set
  2009-05-13  4:34   ` p v
@ 2009-05-13  5:08     ` Eric Sandeen
  2009-05-13 21:05       ` p v
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-05-13  5:08 UTC (permalink / raw)
  To: p v; +Cc: xfs

p v wrote:
> 
> 
> I want to avoid any metadata modifications while doing O_DIRECT reads
> (the fs is mounted with noatime). Right now I am doing it mostly for
> testing - I am seeing a performance degradation going from raw to xfs
> on a 10TB filesystem - probably due to my application but I am trying
> to narrow it down so I am starting with running randomio benchmark on
> raw - then 10TB file, then 10 1TB files, then 100 100GB files, ...

you may want to try the inode64 mount option so the allocator is free to
roam your whole 10T ...

> But in general certain applications can definitely take care of the
> preallocated space (db, FB haystack, ...). 

Ok, so it sounds like you do understand the implications and you want to
be able to write into prealloc space without any metadata updates as
they are converted to initialized extents... :)

> What they require is
> minimal fragmentation so they would prefer to preallocate the space
> (fill the whole fs with contiguous files) and then maintain in-files
> app specific metadata (such as valid offsets of initialized data,
> ...). What I would really like is to have vxfs equivalent of setext
> options -
> 
> setext -r <reservation> -f chgsize
> 
> And on top of that I would really love to have the vxfs equivalent of
> the "nomtime" mount option. Then with O_DIRECT I have raw-like
> performance.
> 
> With the unwritten mkfs option I could get the setext semantics. So
> what's the trick (before I dive into the xfs layout)? I am guessing
> that there is no equivalent for nomtime option?

well, the unwritten=0 option did get removed:
http://git.kernel.org/?p=fs/xfs/xfsprogs-dev.git;a=commitdiff;h=8d537733f52a642d471f6781f32f306241dd4308

TBH I'm not entirely sure why.

The unwritten flag is per-filesystem not per-file; you can still clear
that feature bit:

#define XFS_SB_VERSION_EXTFLGBIT 0x1000

by using xfs_db in -x expert mode to rewrite every superblock's
"versionnum" without that bit set.

The xfs_db "version" command will give you a more textual representation
of what is actually set before & after.

You could script the sb rewrites...
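
A minimal sketch of scripting those superblock rewrites, under stated assumptions: the starting value 0xb4a4 is the versionnum reported later in this thread, the device /dev/sda1 and the AG count of 4 are only examples, and the helper names clear_extflg and print_sb_rewrites are made up for illustration. The script prints the xfs_db commands instead of running them, so the output can be reviewed before anything is written to an unmounted device.

```shell
#!/bin/sh
# XFS_SB_VERSION_EXTFLGBIT, as quoted above.
EXTFLGBIT=0x1000

# Print a versionnum value with the unwritten-extent feature bit cleared.
# $1 = current versionnum (for example 0xb4a4)
clear_extflg() {
    printf '0x%x\n' $(( $1 & ~$EXTFLGBIT ))
}

# Print one xfs_db command per superblock copy (one copy per AG).
# $1 = device, $2 = agcount (assumed known), $3 = current versionnum
print_sb_rewrites() {
    new=$(clear_extflg "$3")
    ag=0
    while [ "$ag" -lt "$2" ]; do
        printf 'xfs_db -x -c "sb %d" -c "write versionnum %s" %s\n' "$ag" "$new" "$1"
        ag=$(( ag + 1 ))
    done
}

# Example values only; nothing is executed against the device here.
print_sb_rewrites /dev/sda1 4 0xb4a4
```

The agcount would come from the "sb" / "print" output of xfs_db on the real filesystem, and the xfs_db "version" command can confirm the result before and after.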

For what it's worth, your xfs_db tricks below to preallocate seem a bit
... tricky.

This should suffice:

xfs_io -f /hay/foo
xfs_io> resvsp 0 1024g
xfs_io> truncate 1024g
xfs_io> quit

Oh and you're right, there's no "nomtime" option AFAIK.

-Eric

> Thanks
> 
> Peter Vajgel


* Re: file preallocation without unwritten flag being set
  2009-05-13  5:08     ` Eric Sandeen
@ 2009-05-13 21:05       ` p v
  2009-05-13 21:48         ` Eric Sandeen
  2009-05-13 22:28         ` Dave Chinner
  0 siblings, 2 replies; 11+ messages in thread
From: p v @ 2009-05-13 21:05 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs


It doesn't seem to work - I cleared extflg in the versionnum of the superblock (in every copy of it as well), but the flag is still set on all extents.

xfs_db> version
versionnum [0xb4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2
xfs_db> version 0xa4a4 0x8
versionnum [0xa4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,MOREBITS,ATTR2

typeset -i agcount=$(xfs_db -c "sb" -c "print" /dev/sda | grep agcount)
typeset -i i=0
while [[ $i != $agcount ]]
do      
        xfs_db -x -c "sb $i" -c "write versionnum 0xa4a4" /dev/sda
        i=i+1
done

And once I create the file, xfs_repair complains and resets the sb flag - my guess is that the extent allocation path is hardcoded for version 4 - any extent allocated beyond the file size will get the flag ...

Also - 2 questions -

1) What is inode64, and where can I find all of the undocumented mkfs/mount options? (It's unfortunate that such a good fs doesn't have correspondingly good documentation.)

2) Why is the largest extent size limited to xxx blocks (I can't find the number)? And when does the inode finally get flushed? ls -i reports 19 as the inode number, but even after unmounting, inode 19 in xfs_db still shows as a free inode - is it still only in the log??? I assumed that xfs_bmap gets me the correct number of extents, but now, looking at the inode with xfs_db, it's obvious that xfs_bmap reports contiguous ranges rather than the actual extents in the blockmap tree.

thx

Peter Vajgel





* Re: file preallocation without unwritten flag being set
  2009-05-13 21:05       ` p v
@ 2009-05-13 21:48         ` Eric Sandeen
  2009-05-13 22:28         ` Dave Chinner
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-05-13 21:48 UTC (permalink / raw)
  To: p v; +Cc: xfs

p v wrote:
> doesn't seem to work - I tried to clear the extflg in the versionnum
> of the superblock (in every copy of it as well) but it doesn't work.
> The flag is still set on all extents.
> 
> xfs_db> version
> versionnum [0xb4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,EXTFLG,MOREBITS,ATTR2
> xfs_db> version 0xa4a4 0x8
> versionnum [0xa4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,MOREBITS,ATTR2
> 
> typeset -i agcount=$(xfs_db -c "sb" -c "print" /dev/sda | grep agcount)
> typeset -i i=0
> while [[ $i != $agcount ]]
> do
>         xfs_db -x -c "sb $i" -c "write versionnum 0xa4a4" /dev/sda
>         i=i+1
> done
> 
> And once I make the file xfs_repair complains and resets the sb flag
> - my guess is that in the extent allocation path it is hardcoded for
> the version 4 - any extent allocated beyond file size will get the
> flag ...

Oh, you'd probably need to do this when there are no files already with
the flag, i.e. on a fresh fs I think.

> Also - 2 questions -
> 
> 1) what is inode64 and where can I find out all of the undocumented
> mkfs/mount options (it's unfortunate that such a good fs doesn't have
> a correspondingly good documentation)

all options for mkfs should be doc'd in the mkfs.xfs manpage

inode64 is also doc'd in my mount manpage:

       inode64
              Indicates that XFS is allowed to create inodes at any
              location in the filesystem, including those which will
              result in inode numbers occupying more than 32 bits of
              significance. This is provided for backwards compatibility,
              but causes problems for backup applications that cannot
              handle large inode numbers.

> 2) why is the largest extent size limited to xxx blocks(can't find
> out the number

... various containers that may limit the max size, I don't remember offhand

> - when does the inode get finally flushed? ls -i
> reports 19 as the inode number but even after unmounting inode 19 in
> xfs_db still shows a free inode - is it still only in the log???) ? I
> assumed that xfs_bmap gets me the correct number of extents but now
> looking at the inode with xfs_db it's obvious that xfs_bmap reports
> contiguous ranges rather than actual extents in the blockmap tree

hm, some cut & paste examples might be good here to show us exactly what
you're seeing.

-Eric


* Re: file preallocation without unwritten flag being set
  2009-05-13 21:05       ` p v
  2009-05-13 21:48         ` Eric Sandeen
@ 2009-05-13 22:28         ` Dave Chinner
  2009-05-13 23:51           ` p v
  1 sibling, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2009-05-13 22:28 UTC (permalink / raw)
  To: p v; +Cc: Eric Sandeen, xfs

On Wed, May 13, 2009 at 02:05:16PM -0700, p v wrote:
> 
> doesn't seem to work - I tried to clear the extflg in the
> versionnum of the superblock (in every copy of it as well) but it
> doesn't work. The flag is still set on all extents.

Sure - that xfs_db command only clears it from the superblock so
that *new* preallocations don't have the unwritten bit set. It
doesn't change existing allocations.

> And once I make the file xfs_repair complains and resets the sb
> flag - my guess is that in the extent allocation path it is
> hardcoded for the version 4.

More likely is that repair is seeing an existing unwritten extent
and setting the flag on the superblock.

> - any extent allocated beyond file size will get the flag .

Allocation beyond EOF does not use unwritten extents unless
it is preallocation.

> Also - 2 questions -
> 
> 1) what is inode64 and where can I find out all of the
> undocumented mkfs/mount options (it's unfortunate that such a good
> fs doesn't have a correspondingly good documentation)

All the options should be documented.  Try 'man mkfs.xfs', 'man 8
mount' and Documentation/filesystems/xfs.txt

> 2) why is the largest extent size limited to xxx blocks

2^21 blocks. Limited to that because there are 21 bits for
the extent size in the on disk extent record.

> (can't find
> out the number - when does the inode get finally flushed? ls -i
> reports 19 as the inode number but even after unmounting inode 19
> in xfs_db still shows a free inode - is it still only in the
> log???)

Might be, or you are seeing stale cached block device data
(xfs_db operates in a different address space to a mounted
filesystem). Try dropping the page cache and then re-read.

> ? I assumed that xfs_bmap gets me the correct number of
> extents but now looking at the inode with xfs_db it's obvious that
> xfs_bmap reports contiguous ranges rather than actual extents in
> the blockmap tree.

Sure it does. You can tell how many extents a specific range is from
their maximum size (e.g. one extent per 8GB for a 4k block size
filesystem).
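
The arithmetic behind those numbers, as a small sketch. It takes the 2^21-block limit quoted above at face value; the 1 TiB file size matches the file preallocated at the start of this thread, and the function names are made up for illustration.

```shell
#!/bin/sh
# Largest extent in bytes for a given block size, using the 2^21-block
# limit quoted above (21 bits of extent length in the on-disk record).
# $1 = filesystem block size in bytes
max_extent_bytes() {
    echo $(( (1 << 21) * $1 ))
}

# Minimum number of extent records needed to map a fully contiguous
# file, rounding up to whole extents.
# $1 = file size in bytes, $2 = filesystem block size in bytes
min_extents() {
    max=$(max_extent_bytes "$2")
    echo $(( ($1 + max - 1) / max ))
}

max_extent_bytes 4096            # 8589934592 bytes, i.e. 8 GiB
min_extents 1099511627776 4096   # 128 records for a 1 TiB file
```

So a single contiguous range reported by xfs_bmap can still correspond to many records when the inode is viewed with xfs_db.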

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: file preallocation without unwritten flag being set
  2009-05-13 22:28         ` Dave Chinner
@ 2009-05-13 23:51           ` p v
  2009-05-14  0:17             ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: p v @ 2009-05-13 23:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, xfs



I did it on a fresh filesystem (of course). It didn't make a difference - sb flags cleared, extent flags set, xfs_repair unhappy. I tried to repro again and cut/paste my steps, but I lost the machine. The only difference this time was that I was going to do it with the default mkfs and mount options.

I created the fs, cleared extflg from the superblocks and ran xfs_io to resvsp the space. Then I ran truncate, and truncate decided to initialize the extents to zero. Since it's 10TB, it's going to take a while (I can't reset, as it's a remote machine and xfs_io is looping in the kernel ...). It didn't do that before, and if I remember right the only differences were mkfs with 2048-byte inodes and the mount options noatime,nodiratime,inode64,allocsize=1g. Anyway - I'll try it again on a different machine and send the steps.

However, the fact that it did try to zero the reserved space tells me that the extent flags were not set this time - and unfortunately it also means that it won't work, unless I use the previous workaround: instead of calling truncate from xfs_io, I use xfs_db and set the inode size directly. In fact, now I remember that this was exactly the reason why the original steps were so tricky - truncating up would zero the extents, but xfs_db will set the inode size to whatever without any problem.

Thanks for the info regarding the max extent size.

The man pages I am looking at (FC4, Centos5) don't have the xfs options like allocsize, inode64. Probably should download the latest versions ...

I am a little bit lost about the comment regarding the page caches. I unmounted the filesystem before running xfs_db. Shouldn't that flush pages, buffers, ...? I assume that xfs_db goes directly to the device so if the fs was unmounted then the device should be up to date?

thx

Peter Vajgel





* Re: file preallocation without unwritten flag being set
  2009-05-13 23:51           ` p v
@ 2009-05-14  0:17             ` Eric Sandeen
  2009-05-14  0:34               ` Dave Chinner
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-05-14  0:17 UTC (permalink / raw)
  To: p v; +Cc: xfs

p v wrote:
> 
> I did it on a fresh filesystem (of course). It didn't make a
> difference - sb flags cleared, extent flags set, xfs_repair unhappy.

Strange, I don't see that when I test.

# dd if=/dev/urandom of=fsfile bs=1M count=64
# mkfs.xfs /dev/loop0
# for I in `seq 0 3`; do xfs_db -x /dev/loop0 -c "sb $I" -c "write versionnum 0xa4a4"; done
# mount /dev/loop0 mnt/
# xfs_io -f -c "truncate 1m" -c "resvsp 0 1m" mnt/file
# hexdump -C mnt/file | more
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00014000  09 d6 99 0d a7 43 a2 c9  95 ca 88 f6 4a 0c 93 8e  |.....C......J...|
00014010  ab b5 1a 1f c2 f3 2f 39  30 cc 8f 67 04 65 dd f1  |....../90..g.e..|
<morejunk>
# xfs_repair /dev/loop0
# xfs_db -c "version" /dev/loop0
versionnum [0xa4a4+0x8] = V4,NLINK,ALIGN,DIRV2,LOGV2,MOREBITS,ATTR2


> I tried to repro again and do cut/paste of my steps but I lost the
> machine. The only difference this time was that I was going to do it
> with the default mkfs and mount options. I created the fs, cleared
> extflg from the superblocks and run xfs_io to resvsp the space. Then
> I run truncate and truncate decided to initialize the extents to zero
> and since it's 10TB it's going to take a while (can't reset as it's a
> remote machine and xfs_io is looping in the kernel ...). It didn't do
> it before and if I remember right the only differences were mkfs with
> 2048 size inodes and mount options with
> noatime,nodiratime,inode64,allocsize=1g. Anyway - I'll try it again
> on a different machine and send the steps. However the fact that it
> did try to zero the reserved space tells me that the extent flags
> were not set this time - and unfortunately it also means that it
> won't work - unless I do the previous workaround and instead of
> calling truncate from xfs_io I'll do the xfs_db and set the inode
> size directly - in fact now I remember that was exactly the reason
> why the original steps were so tricky - truncate up would zero
> extents but xfs_db will set the inode size to whatever without any
> problem.

try truncate then resvsp; TBH not sure why it should matter though :)

> Thanks for the info regarding the max extent size.
> 
> The man pages I am looking at (FC4, Centos5) don't have the xfs
> options like allocsize, inode64. Probably should download the latest
> versions ...

those man pages are pretty old, yup.

> I am a little bit lost about the comment regarding the page caches. I
> unmounted the filesystem before running xfs_db. Shouldn't that flush
> pages, buffers, ...? I assume that xfs_db goes directly to the device
> so if the fs was unmounted then the device should be up to date?

The device is uptodate but the bdev address space may not be.

Unmounting will flush the filesystem address space, but not the block
device address space.

So yes unmount pushes everything to the disk, but the bdev address space
still has other cached data.

echo 3 > /proc/sys/vm/drop_caches

will force you to reread from disk.  (xfs_db uses buffered IO AFAIK)
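
Put together, the sequence might look like the sketch below. The device /dev/sda1 and mount point /hay are taken from the first message as examples, and the function name fresh_xfs_db_steps is hypothetical. It only prints the steps; the printed commands would need to be run as root.

```shell
#!/bin/sh
# Print the steps for inspecting a device with xfs_db after it has been
# mounted and modified, so the reads come from disk rather than from
# stale cached blocks.
# $1 = block device, $2 = mount point
fresh_xfs_db_steps() {
    printf 'umount %s\n' "$2"
    printf 'echo 3 > /proc/sys/vm/drop_caches\n'
    printf 'xfs_db -c "version" %s\n' "$1"
}

# Example values only.
fresh_xfs_db_steps /dev/sda1 /hay
```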

-Eric


* Re: file preallocation without unwritten flag being set
  2009-05-14  0:17             ` Eric Sandeen
@ 2009-05-14  0:34               ` Dave Chinner
  2009-05-14  0:41                 ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2009-05-14  0:34 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

On Wed, May 13, 2009 at 07:17:41PM -0500, Eric Sandeen wrote:
> p v wrote:
> > with the default mkfs and mount options. I created the fs, cleared
> > extflg from the superblocks and run xfs_io to resvsp the space. Then
> > I run truncate and truncate decided to initialize the extents to zero
> > and since it's 10TB it's going to take a while (can't reset as it's a
> > remote machine and xfs_io is looping in the kernel ...). It didn't do
......
> 
> try truncate then resvsp; TBH not sure why it should matter though :)

Uninitialised extents beyond EOF get zeroed when EOF is moved.
If you set the size before preallocation, then there are no extents
to zero. FWIW, if they have the unwritten flag, this zeroing does not occur.
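
A sketch of that ordering as a single xfs_io invocation, in the same -c form used earlier in the thread. The path /hay/foo and the 1024g size come from the first message; the function name prealloc_cmd is made up, and it prints the command rather than running it.

```shell
#!/bin/sh
# Print an xfs_io command that sets the file size first and reserves the
# space second, so there are no extents beyond EOF at the moment EOF moves.
# $1 = file, $2 = size in xfs_io notation (for example 1024g)
prealloc_cmd() {
    printf 'xfs_io -f -c "truncate %s" -c "resvsp 0 %s" %s\n' "$2" "$2" "$1"
}

# Example values only.
prealloc_cmd /hay/foo 1024g
```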

> > I am a little bit lost about the comment regarding the page caches. I
> > unmounted the filesystem before running xfs_db. Shouldn't that flush
> > pages, buffers, ...? I assume that xfs_db goes directly to the device
> > so if the fs was unmounted then the device should be up to date?
> 
> The device is uptodate but the bdev address space may not be.
> 
> > Unmounting will flush the filesystem address space, but not the block
> device address space.

Not exactly the problem, though. XFS opens its own device address space
when mounting - not the address space you get by opening /dev/sdX.
xfs_db uses the address space associated with /dev/sdX. Hence
if you do:

# xfs_db /dev/sdc
....
# mount /dev/sdc
<do some changes>
# umount /dev/sdc
# xfs_db /dev/sdc

The second invocation of xfs_db will not see any of the changes that
occurred to the filesystem because it will read from the buffers
cached on /dev/sdc during the first invocation.

This is the same problem Grub has....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: file preallocation without unwritten flag being set
  2009-05-14  0:34               ` Dave Chinner
@ 2009-05-14  0:41                 ` Eric Sandeen
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-05-14  0:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Dave Chinner wrote:

>> Unmounting will flush the filesystem address space, but not the block
>> device address space.
> 
> Not exactly the problem, though. XFS opens its own device address space
> when mounting - not the address space you get by opening /dev/sdX.
> xfs_db uses the address space associated with /dev/sdX. Hence
> if you do:
> 
> # xfs_db /dev/sdc
> ....
> # mount /dev/sdc
> <do some changes>
> # umount /dev/sdc
> # xfs_db /dev/sdc
> 
> The second invocation of xfs_db will not see any of the changes that
> occurred to the filesystem because it will read from the buffers
> cached on /dev/sdc during the first invocation.
> 
> This is the same problem Grub has....

We meant the same thing, even if I said it wrong ;)

-Eric
