* specify agsize?
@ 2013-07-14  0:11 aurfalien
  2013-07-14  2:13 ` Eric Sandeen
  0 siblings, 1 reply; 15+ messages in thread
From: aurfalien @ 2013-07-14  0:11 UTC (permalink / raw)
  To: xfs

Hello again,

I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.

So I do;

mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data

And I get;

meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=6701716480, imaxpct=5
         =                       sunit=32     swidth=448 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=131072, version=2
         =                       sectsz=512   sunit=32 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


All is fine but I was recently made aware of tweaking agsize.  So I would like to mess around and iozone any diffs between the above agcount of 32 and whatever agcount changes I may do.
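
For example, something along these lines (agcount=64 here is only an arbitrary alternative to compare against the default, not a recommendation):

mkfs.xfs -f -l size=512m -d su=128k,sw=14,agcount=64 /dev/mapper/vg_doofus_data-lv_data

(mkfs.xfs also takes -d agsize= instead of agcount=; the two are mutually exclusive.)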

I didn't see any mention of agsize/agcount on the XFS FAQ and would like to know, based on the above, why does XFS think I have 32 allocation groups with the corresponding size?  And are these optimal numbers?

Thanks in advance,

- aurf


* Re: specify agsize?
  2013-07-14  0:11 specify agsize? aurfalien
@ 2013-07-14  2:13 ` Eric Sandeen
  2013-07-14  4:20   ` aurfalien
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2013-07-14  2:13 UTC (permalink / raw)
  To: aurfalien; +Cc: xfs

On 7/13/13 7:11 PM, aurfalien wrote:
> Hello again,
> 
> I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
> 
> So I do;
> 
> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
> 
> And I get;
> 
> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>          =                       sectsz=512   attr=2, projid32bit=0
> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>          =                       sunit=32     swidth=448 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal log           bsize=4096   blocks=131072, version=2
>          =                       sectsz=512   sunit=32 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> 
> All is fine but I was recently made aware of tweaking agsize.

Made aware by what?  For what reason?

> So I would like to mess around and iozone any diffs between the above
> agcount of 32 and whatever agcount changes I may do.

Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.

> I didn't see any mention of agsize/agcount on the XFS FAQ and would
> like to know, based on the above, why does XFS think I have 32
> allocation groups with the corresponding size?

It doesn't think so, it _knows_ so, because it made them itself.  ;)

> And are these optimal
> numbers?

How high is up?

Here's the appropriate faq entry:

http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E

-Eric
 
> Thanks in advance,
> 
> - aurf
> 

* Re: specify agsize?
  2013-07-14  2:13 ` Eric Sandeen
@ 2013-07-14  4:20   ` aurfalien
  2013-07-14  7:06     ` Stan Hoeppner
  2013-07-14 16:14     ` Eric Sandeen
  0 siblings, 2 replies; 15+ messages in thread
From: aurfalien @ 2013-07-14  4:20 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs


On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:

> On 7/13/13 7:11 PM, aurfalien wrote:
>> Hello again,
>> 
>> I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
>> 
>> So I do;
>> 
>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
>> 
>> And I get;
>> 
>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>         =                       sectsz=512   attr=2, projid32bit=0
>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>         =                       sunit=32     swidth=448 blks
>> naming   =version 2              bsize=4096   ascii-ci=0
>> log      =internal log           bsize=4096   blocks=131072, version=2
>>         =                       sectsz=512   sunit=32 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>> 
>> 
>> All is fine but I was recently made aware of tweaking agsize.
> 
> Made aware by what?  For what reason?

Autodesk has this software called Flame which requires very very fast local storage using XFS.  They have an entire write up on how to calc proper agsize for optimal performance.

I never mess with agsize but it is required when creating the XFS file system for use with Flame.  I realize it's tailored for their app's particular IO characteristics, so I'm curious about it.

>> So I would like to mess around and iozone any diffs between the above
>> agcount of 32 and whatever agcount changes I may do.
> 
> Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.

Well, it will give me a baseline comparison of non-tweaked agsize vs tweaked agsize.

>> I didn't see any mention of agsize/agcount on the XFS FAQ and would
>> like to know, based on the above, why does XFS think I have 32
>> allocation groups with the corresponding size?
> 
> It doesn't think so, it _knows_ so, because it made them itself.  ;)

Yea but based on what?

Why 32 at their current size?

>> And are these optimal
>> numbers?
> 
> How high is up?
> 
> Here's the appropriate faq entry:
> 
> http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E

Problem is I run Centos so the line;

"As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS. "

... doesn't really apply.

- aurf

* Re: specify agsize?
  2013-07-14  4:20   ` aurfalien
@ 2013-07-14  7:06     ` Stan Hoeppner
  2013-07-14 16:56       ` aurfalien
  2013-07-15  1:07       ` Dave Chinner
  2013-07-14 16:14     ` Eric Sandeen
  1 sibling, 2 replies; 15+ messages in thread
From: Stan Hoeppner @ 2013-07-14  7:06 UTC (permalink / raw)
  To: aurfalien; +Cc: Eric Sandeen, xfs

On 7/13/2013 11:20 PM, aurfalien wrote:
...
>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
...
>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>>         =                       sectsz=512   attr=2, projid32bit=0
>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>>         =                       sunit=32     swidth=448 blks
>>> naming   =version 2              bsize=4096   ascii-ci=0
>>> log      =internal log           bsize=4096   blocks=131072, version=2
>>>         =                       sectsz=512   sunit=32 blks, lazy-count=1
>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
...
> Autodesk has this software called Flame which requires very very fast local storage using XFS.  

If "Flame" does any random writes then you probably shouldn't be using
RAID6.

> They have an entire write up on how to calc proper agsize for optimal performance.

I think you're confused.  Maximum agsize is 1TB.  Making your AGs
smaller than that won't increase application performance, so it's
literally impossible to tune agsize to increase performance.  agcount on
the other hand can potentially have an effect if the application is
sufficiently threaded.  But agcount doesn't mean anything in isolation.
It's tied directly to the characteristics of the RAID level and
hardware.  For example, mkfs.xfs gave you 32 AGs for this 14 spindle
array.  One could make 32 AGs on a single 4TB SATA disk and the
performance of the two will be radically different.

...
> Well, it will give me a base line comparison of non tweaked agsize vs tweaked agsize.

No, it won't.  See above.

> Yea but based on what?

Based on the fact that your XFS is ~26TB.

mkfs.xfs could have given you 26 AGs of ~1TB each.  But it chose to give
you 32 AGs of ~815GB each.  Whether you run bonnie, iozone, or your
Flame application, you won't be able to measure a meaningful difference,
if any difference, between 26 and 32 AGs.

...
> Problem is I run Centos so the line;
> 
> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS. "
> 
> ... doesn't really apply.

This makes no sense.  What doesn't apply?

You can change to noop or deadline with a single echo command in a
startup script:

echo noop > /sys/block/sdX/queue/scheduler

where sdX is the name of your RAID device.
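
The currently active scheduler shows up in square brackets if you read that
same file back, e.g. (sdb below is just a placeholder device name):

cat /sys/block/sdb/queue/scheduler
echo deadline > /sys/block/sdb/queue/scheduler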

-- 
Stan



* Re: specify agsize?
  2013-07-14  4:20   ` aurfalien
  2013-07-14  7:06     ` Stan Hoeppner
@ 2013-07-14 16:14     ` Eric Sandeen
  2013-07-14 16:46       ` aurfalien
                         ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Eric Sandeen @ 2013-07-14 16:14 UTC (permalink / raw)
  To: aurfalien; +Cc: xfs

On 7/13/13 11:20 PM, aurfalien wrote:
> 
> On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:
> 
>> On 7/13/13 7:11 PM, aurfalien wrote:
>>> Hello again,
>>>
>>> I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
>>>
>>> So I do;
>>>
>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
>>>
>>> And I get;
>>>
>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>>         =                       sectsz=512   attr=2, projid32bit=0
>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>>         =                       sunit=32     swidth=448 blks
>>> naming   =version 2              bsize=4096   ascii-ci=0
>>> log      =internal log           bsize=4096   blocks=131072, version=2
>>>         =                       sectsz=512   sunit=32 blks, lazy-count=1
>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>
>>>
>>> All is fine but I was recently made aware of tweaking agsize.
>>
>> Made aware by what?  For what reason?
> 
> Autodesk has this software called Flame which requires very very fast
> local storage using XFS. They have an entire write up on how to calc
> proper agsize for optimal performance.

http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199

I guess?

That's quite a procedure!  And I have to say, a slightly strange one at first glance.

It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.

In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.

> I never mess with agsize but it is require  when creating the XFS
> file system for use with Flame.  I realize its tailored for there
> apps particular IO characteristics, so I'm curious about it.

In general more AGs allow more concurrency for some operations;
it also will generally change how/where files in multiple directories get
allocated.
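
If you want to re-check the AG geometry of a filesystem that already exists,
xfs_info on the mount point prints the same summary mkfs did (the mount point
below is just an example):

xfs_info /mnt/data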

>>> So I would like to mess around and iozone any diffs between the above
>>> agcount of 32 and whatever agcount changes I may do.
>>
>> Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.
> 
> Well, it will give me a base line comparison of non tweaked agsize vs tweaked agsize.

Not necessarily, see above; I'm not sure what iozone invocation would
show any effects from more or fewer AGs.  Anyway, iozone != flame, not
by a long shot! :)

>>> I didn't see any mention of agsize/agcount on the XFS FAQ and would
>>> like to know, based on the above, why does XFS think I have 32
>>> allocation groups with the corresponding size?
>>
>> It doesn't think so, it _knows_ so, because it made them itself.  ;)
> 
> Yea but based on what?
> 
> Why 32 at there current size?

see calc_default_ag_geometry()

Since you are in multidisk mode (you have stripe geometry) it uses more AGs, since it knows you have more spindles:

        } else if (dblocks > GIGABYTES(512, blocklog))
                shift = 5;

2^5 = 32

If you hadn't been in multidisk mode you would have gotten 25 AGs due to the max AG size of 1T.
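
Worked through with your numbers, that's simply:

    6701716480 blocks / 32 AGs = 209428640 blocks per AG
    209428640 blocks * 4096 bytes = roughly 800 GiB per AG

which matches the agsize=209428640 blks in your mkfs output.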

>>> And are these optimal
>>> numbers?
>>
>> How high is up?
>>
>> Here's the appropriate faq entry:
>>
>> http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E
> 
> Problem is I run Centos so the line;
> 
> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS. "
> 
> ... doesn't really apply.

Well, my point was that your original question, "are these optimal numbers?" included absolutely no context of your workload, so the best answer is yes - the default mkfs behavior is optimal for a generic, unspecified workload.

I don't have access to Autodesk Flame so I really don't know how it behaves or what an optimal tuning might be.

Anyway, I think the calc_default_ag_geometry() info above answered your original question of "why does XFS think I have 32 allocation groups with the corresponding size?" - that's simply the default mkfs algorithm when in multidisk mode, for a disk of this size.

-Eric


* Re: specify agsize?
  2013-07-14 16:14     ` Eric Sandeen
@ 2013-07-14 16:46       ` aurfalien
  2013-07-14 17:14       ` aurfalien
  2013-07-14 22:08       ` Stan Hoeppner
  2 siblings, 0 replies; 15+ messages in thread
From: aurfalien @ 2013-07-14 16:46 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs

Sorry to top post.

But this was exactly the kind of info I was hoping for.

Thanks Eric.

- aurf

On Jul 14, 2013, at 9:14 AM, Eric Sandeen wrote:

> On 7/13/13 11:20 PM, aurfalien wrote:
>> 
>> On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:
>> 
>>> On 7/13/13 7:11 PM, aurfalien wrote:
>>>> Hello again,
>>>> 
>>>> I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
>>>> 
>>>> So I do;
>>>> 
>>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
>>>> 
>>>> And I get;
>>>> 
>>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>>>        =                       sectsz=512   attr=2, projid32bit=0
>>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>>>        =                       sunit=32     swidth=448 blks
>>>> naming   =version 2              bsize=4096   ascii-ci=0
>>>> log      =internal log           bsize=4096   blocks=131072, version=2
>>>>        =                       sectsz=512   sunit=32 blks, lazy-count=1
>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>> 
>>>> 
>>>> All is fine but I was recently made aware of tweaking agsize.
>>> 
>>> Made aware by what?  For what reason?
>> 
>> Autodesk has this software called Flame which requires very very fast
>> local storage using XFS. They have an entire write up on how to calc
>> proper agsize for optimal performance.
> 
> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
> 
> I guess?
> 
> That's quite a procedure!  And I have to say, a slightly strange one at first glance.
> 
> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.
> 
> In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.
> 
>> I never mess with agsize but it is require  when creating the XFS
>> file system for use with Flame.  I realize its tailored for there
>> apps particular IO characteristics, so I'm curious about it.
> 
> In general more AGs allow more concurrency for some operations;
> it also will generally change how/where files in multiple directories get
> allocated.
> 
>>>> So I would like to mess around and iozone any diffs between the above
>>>> agcount of 32 and whatever agcount changes I may do.
>>> 
>>> Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.
>> 
>> Well, it will give me a base line comparison of non tweaked agsize vs tweaked agsize.
> 
> Not necessarily, see above; I'm not sure what iozone invocation would
> show any effects from more or fewer AGs.  Anyway, iozone != flame, not
> by a long shot! :)
> 
>>>> I didn't see any mention of agsize/agcount on the XFS FAQ and would
>>>> like to know, based on the above, why does XFS think I have 32
>>>> allocation groups with the corresponding size?
>>> 
>>> It doesn't think so, it _knows_ so, because it made them itself.  ;)
>> 
>> Yea but based on what?
>> 
>> Why 32 at there current size?
> 
> see calc_default_ag_geometry()
> 
> Since you are in multidisk mode (you have stripe geometry) it uses more AGs for more AGs since it knows you have more spindles:
> 
>        } else if (dblocks > GIGABYTES(512, blocklog))
>                shift = 5;
> 
> 2^5 = 32
> 
> If you hadn't been in multidisk mode you would have gotten 25 AGs due to the max AG size of 1T.
> 
>>>> And are these optimal
>>>> numbers?
>>> 
>>> How high is up?
>>> 
>>> Here's the appropriate faq entry:
>>> 
>>> http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E
>> 
>> Problem is I run Centos so the line;
>> 
>> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS. "
>> 
>> ... doesn't really apply.
> 
> Well, my point was that your original question, "are these optimal numbers?" included absolutely no context of your workload, so the best answer is yes - the default mkfs behavior is optimal for a generic, unspecified workload.
> 
> I don't have access to Autodesk Flame so I really don't know how it behaves or what an optimal tuning might be.
> 
> Anyway, I think the calc_default_ag_geometry() info above answered your original question of "why does XFS think I have 32 allocation groups with the corresponding size?" - that's simply the default mkfs algorithm when in multidisk mode, for a disk of this size.
> 
> -Eric
> 


* Re: specify agsize?
  2013-07-14  7:06     ` Stan Hoeppner
@ 2013-07-14 16:56       ` aurfalien
  2013-07-15  1:07       ` Dave Chinner
  1 sibling, 0 replies; 15+ messages in thread
From: aurfalien @ 2013-07-14 16:56 UTC (permalink / raw)
  To: stan; +Cc: Eric Sandeen, xfs


On Jul 14, 2013, at 12:06 AM, Stan Hoeppner wrote:

> On 7/13/2013 11:20 PM, aurfalien wrote:
> ...
>>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
> ...
>>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>>>        =                       sectsz=512   attr=2, projid32bit=0
>>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>>>        =                       sunit=32     swidth=448 blks
>>>> naming   =version 2              bsize=4096   ascii-ci=0
>>>> log      =internal log           bsize=4096   blocks=131072, version=2
>>>>        =                       sectsz=512   sunit=32 blks, lazy-count=1
>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
> ...
>> Autodesk has this software called Flame which requires very very fast local storage using XFS.  
> 
> If "Flame" does any random writes then you probably shouldn't be using
> RAID6.
> 
>> They have an entire write up on how to calc proper agsize for optimal performance.
> 
> I think you're confused.  Maximum agsize is 1TB.  Making your AGs
> smaller than that won't decrease application performance, so it's
> literally impossible to tune agsize to increase performance.  agcount on
> the other hand can potentially have an effect if the application is
> sufficiently threaded.  But agcount doesn't mean anything in isolation.
> It's tied directly to the characteristics of the RAID level and
> hardware.  For example, mkfs.xfs gave you 32 AGs for this 14 spindle
> array.  One could make 32 AGs on a single 4TB SATA disk and the
> performance difference between the two will be radically different.
> 
> ...
>> Well, it will give me a base line comparison of non tweaked agsize vs tweaked agsize.
> 
> No, it won't.  See above.
> 
>> Yea but based on what?
> 
> Based on the fact that your XFS is ~26TB.
> 
> mkfs.xfs could have given you 26 AGs of ~1TB each.  But it chose to give
> you 32 AGs of ~815GB each.  Whether you run bonnie, iozone, or your
> Flame application, you won't be able to measure a meaningful difference,
> if any difference, between 26 and 32 AGs.
> 
> ...
>> Problem is I run Centos so the line;
>> 
>> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS. "
>> 
>> ... doesn't really apply.
> 
> This makes no sense.  What doesn't apply?

Well, I had assumed it meant Linux kernel version 3.2.12, whereas CentOS is at whatever RHEL is at, which is 2.6.32.

At any rate, what I'm getting from you all is to leave the agcount alone, as agsize will max out at 1TB and agcount will adjust depending on volume size.

This volume will encounter a lot of random IO, so 32 AGs will suffice at any rate.  Unsure if increasing it to Autodesk's 128 will really help my env.  I'm assuming they want a lot of parallelism, which again doesn't apply in my case.
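
For what it's worth, mkfs.xfs can also print the geometry it would use without actually writing anything, which makes comparing variants cheap; the agcount=128 below just mirrors the Autodesk figure:

mkfs.xfs -N -f -l size=512m -d su=128k,sw=14,agcount=128 /dev/mapper/vg_doofus_data-lv_data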

> You can change to noop or deadline with a single echo command in a
> startup script:
> 
> echo noop > /sys/block/sdX/queue/scheduler
> 
> where sdX is the name of your RAID device.
> 
> -- 
> Stan
> 
> 

- aurf

* Re: specify agsize?
  2013-07-14 16:14     ` Eric Sandeen
  2013-07-14 16:46       ` aurfalien
@ 2013-07-14 17:14       ` aurfalien
  2013-07-15  1:22         ` Dave Chinner
  2013-07-14 22:08       ` Stan Hoeppner
  2 siblings, 1 reply; 15+ messages in thread
From: aurfalien @ 2013-07-14 17:14 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs


On Jul 14, 2013, at 9:14 AM, Eric Sandeen wrote:

> On 7/13/13 11:20 PM, aurfalien wrote:
>> 
>> On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:
>> 
>>> On 7/13/13 7:11 PM, aurfalien wrote:
>>>> Hello again,
>>>> 
>>>> I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
>>>> 
>>>> So I do;
>>>> 
>>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
>>>> 
>>>> And I get;
>>>> 
>>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>>>        =                       sectsz=512   attr=2, projid32bit=0
>>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>>>        =                       sunit=32     swidth=448 blks
>>>> naming   =version 2              bsize=4096   ascii-ci=0
>>>> log      =internal log           bsize=4096   blocks=131072, version=2
>>>>        =                       sectsz=512   sunit=32 blks, lazy-count=1
>>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>>> 
>>>> 
>>>> All is fine but I was recently made aware of tweaking agsize.
>>> 
>>> Made aware by what?  For what reason?
>> 
>> Autodesk has this software called Flame which requires very very fast
>> local storage using XFS. They have an entire write up on how to calc
>> proper agsize for optimal performance.
> 
> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
> 
> I guess?
> 
> That's quite a procedure!  And I have to say, a slightly strange one at first glance.
> 
> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.


Sorry to double reply to the same thread.

But the volume in question (regarding the Autodesk article) is used for very fast playback of image files, i.e. realtime performance for files of 2048x1556 resolution.  These files are being touched/retouched throughout the day by the person driving the Flame.

The fragmentation on these systems on a heavy day, meaning one where they are running at 98% full, is about 5% on average.  On any given day, the systems are about 80% full.
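
(A fragmentation figure like that can be read with xfs_db's frag command, run read-only against the device; the device path below is just ours:

xfs_db -r -c frag /dev/mapper/vg_doofus_data-lv_data

which prints actual/ideal extent counts and a fragmentation factor.)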

> In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.
> 
>> I never mess with agsize but it is require  when creating the XFS
>> file system for use with Flame.  I realize its tailored for there
>> apps particular IO characteristics, so I'm curious about it.
> 
> In general more AGs allow more concurrency for some operations;
> it also will generally change how/where files in multiple directories get
> allocated.
> 
>>>> So I would like to mess around and iozone any diffs between the above
>>>> agcount of 32 and whatever agcount changes I may do.
>>> 
>>> Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.
>> 
>> Well, it will give me a base line comparison of non tweaked agsize vs tweaked agsize.
> 
> Not necessarily, see above; I'm not sure what iozone invocation would
> show any effects from more or fewer AGs.  Anyway, iozone != flame, not
> by a long shot! :)
> 
>>>> I didn't see any mention of agsize/agcount on the XFS FAQ and would
>>>> like to know, based on the above, why does XFS think I have 32
>>>> allocation groups with the corresponding size?
>>> 
>>> It doesn't think so, it _knows_ so, because it made them itself.  ;)
>> 
>> Yea but based on what?
>> 
>> Why 32 at there current size?
> 
> see calc_default_ag_geometry()
> 
> Since you are in multidisk mode (you have stripe geometry) it uses more AGs for more AGs since it knows you have more spindles:
> 
>        } else if (dblocks > GIGABYTES(512, blocklog))
>                shift = 5;
> 
> 2^5 = 32
> 
> If you hadn't been in multidisk mode you would have gotten 25 AGs due to the max AG size of 1T.
> 
>>>> And are these optimal
>>>> numbers?
>>> 
>>> How high is up?
>>> 
>>> Here's the appropriate faq entry:
>>> 
>>> http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E
>> 
>> Problem is I run Centos so the line;
>> 
>> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS. "
>> 
>> ... doesn't really apply.
> 
> Well, my point was that your original question, "are these optimal numbers?" included absolutely no context of your workload, so the best answer is yes - the default mkfs behavior is optimal for a generic, unspecified workload.
> 
> I don't have access to Autodesk Flame so I really don't know how it behaves or what an optimal tuning might be.
> 
> Anyway, I think the calc_default_ag_geometry() info above answered your original question of "why does XFS think I have 32 allocation groups with the corresponding size?" - that's simply the default mkfs algorithm when in multidisk mode, for a disk of this size.
> 
> -Eric
> 


* Re: specify agsize?
  2013-07-14 16:14     ` Eric Sandeen
  2013-07-14 16:46       ` aurfalien
  2013-07-14 17:14       ` aurfalien
@ 2013-07-14 22:08       ` Stan Hoeppner
  2013-07-14 22:42         ` aurfalien
  2 siblings, 1 reply; 15+ messages in thread
From: Stan Hoeppner @ 2013-07-14 22:08 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: xfs, aurfalien

On 7/14/2013 11:14 AM, Eric Sandeen wrote:

> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
> 
> I guess?
> 
> That's quite a procedure!  And I have to say, a slightly strange one at first glance.

Agreed.

> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.

Again.

> In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.

Or it's just as likely they are laying out these image frames in a
specific manner across 128 directories, assuming 128 AGs exist, to
achieve some specific "on disk" organization of the files.  It's simply
not possible to know without more information.

Interestingly, on a 14+2 RAID6 array of 7.2K drives, normally 128 AGs
will decrease parallel performance due to a huge increase in head seek
latency.  Thus I'd assume this isn't a parallel workload.  Either that
or Autodesk doesn't know XFS as well as they believe.

-- 
Stan


* Re: specify agsize?
  2013-07-14 22:08       ` Stan Hoeppner
@ 2013-07-14 22:42         ` aurfalien
  2013-07-14 23:43           ` Stan Hoeppner
  0 siblings, 1 reply; 15+ messages in thread
From: aurfalien @ 2013-07-14 22:42 UTC (permalink / raw)
  To: stan; +Cc: Eric Sandeen, xfs


On Jul 14, 2013, at 3:08 PM, Stan Hoeppner wrote:

> On 7/14/2013 11:14 AM, Eric Sandeen wrote:
> 
>> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
>> 
>> I guess?
>> 
>> That's quite a procedure!  And I have to say, a slightly strange one at first glance.
> 
> Agreed.
> 
>> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.
> 
> Again.
> 
>> In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.
> 
> Or it's just as likely they are laying out these image frames in a
> specific manner across 128 directories, assuming 128 AGs exist, to
> achieve some specific "on disk" organization of the files.  It's simply
> not possible to know without more information.
> 
> Interestingly, on a 14+2 RAID6 array of 7.2K drives, normally 128 AGs
> will decrease parallel performance due to a huge increase in head seek
> latency.  Thus I'd assume this isn't a parallel workload.  Either that
> or Autodesk doesn't know XFS as well as they believe.

Now hold on a minute here Stan.

While I don't really like Autodesk, as they pretty much atrophy software, the fact is that they (the finishing suite division) know XFS, and realtime 2K performance is realized all day long as long as one follows their guidelines.

After all, SGI developed XFS as well as visual computing stations, and back in the day you had SGIs running Flame, versus today where it's Linux.

Flame is a visual computing app after all.  Albeit with a front end or GUI tuned to artists but still.

My initial post on this was to try and understand whether their mods make sense to the general XFS community and whether I could benefit from applying those mods to general purpose storage.

- aurf

* Re: specify agsize?
  2013-07-14 22:42         ` aurfalien
@ 2013-07-14 23:43           ` Stan Hoeppner
  0 siblings, 0 replies; 15+ messages in thread
From: Stan Hoeppner @ 2013-07-14 23:43 UTC (permalink / raw)
  To: aurfalien; +Cc: Eric Sandeen, xfs

On 7/14/2013 5:42 PM, aurfalien wrote:

> My initial post on this was to try and understand whether their mods make sense to the general XFS community 

They do not.

> and whether I could benefit from applying those mods to general purpose storage.

You may or may not.  There's simply not enough information available in
that guide.  Obviously Autodesk has a reason for recommending 128 AGs,
but no such reasoning is provided.  I already explained why, in the
general case, agcount has no relevance in isolation.  Setting agcount
properly for the general XFS case requires knowledge of the underlying
storage device size, geometry, spindle speed, etc.

The Autodesk instructions Eric linked are specific to a select group of
Autodesk certified HP workstation models, Autodesk's own storage arrays,
or unspecified FC SAN storage.  Nowhere in the "storage configuration"
chapter does it mention the number of disks or RAID level required or
recommended backing the LUNs.

Thus, given what I've explained of the relationship between array
capacity, spindle count, RAID level, etc, it simply doesn't make sense
to arbitrarily specify 128 allocation groups, especially when the
storage hardware characteristics are completely ignored.

So if Autodesk is ignoring these critical factors when telling you to
use 128 allocation groups, then they either have some application
specific file layout that benefits from 128 AGs, or, as I said, they
don't know XFS as well as they think they do.  I'm not disparaging
Autodesk here.  There are plenty of vendors who do things with XFS that
aren't necessarily wise, sometimes flat out bad.

Taking a quick glance at the data directory layout on a current Flame
system may get us closer to understanding why they want 128 AGs.  For
instance, if they've created exactly 128 directories on the XFS volume
that would fully answer the question as to why they want 128 AGs.
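
A quick way to eyeball both halves of that on a live Flame volume (the paths
below are hypothetical) would be to count the top-level directories and then
look at which AG a frame's extents land in via xfs_bmap:

ls -d /mnt/flamestore/*/ | wc -l
xfs_bmap -v /mnt/flamestore/clip01/frame0001.dpx    # the AG column shows each extent's allocation group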

-- 
Stan


* Re: specify agsize?
  2013-07-14  7:06     ` Stan Hoeppner
  2013-07-14 16:56       ` aurfalien
@ 2013-07-15  1:07       ` Dave Chinner
  1 sibling, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2013-07-15  1:07 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: xfs, Eric Sandeen, aurfalien

On Sun, Jul 14, 2013 at 02:06:43AM -0500, Stan Hoeppner wrote:
> On 7/13/2013 11:20 PM, aurfalien wrote:
> ...
> >>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
> ...
> >>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
> >>>         =                       sectsz=512   attr=2, projid32bit=0
> >>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
> >>>         =                       sunit=32     swidth=448 blks
> >>> naming   =version 2              bsize=4096   ascii-ci=0
> >>> log      =internal log           bsize=4096   blocks=131072, version=2
> >>>         =                       sectsz=512   sunit=32 blks, lazy-count=1
> >>> realtime =none                   extsz=4096   blocks=0, rtextents=0
> ...
> > Autodesk has this software called Flame which requires very very fast local storage using XFS.  
> 
> If "Flame" does any random writes then you probably shouldn't be using
> RAID6.

Oh, we are talking about flame/smoke/lustre rendering environments
here. Go back 5 years, a renderwall compositing effects via smoke
was one of the nastiest small random write workloads you could
throw at a filesystem. It was often used to benchmark file server
performance for renderwalls and still may be. Think of a workload
that reads lots of shared texture files across thousands of
machines, each crunching a single video frame to add an effect and
all doing small random writes to the video frame as it modifies a
small section of each line of the video frame....

Translation: tuning for AG size is a waste of time.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: specify agsize?
  2013-07-14 17:14       ` aurfalien
@ 2013-07-15  1:22         ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2013-07-15  1:22 UTC (permalink / raw)
  To: aurfalien; +Cc: Eric Sandeen, xfs

On Sun, Jul 14, 2013 at 10:14:15AM -0700, aurfalien wrote:
> On Jul 14, 2013, at 9:14 AM, Eric Sandeen wrote:
> > On 7/13/13 11:20 PM, aurfalien wrote:
> >> On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:
> >>> On 7/13/13 7:11 PM, aurfalien wrote:
> >>>> Hello again,
> >>>> 
> >>>> I have a Raid 6 x16 disk array with 128k stripe size and a
> >>>> 512 byte block size.
> >>>> 
> >>>> So I do;
> >>>> 
> >>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14
> >>>> /dev/mapper/vg_doofus_data-lv_data
> >>>> 
> >>>> And I get;
> >>>> 
> >>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
> >>>>          =                       sectsz=512   attr=2, projid32bit=0
> >>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
> >>>>          =                       sunit=32     swidth=448 blks
> >>>> naming   =version 2              bsize=4096   ascii-ci=0
> >>>> log      =internal log           bsize=4096   blocks=131072, version=2
> >>>>          =                       sectsz=512   sunit=32 blks, lazy-count=1
> >>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
> >>>> 
> >>>> 
> >>>> All is fine but I was recently made aware of tweaking agsize.
> >>> 
> >>> Made aware by what?  For what reason?
> >> 
> >> Autodesk has this software called Flame which requires very
> >> very fast local storage using XFS. They have an entire write up
> >> on how to calc proper agsize for optimal performance.
> > 
> > http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
> > 
> > I guess?
> > 
> > That's quite a procedure!  And I have to say, a slightly strange
> > one at first glance.
> > 
> > It'd be nice if they said what they were trying to accomplish
> > rather than just giving you a long recipe.
> 
> 
> Sorry to double reply to the same thread.
> 
> But the volume in question (regarding the Autodesk article) is
> used for very fast playback of image files.  So realtime
> performance for files of 2048x1556 resolution.  These files are
> being touched/retouched throughout the day by the person driving
> the Flame.

Sure - it's file per frame video that is being used here, and 2k
resolution is generally around 12.5MB per frame. If you are
concerned about playback rates, then it is far more important that
the frames are laid out sequentially on disk than anything else.
Tuning the number of AGs doesn't achieve that - increasing the
number of AGs is more likely to cause them to be written all over
the place, especially as the filesystem ages and AGs fill up.


> The fragmentation on these systems on a heavy day, meaning one
> were they are running at 98% full is about 5% on avg.  On any
> given day, the systems are about 80% full.

If they are running their filesystems to 98% full, then they have
already given up any hope they have of getting reliable layout of
their video files.

If you are concerned about low latency, high throughput playback,
then it's far more important to get the stripe width set up
correctly for the size of the file so each frame is stripe width
aligned and each frame takes a single physical IO to read from disk
and there is minimal seek between the two frames.
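
Rough arithmetic with the geometry quoted above (treating 12.5MB per frame as
approximate):

    full stripe = su 128KiB * sw 14 = 1792KiB, i.e. ~1.75MiB
    ~12.5MB frame / ~1.75MiB stripe = roughly 7 full stripe widths per frame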

The only reason I can see for increasing the number of AGs here is
that they are trying to limit the number of video directories that
share the same AGs as they are specifying the inode64 mount option.
i.e. the assumption is that each video clip is sufficiently large
that with 128AGs it is unlikely that two video clips will end up in
the same AG and hence potentially interleave as they are
modified....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: specify agsize?
  2013-07-14 19:45 Richard Scobie
@ 2013-07-14 22:18 ` aurfalien
  0 siblings, 0 replies; 15+ messages in thread
From: aurfalien @ 2013-07-14 22:18 UTC (permalink / raw)
  To: Richard Scobie; +Cc: xfs


On Jul 14, 2013, at 12:45 PM, Richard Scobie wrote:

> aurfalien wrote:
> .............
> 
> So I would like to mess around and iozone any diffs between the above agcount of 32 and whatever agcount changes I may do.
> 
> .............
> 
> There is an Autodesk tool to do this work, sw_io_perf_tool which will give a much more realistic evaluation than iozone.
> 
> Checkout:
> 
> http://usa.autodesk.com/adsk/servlet/ps/dl/item?siteID=123112&id=15486735&linkID=9242618


Brilliant, many thanks!

- aurf

* Re: specify agsize?
@ 2013-07-14 19:45 Richard Scobie
  2013-07-14 22:18 ` aurfalien
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Scobie @ 2013-07-14 19:45 UTC (permalink / raw)
  To: xfs

aurfalien wrote:
.............

So I would like to mess around and iozone any diffs between the above 
agcount of 32 and whatever agcount changes I may do.

.............

There is an Autodesk tool to do this work, sw_io_perf_tool, which will
give a much more realistic evaluation than iozone.

Checkout:

http://usa.autodesk.com/adsk/servlet/ps/dl/item?siteID=123112&id=15486735&linkID=9242618

Regards,

Richard

