All of lore.kernel.org
* XFS status update for August 2010
@ 2010-09-02 14:59 ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2010-09-02 14:59 UTC (permalink / raw)
  To: xfs, linux-kernel

On the first of August we finally saw the release of Linux 2.6.35,
which includes a large XFS update.  The most prominent feature in
Linux 2.6.35 is the new delayed logging code, which provides massive
speedups for metadata-intensive workloads, but there have also been
a large number of other fixes and cleanups, leading to the following
diffstat:

	 67 files changed, 4426 insertions(+), 3835 deletions(-)

Given the early release of Linux 2.6.35, the merge window for the
next release fell entirely within the month of August.  The XFS updates
for Linux 2.6.36 include various additional performance improvements
in the delayed logging code, for direct I/O writes and for avoiding
synchronous transactions, as well as various fixes and a large number
of cleanups, including the removal of the remaining dead DMAPI
code.

On the userspace side we saw the 3.1.3 release of xfsprogs, which includes
various smaller fixes, support for the new XFS_IOC_ZERO_RANGE ioctl, and
Debian packaging updates.  The xfstests package saw one new test case
and a couple of smaller patches, and xfsdump has not seen any updates at
all.

The XMLified versions of the XFS users guide, training labs and filesystem
structure documentation are now available as on-the-fly generated HTML on
the xfs.org website and can be found at:

	http://www.xfs.org/index.php/XFS_Papers_and_Documentation



* Re: XFS status update for August 2010
  2010-09-02 14:59 ` Christoph Hellwig
@ 2010-09-05  7:44   ` Willy Tarreau
  -1 siblings, 0 replies; 24+ messages in thread
From: Willy Tarreau @ 2010-09-05  7:44 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs, linux-kernel

Hi Christoph,

On Thu, Sep 02, 2010 at 10:59:59AM -0400, Christoph Hellwig wrote:
> At the first of August we finally saw the release of Linux 2.6.35,
> which includes a large XFS update.  The most prominent feature in
> Linux 2.6.35 is the new delayed logging code which provides massive
> speedups for metadata-intensive workloads, but there has been
> a large amount of other fixes and cleanups, leading to the following
> diffstat:
> 
> 	 67 files changed, 4426 insertions(+), 3835 deletions(-)
> 
> Given the early release of Linux 2.6.35 the merge window for the
> next release fully fell into the month of August.  The XFS updates
> for Linux 2.6.36 include various additional performance improvements
> in the delayed logging code, for direct I/O writes and for avoiding
> synchronous transactions, as well as various fixes and a large number
> of cleanups, including the removal of the remaining dead DMAPI
> code.

This is very good news. I have XFS on my laptop and I regret having
installed it there, because working on the kernel is very painful with
it. A "cp -al $old_dir $new_dir" takes about 1 minute while it takes
approximately one second on reiserfs. I've just installed 2.6.35.4
and did not notice any improvement. However, I'm clearly interested
in testing any pending code that you think should improve
this behaviour by delaying log writes.
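
For reference, the workload is easy to reproduce; a rough sketch of what
I'm timing (paths and tree size are made up, a real kernel tree has far
more files):

```shell
#!/bin/sh
# Sketch of the metadata-heavy "cp -al" workload: hard-link a
# directory tree and time it. The tree size here is made up;
# a kernel tree has tens of thousands of files.
set -e

src=$(mktemp -d)
for d in $(seq 1 50); do
    mkdir -p "$src/dir$d"
    for f in $(seq 1 20); do
        echo data > "$src/dir$d/file$f"
    done
done

# Hard-linking is almost pure metadata I/O, which is exactly
# what delayed logging is meant to speed up.
start=$(date +%s)
cp -al "$src" "$src.copy"
end=$(date +%s)
echo "cp -al of 1000 files took $((end - start))s"
```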

So if you have any pointers to recommended patches or a git tree (based
on a kernel reliable enough for a laptop used for work), I'd be
interested in trying them.

Regards,
Willy



* Re: XFS status update for August 2010
  2010-09-05  7:44   ` Willy Tarreau
@ 2010-09-05  9:37   ` Michael Monnerie
  2010-09-05 10:47     ` Willy Tarreau
  2010-09-06  3:22     ` XFS status update for August 2010 Eric Sandeen
  -1 siblings, 2 replies; 24+ messages in thread
From: Michael Monnerie @ 2010-09-05  9:37 UTC (permalink / raw)
  To: xfs; +Cc: Willy Tarreau



On Sunday, 5 September 2010, Willy Tarreau wrote:
> I've just installed 2.6.35.4

Try the following mount options: 
relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc,delaylog
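
To make these persist across reboots, they could go in /etc/fstab; a
sketch (the device and mount point are placeholders, and some of these
options are already defaults):

```
# /etc/fstab (sketch; /dev/sda2 and /home are hypothetical)
/dev/sda2  /home  xfs  relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc,delaylog  0  2
```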

-- 
with kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Current radio interview! ******
http://www.it-podcast.at/aktuelle-sendung.html

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: XFS status update for August 2010
  2010-09-05  9:37   ` Michael Monnerie
@ 2010-09-05 10:47     ` Willy Tarreau
  2010-09-05 13:08       ` Dave Chinner
  2010-09-06  3:22     ` XFS status update for August 2010 Eric Sandeen
  1 sibling, 1 reply; 24+ messages in thread
From: Willy Tarreau @ 2010-09-05 10:47 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs

On Sun, Sep 05, 2010 at 11:37:03AM +0200, Michael Monnerie wrote:
> On Sunday, 5 September 2010, Willy Tarreau wrote:
> > I've just installed 2.6.35.4
> 
> Try the following mount options: 
> relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc,delaylog

Ah, thanks for the info Michael; indeed it's a *lot* better: down from
57s to 1.3s!

I will experiment with that.

Thanks again,
Willy


* Re: XFS status update for August 2010
  2010-09-05 10:47     ` Willy Tarreau
@ 2010-09-05 13:08       ` Dave Chinner
  2010-09-05 18:56         ` Willy Tarreau
  2010-09-06  5:49         ` xfs mount/create options (was: XFS status update for August 2010) Michael Monnerie
  0 siblings, 2 replies; 24+ messages in thread
From: Dave Chinner @ 2010-09-05 13:08 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Michael Monnerie, xfs

On Sun, Sep 05, 2010 at 12:47:39PM +0200, Willy Tarreau wrote:
> On Sun, Sep 05, 2010 at 11:37:03AM +0200, Michael Monnerie wrote:
> > On Sunday, 5 September 2010, Willy Tarreau wrote:
> > > I've just installed 2.6.35.4
> > 
> > Try the following mount options: 
> > relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc,delaylog

FYI:
	- relatime,logbufs=8,attr2,barrier are all defaults.
	- largeio only affects stat(2) output if you have
	  sunit/swidth set - unlikely on a laptop drive, and has
	  no effect on unlink performance.
	- swalloc only affects allocation if sunit/swidth are set
	  and has no effect on unlink performance.

> Ah thanks for the info Michael, indeed it's a *lot* better: down from 57s
> to 1.3s !

	- delaylog is the option providing that improvement.

You should keep in mind that delaylog is a brand new experimental
feature (as it warns in dmesg output on mount) and as such has the
potential to eat your data. That being said, I've been running it on
my laptop and my production machines (except for the backup target)
for a couple of months now and haven't had any problems...
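
If you want to confirm the option actually took effect (rather than
trusting the mount command line), check the active option string. A
minimal sketch; the options string below is hypothetical, standing in
for the real /proc/mounts column:

```shell
#!/bin/sh
# Sketch: verify that an option such as delaylog is in effect by
# inspecting the mount's option string. On a real system it would
# come from /proc/mounts, e.g.:
#   opts=$(awk '$2 == "/" {print $4}' /proc/mounts)
has_option() {
    # $1 = comma-separated option string, $2 = option name
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical options column for an XFS filesystem:
opts="rw,noatime,attr2,delaylog,logbsize=256k,noquota"

if has_option "$opts" delaylog; then
    echo "delaylog is active"
else
    echo "delaylog is NOT active"
fi
```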

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: XFS status update for August 2010
  2010-09-05 13:08       ` Dave Chinner
@ 2010-09-05 18:56         ` Willy Tarreau
  2010-09-05 23:36           ` Dave Chinner
  2010-09-06  5:49         ` xfs mount/create options (was: XFS status update for August 2010) Michael Monnerie
  1 sibling, 1 reply; 24+ messages in thread
From: Willy Tarreau @ 2010-09-05 18:56 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Michael Monnerie, xfs

On Sun, Sep 05, 2010 at 11:08:09PM +1000, Dave Chinner wrote:
> On Sun, Sep 05, 2010 at 12:47:39PM +0200, Willy Tarreau wrote:
> > On Sun, Sep 05, 2010 at 11:37:03AM +0200, Michael Monnerie wrote:
> > > On Sunday, 5 September 2010, Willy Tarreau wrote:
> > > > I've just installed 2.6.35.4
> > > 
> > > Try the following mount options: 
> > > relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc,delaylog
> 
> FYI:
> 	- relatime,logbufs=8,attr=2,barrier are all defaults.

In fact I already had noatime and logbsize=256k, and remembered having
played with the other ones in the past.

> 	- largeio only affects stat(2) output if you have
> 	  sunit/swidth set - unlikely on a laptop drive, and has
> 	  no effect on unlink performance.
> 	- swalloc only affects allocation if sunit/swidth are set
> 	  and has no effect on unlink performance.

OK.

> > Ah thanks for the info Michael, indeed it's a *lot* better: down from 57s
> > to 1.3s !
> 
> 	- delaylog is the option providing that improvement.

That's what I deduced from Christoph's initial description.

> You should keep in mind that delaylog is a brand new experimental
> feature (as it warns in dmesg output on mount)

Yes, I've noticed the warning in the code and then in dmesg. It does not
seem to be honoured on a remount (I did a "mount -o remount,delaylog /"
and it did nothing).

> and as such has the potential to eat your data.

Noted, thanks for the warning.

> That being said, I've been running
> my laptop and my production machines (except for the backup target)
> for a couple of months now with it and haven't had any problems...

Fine, this is exactly the type of info I need. I'll thus be using
it while keeping an eye out for any potential FS-related problems.

Are there any plans to use that option by default once it gets enough
testing? I'm asking because I had to convert from XFS to reiserfs at
least twice due to slow metadata, but I tend to trust XFS a lot more
(especially due to dirty failures I experienced a few years ago with
reiserfs - corrupted file tails after a power cut).

Thanks,
Willy


* Re: XFS status update for August 2010
  2010-09-05 18:56         ` Willy Tarreau
@ 2010-09-05 23:36           ` Dave Chinner
  2010-09-06  5:19             ` Willy Tarreau
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-09-05 23:36 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Michael Monnerie, xfs

On Sun, Sep 05, 2010 at 08:56:00PM +0200, Willy Tarreau wrote:
> On Sun, Sep 05, 2010 at 11:08:09PM +1000, Dave Chinner wrote:
> > That being said, I've been running
> > my laptop and my production machines (except for the backup target)
> > for a couple of months now with it and haven't had any problems...
> 
> Fine, this is typically the type of info I need. Thus I'll be using
> it with an eye on any potential FS-related problem.

Thanks.

> Are there any plans to use that option by default once it gets enough
> testing? I'm asking because I had to convert from XFS to reiserfs at
> least twice due to slow metadata, but I tend to trust XFS a lot more
> (especially due to dirty failures I experienced a few years ago with
> reiserfs - corrupted file tails after a power cut).

From Documentation/filesystems/xfs-delayed-logging-design.txt:

2.6.37 Remove experimental tag from mount option
        => should be roughly 6 months after initial merge
        => enough time to:
                => gain confidence and fix problems reported by early
                   adopters (a.k.a. guinea pigs)
                => address worst performance regressions and undesired
                   behaviours
                => start tuning/optimising code for parallelism
                => start tuning/optimising algorithms consuming
                   excessive CPU time

2.6.39 Switch default mount option to use delayed logging
        => should be roughly 12 months after initial merge
        => enough time to shake out remaining problems before next round of
           enterprise distro kernel rebases

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: XFS status update for August 2010
  2010-09-05  9:37   ` Michael Monnerie
  2010-09-05 10:47     ` Willy Tarreau
@ 2010-09-06  3:22     ` Eric Sandeen
  2010-09-06  5:10       ` Michael Monnerie
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Sandeen @ 2010-09-06  3:22 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: Willy Tarreau, xfs

Michael Monnerie wrote:
> On Sunday, 5 September 2010, Willy Tarreau wrote:
>> I've just installed 2.6.35.4
> 
> Try the following mount options: 
> relatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc,delaylog

relatime is default
logbufs=8 is default
attr2 is default
barrier is default
largeio is not likely anything you want or need
swalloc is unlikely to be useful on a laptop

People need to read up a little and know what they're tuning;
repeating this kind of suggestion leads to cargo-cultism for
performance "tuning".

IOW don't turn knobs just because they are there ... :)

-Eric


* Re: XFS status update for August 2010
  2010-09-06  3:22     ` XFS status update for August 2010 Eric Sandeen
@ 2010-09-06  5:10       ` Michael Monnerie
  0 siblings, 0 replies; 24+ messages in thread
From: Michael Monnerie @ 2010-09-06  5:10 UTC (permalink / raw)
  To: xfs



On Monday, 6 September 2010, Eric Sandeen wrote:
> People need to read up a little and know what they're tuning;
> repeating this kind of suggestion leads to cargo-cultism for
> performance "tuning"
> 
> IOW don't turn knobs just because they are there ... :)

Unlike most who write here, I'm not a developer but a sysadmin,
responsible for servers of all kinds of ages, with XFS usage going back
to the early 2.6 series. Default mount options tend to change sometimes,
and I can't always check after a system/kernel upgrade whether the
defaults are still what I want. So specifying everything is safe and
doesn't do any harm - right?
And as it was Sunday morning, I wanted to help Willy out quickly,
without looking specifically at which options he would need. I was sure
some of you who know would guide him later, but maybe only on Monday,
so I took that quick path to find a solution.

-- 
with kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Current radio interview! ******
http://www.it-podcast.at/aktuelle-sendung.html

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/



* Re: XFS status update for August 2010
  2010-09-05 23:36           ` Dave Chinner
@ 2010-09-06  5:19             ` Willy Tarreau
  0 siblings, 0 replies; 24+ messages in thread
From: Willy Tarreau @ 2010-09-06  5:19 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Michael Monnerie, xfs

On Mon, Sep 06, 2010 at 09:36:35AM +1000, Dave Chinner wrote:
> From Documentation/filesystems/xfs-delayed-logging-design.txt:
(...)

Now that I know there's a doc, I'll check it. Thank you for the details.

Cheers,
Willy


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-05 13:08       ` Dave Chinner
  2010-09-05 18:56         ` Willy Tarreau
@ 2010-09-06  5:49         ` Michael Monnerie
  2010-09-08  5:38           ` Michael Monnerie
  1 sibling, 1 reply; 24+ messages in thread
From: Michael Monnerie @ 2010-09-06  5:49 UTC (permalink / raw)
  To: xfs



I looked into the mkfs man page now, which brings up these questions:

On Sunday, 5 September 2010, Dave Chinner wrote:
>         - relatime,logbufs=8,attr=2,barrier are all defaults.

Why isn't logbsize=256k the default, when it's suggested most of the
time anyway? On machines with 32MiB of RAM or more, 32k is the default,
but most machines these days have multiple gigabytes of RAM, so at
least for RAM>1GiB a larger value could be made the default.

>         - largeio only affects stat(2) output if you have
>           sunit/swidth set - unlikely on a laptop drive, and has
>           no effect on unlink performance.
>         - swalloc only affects allocation if sunit/swidth are set
>           and has no effect on unlink performance.

Hm, it seems I don't understand that. I have now tried on different
servers, using
stat -f /disks/db --format '%s %S'
4096 4096

Those filesystems were all created with su=64k,swidth=(values 4-8
depending on RAID). So I retried specifying directly in the mount
options: sunit=128,swidth=512
and it still reports "4096" for %s - or is %s not the value I should
look for? Some of the filesystems even have allocsize= specified, yet
4096 is always given back. Where is my problem?

And while I am at it: why does "mount" not provide the su=/sw= options
that we can use when creating a filesystem? That would make life easier,
as su=64k,sw=7 is much easier to read than sunit=128,swidth=896.
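
For reference, the two notations convert mechanically: the sunit/swidth
mount options are counted in 512-byte sectors, while su is a byte (or
k-suffixed) size and sw is a multiple of su. A sketch with the numbers
above:

```shell
#!/bin/sh
# Convert mount-style sunit/swidth (512-byte sectors) to
# mkfs-style su (stripe unit in KiB) and sw (stripe width
# in stripe units).
sunit=128    # 128 sectors * 512 bytes = 64 KiB stripe unit
swidth=896   # 896 sectors = 7 data disks at su=64k

su_k=$((sunit * 512 / 1024))
sw=$((swidth / sunit))

echo "sunit=$sunit,swidth=$swidth  <=>  su=${su_k}k,sw=$sw"
```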

When I define su/sw at mkfs time, is that enough, or would I always
have to specify sunit/swidth with every mount too?

-- 
mit freundlichen Grüssen,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [gesprochen: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Aktuelles Radiointerview! ******
http://www.it-podcast.at/aktuelle-sendung.html

// Wir haben im Moment zwei Häuser zu verkaufen:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/



* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-06  5:49         ` xfs mount/create options (was: XFS status update for August 2010) Michael Monnerie
@ 2010-09-08  5:38           ` Michael Monnerie
  2010-09-08 10:58             ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Monnerie @ 2010-09-08  5:38 UTC (permalink / raw)
  To: xfs



I just found that my questions from Monday were not answered, but this
is interesting, so I want to bring it up again.

On Monday, 6 September 2010, Michael Monnerie wrote:
> I looked into man mkfs now, which brings up these questions:
>
> On Sunday, 5 September 2010, Dave Chinner wrote:
> >         - relatime,logbufs=8,attr=2,barrier are all defaults.
>
> Why isn't logbsize=256k default, when it's suggested most of the time
> anyway? On machines with 32MiB or more 32k is the default, but most
> machines these days have multi-gigabytes of RAM, so at least for
> RAM>1GiB that could be made default.
>
> >         - largeio only affects stat(2) output if you have
> >           sunit/swidth set - unlikely on a laptop drive, and has
> >           no effect on unlink performance.
> >         - swalloc only affects allocation if sunit/swidth are set
> >           and has no effect on unlink performance.
>
> Hm, it seems I don't understand that. I tried now on different
> servers, using
> stat -f /disks/db --format '%s %S'
> 4096 4096
>
> Those filesystems were all created with su=64k,swidth=(values 4-8
> depending on RAID). So I retried specifying directly in the mount
> options: sunit=128,swidth=512
> and it still reports "4096" for %s - or is %s not the value I should
> look for? Some of the filesystems even have allocsize= specified,
> still always 4096 is given back. Where is my problem?
>
> And while I am at it: Why does "mount" not provide the su=/sw=
> options that we can use to create a filesystem? Would make life
> easier, as it's much easier to read su=64k,sw=7 than
> sunit=128,swidth=896.
>
> When I defined su/sw on mkfs, is it enough, or would I always have to
> specify sunit/swidth with every mount too?

-- 
with kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Current radio interview! ******
http://www.it-podcast.at/aktuelle-sendung.html

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/



* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08  5:38           ` Michael Monnerie
@ 2010-09-08 10:58             ` Dave Chinner
  2010-09-08 13:38               ` Michael Monnerie
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-09-08 10:58 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs

On Wed, Sep 08, 2010 at 07:38:54AM +0200, Michael Monnerie wrote:
> I just found that my questions from Monday were not solved, but this is 
> interesting, so I want to warm it up again.
> 
> On Monday, 6 September 2010, Michael Monnerie wrote:
>  I looked into man mkfs now, which brings up these questions:
>  
>  On Sunday, 5 September 2010, Dave Chinner wrote:
>  >         - relatime,logbufs=8,attr=2,barrier are all defaults.
>  
>  Why isn't logbsize=256k default, when it's suggested most of the time
>  anyway?

It's suggested when people are asking about performance tuning. When
the performance is acceptable with the default value, then you don't
hear about it, do you?

>  On machines with 32MiB or more 32k is the default, but most
>  machines these days have multi-gigabytes of RAM, so at least for
>  RAM>1GiB that could be made default.

That is definitely not true. XFS is widely used in the embedded NAS
space, where memory is very limited and might be configured with
many filesystems.  32k is the default because those sorts of machines
can't afford to burn 2MB RAM per filesystem just in log buffers.

Also, you can go and search the archives or git history as to why we
don't tune the logbsize based on physical memory size anymore, too.


>  >         - largeio only affects stat(2) output if you have
>  >           sunit/swidth set - unlikely on a laptop drive, and has
>  >           no effect on unlink performance.
>  >         - swalloc only affects allocation if sunit/swidth are set
>  >           and has no effect on unlink performance.
>  
>  Hm, it seems I don't understand that. I tried now on different
>   servers, using
>  stat -f /disks/db --format '%s %S'
>  4096 4096

You're getting the wrong information there. largeio affects the
optimal I/O size reported by stat(2). "stat -f" does
a statfs(2) call. Try 'stat /disk/db/<file> --format %o'.
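
The difference is easy to see side by side; a sketch using a temporary
file in place of the /disks/db path from the thread, so it runs
anywhere:

```shell
#!/bin/sh
# statfs(2) vs stat(2): "stat -f" reports filesystem-wide block
# sizes via statfs(2), while plain stat's %o is the per-file
# optimal I/O size from stat(2) - the value the largeio mount
# option influences on XFS.
f=$(mktemp)

echo "statfs(2) %s %S: $(stat -f --format '%s %S' "$f")"
echo "stat(2)   %o   : $(stat --format '%o' "$f")"

rm -f "$f"
```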

>  And while I am at it: Why does "mount" not provide the su=/sw=
>   options that we can use to create a filesystem? Would make life
>   easier, as it's much easier to read su=64k,sw=7 than
>   sunit=128,swidth=896.

You should never, ever need to use the mount options.

>  When I defined su/sw on mkfs, is it enough, or would I always have to
>  specify sunit/swidth with every mount too?

Yes, it is enough; no, you don't need to specify them at every mount.
mkfs.xfs stores sunit/swidth on disk in the superblock.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08 10:58             ` Dave Chinner
@ 2010-09-08 13:38               ` Michael Monnerie
  2010-09-08 14:51                 ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Monnerie @ 2010-09-08 13:38 UTC (permalink / raw)
  To: xfs



On Wednesday, 8 September 2010, Dave Chinner wrote:
> >  On machines with 32MiB or more 32k is the default, but most
> >  machines these days have multi-gigabytes of RAM, so at least for
> >  RAM>1GiB that could be made default.
> 
> That is definitely not true. XFS is widely used in the embedded NAS
> space, where memory is very limited and might be configured with
> many filesystems.  32k is the default because those sorts of machines
> can't afford to burn 2MB RAM per filesystem just in log buffers.
>
> Also, you can go and search the archives or git history as to why we
> don't tune the logbsize based on physical memory size anymore, too.

OK, then the man page should be updated to reflect this "newer logic";
I got the information directly from there.
 
> You're getting the wrong information there. largeio affects the
> output of the optimal IO size reported by stat(2). 'stat -f" does
> a statfs(2) call. Try 'stat /disk/db/<file> --format %o'....

Ah, that's better, thank you :-)
 
> >  And while I am at it: Why does "mount" not provide the su=/sw=
> >   options that we can use to create a filesystem? Would make life
> >   easier, as it's much easier to read su=64k,sw=7 than
> >   sunit=128,swidth=896.
> 
> You should never, ever need to use the mount options.

..except when a disk is added to the RAID, or its RAID level gets
changed. Then sw=7 becomes sw=8 or so - or rather, it would, except
that you must then use the (strange and error-prone, if you ask me)
sunit/swidth semantics.
 
> >  When I defined su/sw on mkfs, is it enough, or would I always have
> > to specify sunit/swidth with every mount too?
> 
> Yes, no. mkfs.xfs stores sunit/swidth on disk in the superblock.

So when I add a disk, I only have to mount once with the new
sunit/swidth, and that is then stored? That's nice.

-- 
with kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

****** Current radio interview! ******
http://www.it-podcast.at/aktuelle-sendung.html

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/



* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08 13:38               ` Michael Monnerie
@ 2010-09-08 14:51                 ` Dave Chinner
  2010-09-08 15:24                   ` Emmanuel Florac
  2010-09-08 23:30                   ` Michael Monnerie
  0 siblings, 2 replies; 24+ messages in thread
From: Dave Chinner @ 2010-09-08 14:51 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs

On Wed, Sep 08, 2010 at 03:38:53PM +0200, Michael Monnerie wrote:
> On Wednesday, 8 September 2010, Dave Chinner wrote:
> > >  On machines with 32MiB or more 32k is the default, but most
> > >  machines these days have multi-gigabytes of RAM, so at least for
> > >  RAM>1GiB that could be made default.
> > 
> > That is definitely not true. XFS is widely used in the embedded NAS
> > space, where memory is very limited and might be configured with
> > many filesystems.  32k is the default because those sorts of machines
> > can't afford to burn 2MB RAM per filesystem just in log buffers.
> >
> > Also, you can go and search the archives or git history as to why we
> > don't tune the logbsize based on physical memory size anymore, too.
> 
> OK, then the man page should be updated to reflect this "newer logic". 
> I've got the information directly from there.
>  
> > You're getting the wrong information there. largeio affects the
> > output of the optimal IO size reported by stat(2). 'stat -f" does
> > a statfs(2) call. Try 'stat /disk/db/<file> --format %o'....
> 
> Ah, that's better, thank you :-)
>  
> > >  And while I am at it: Why does "mount" not provide the su=/sw=
> > >   options that we can use to create a filesystem? Would make life
> > >   easier, as it's much easier to read su=64k,sw=7 than
> > >   sunit=128,swidth=896.
> > 
> > You should never, ever need to use the mount options.
> 
> ..except when a disk is added to the RAID, or its RAID level gets 
> changed. Then sw=7 becomes sw=8 or so - or rather, it would become that, 
> because then you must use the (strange and error-prone, I'd say) 
> sunit/swidth semantics instead.

Dynamically changing the RAID array geometry is a Bad Idea.  Yes,
you can do it, but if you've got a filesystem full of data and
metadata aligned to the old geometry then after the modification
it won't be aligned anymore.

If you want to do this, then either don't bother about geometry hints
in the first place, or dump, rebuild the array, mkfs and restore so
everything is properly aligned with the new world order. Hell,
dump/mkfs/restore might even be faster than reshaping a large
array...
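
To illustrate the alignment point with hypothetical numbers (a sketch, 
not tied to any particular array):

```python
# Why reshaping breaks alignment: an extent laid out on a full-stripe
# boundary of the old geometry no longer starts on a stripe boundary
# under the new one. Assumed figures: 64k chunk, 7 -> 8 data disks.
chunk = 64 * 1024
old_swidth = chunk * 7          # full stripe width before the reshape
new_swidth = chunk * 8          # full stripe width after adding a disk

# An extent that was perfectly stripe-aligned before the reshape:
offset = 3 * old_swidth         # starts exactly on the 4th old stripe
print(offset % old_swidth)      # 0 -> aligned before
print(offset % new_swidth)      # non-zero -> misaligned after
```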

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08 14:51                 ` Dave Chinner
@ 2010-09-08 15:24                   ` Emmanuel Florac
  2010-09-08 23:34                     ` Michael Monnerie
  2010-09-08 23:30                   ` Michael Monnerie
  1 sibling, 1 reply; 24+ messages in thread
From: Emmanuel Florac @ 2010-09-08 15:24 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Michael Monnerie, xfs

On Thu, 9 Sep 2010 00:51:48 +1000
Dave Chinner <david@fromorbit.com> wrote:

>  Hell,
> dump/mkfs/restore might even be faster than reshaping a large
> array...

True, and it takes incredibly long. Adding two disks to an 8-drive array
can easily take 72 hours.
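
A rough sketch of why it takes that long (the sustained rate and array 
size here are assumptions, not measured figures):

```python
# Reshape-time estimate: md reshapes often sustain only a few tens of
# MB/s, because every stripe must be read, recomputed and rewritten.
array_bytes = 8 * 10**12         # assumed: ~8 TB of array capacity
reshape_rate = 30 * 10**6        # assumed: ~30 MB/s sustained reshape speed
hours = array_bytes / reshape_rate / 3600
print(round(hours))              # 74 -> about three days for one pass
```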

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08 14:51                 ` Dave Chinner
  2010-09-08 15:24                   ` Emmanuel Florac
@ 2010-09-08 23:30                   ` Michael Monnerie
  2010-09-09  7:27                     ` Dave Chinner
  1 sibling, 1 reply; 24+ messages in thread
From: Michael Monnerie @ 2010-09-08 23:30 UTC (permalink / raw)
  To: xfs


On Wednesday, 8 September 2010 Dave Chinner wrote:
> Dynamically changing the RAID array geometry is a Bad Idea.  Yes,
> you can do it, but if you've got a filesystem full of data and
> metadata aligned to the old geometry then after the modification
> it won't be aligned anymore.
> 
> > If you want to do this, then either don't bother about geometry hints
> in the first place, or dump, rebuild the array, mkfs and restore so
> everything is properly aligned with the new world order. Hell,
> dump/mkfs/restore might even be faster than reshaping a large
> array...
 
You're right. But some customers don't want to spend the money on a 
2nd array, and can't afford the downtime of backup, RAID rebuild 
(which takes 8-48 hours) and restore. So an online upgrade is needed. 
We're not in an ideal world.


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08 15:24                   ` Emmanuel Florac
@ 2010-09-08 23:34                     ` Michael Monnerie
  0 siblings, 0 replies; 24+ messages in thread
From: Michael Monnerie @ 2010-09-08 23:34 UTC (permalink / raw)
  To: xfs


On Wednesday, 8 September 2010 Emmanuel Florac wrote:
> True, this is incredibly long. Adding two disks to an 8 drives array
> easily needs 72 hours.
 
Agreed. But how long does the process
- backup
- build new RAID with more disks
- restore
take on the same storage? It's not that much faster, and during the 
RAID setup you can't work, if you are very serious about your data. 
Yes, you can build a RAID in the background and already move data onto 
it, but your data is not protected during that time. I don't take that 
risk.


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-08 23:30                   ` Michael Monnerie
@ 2010-09-09  7:27                     ` Dave Chinner
  2010-09-09  8:29                       ` Michael Monnerie
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2010-09-09  7:27 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs

On Thu, Sep 09, 2010 at 01:30:15AM +0200, Michael Monnerie wrote:
> On Wednesday, 8 September 2010 Dave Chinner wrote:
> > Dynamically changing the RAID array geometry is a Bad Idea.  Yes,
> > you can do it, but if you've got a filesystem full of data and
> > metadata aligned to the old geometry then after the modification
> > it won't be aligned anymore.
> > 
> > If you want to do this, then either don't bother about geometry hints
> > in the first place, or dump, rebuild the array, mkfs and restore so
> > everything is properly aligned with the new world order. Hell,
> > dump/mkfs/restore might even be faster than reshaping a large
> > array...
>  
> You're right. But there are some customers who don't want to spend the 
> money for a 2nd array, and can't afford the downtime of backup, rebuild 
> raid (takes 8-48 hours), restore. So an online upgrade is needed. We're 
> not in an ideal world.

If you can't afford downtime, then I'd seriously question using
reshaping to expand storage, because it is one of the highest-risk
methods of increasing storage capacity you can use. That means
you still have to do the backup before you reshape your RAID
device, in case reshaping fails and you then need to rebuild + restore.

Reshaping is a dangerous operation - you can't go back once it has
started, and failures while reshaping can cause data loss. That is,
the risk of catastrophic failure goes up significantly while a
reshape is in progress. This is the same increase in risk as for
failures occurring during a rebuild after losing a disk - the next disk
failure is most likely to occur while the rebuild is in progress,
simply because of the sustained increase in load on the drives.

That is, if you have SATA drives then running them for 3 or 4 days
at 100% duty cycle while a reshape takes place is putting them far
outside their design limits. SATA drives are generally designed for
a 20-30% duty cycle for sustained operation. Put disks that are a
couple of years old under this sort of load....

Of even more concern is that reshaping a multi-terabyte array
requires moving the same order of magnitude of bits around as the
BER of the drives. Hence there's every chance of introducing silent
bit errors into your data by reshaping unless you further slow the
reshape down by having it read back all the data to verify it was
reshaped correctly.
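
A back-of-the-envelope sketch of that order-of-magnitude claim, with 
hypothetical but typical spec-sheet numbers:

```python
# Consumer SATA drives are typically specced at about one unrecoverable
# read error per 1e14 bits. A full reshape of a multi-terabyte array
# moves a comparable number of bits, so the expected error count per
# pass is no longer negligible. Figures below are assumptions.
ber = 1e14                       # bits per unrecoverable error (spec sheet)
array_bytes = 8 * 2 * 10**12     # e.g. 8 x 2 TB drives, one full pass
bits_moved = array_bytes * 8
expected_errors = bits_moved / ber
print(expected_errors)           # ~1.28 expected errors for one pass
```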

IMO, reshaping is not a practice you should be designing your
capacity upgrade processes around, especially if you have uptime and
performance SLA guarantees. It's a very risky operation, and not
something I would suggest anyone use in production unless they have
absolutely no other option.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-09  7:27                     ` Dave Chinner
@ 2010-09-09  8:29                       ` Michael Monnerie
  0 siblings, 0 replies; 24+ messages in thread
From: Michael Monnerie @ 2010-09-09  8:29 UTC (permalink / raw)
  To: xfs


On Thursday, 9 September 2010 Dave Chinner wrote:
> That is, if you have SATA drives then running them for 3 or 4 days
> at 100% duty cycle while a reshape takes place is putting them far
> outside their design limits.
 
Your arguments are all valid. Our hardware supplier recommended one 
drive type; we only buy those, and they really do work very well. As 
for the BER, I really hope the controller does read-after-write during 
a rebuild. Given the time it takes to resize an array, I think and 
hope it does. That said, I've never had a problem on resize, and hope 
it stays that way :-)

There are other reasons to do a resize, which I don't want to discuss in 
public.


* Re: xfs mount/create options (was: XFS status update for August 2010)
  2010-09-06 22:55 xfs mount/create options (was: XFS status update for August 2010) Richard Scobie
@ 2010-09-06 23:31 ` Michael Monnerie
  0 siblings, 0 replies; 24+ messages in thread
From: Michael Monnerie @ 2010-09-06 23:31 UTC (permalink / raw)
  To: xfs


On Tuesday, 7 September 2010 Richard Scobie wrote:
> > When I defined su/sw on mkfs, is it enough, or would I always have
> > to specify sunit/swidth with every mount too?
> 
> Yes. sunit/swidth only needs to be added to your fstab if you either
>  got  the calculation wrong when you initially created the fs and
>  wish to correct it, or if you grow the fs later over more drives.

Thank you.
 
> Note that with recent kernels, mkfs.xfs will choose the optimal 
> sunit/swidth for you if you are using md RAID or LVM (I believe the 
> latter is correct).

I don't use software RAID, but thanks for clarification.


* Re: xfs mount/create options (was: XFS status update for August 2010)
@ 2010-09-06 22:55 Richard Scobie
  2010-09-06 23:31 ` Michael Monnerie
  0 siblings, 1 reply; 24+ messages in thread
From: Richard Scobie @ 2010-09-06 22:55 UTC (permalink / raw)
  To: xfs

Michael Monnerie wrote:

 > When I defined su/sw on mkfs, is it enough, or would I always have to
 > specify sunit/swidth with every mount too?

Yes, mkfs is enough. sunit/swidth only needs to be added to your fstab 
if you either got the calculation wrong when you initially created the 
fs and wish to correct it, or if you grow the fs later over more drives.

Note that with recent kernels, mkfs.xfs will choose the optimal 
sunit/swidth for you if you are using md RAID or LVM (I believe the 
latter is correct).

Regards,

Richard


end of thread, other threads:[~2010-09-09  8:28 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-02 14:59 XFS status update for August 2010 Christoph Hellwig
2010-09-02 14:59 ` Christoph Hellwig
2010-09-05  7:44 ` Willy Tarreau
2010-09-05  7:44   ` Willy Tarreau
2010-09-05  9:37   ` Michael Monnerie
2010-09-05 10:47     ` Willy Tarreau
2010-09-05 13:08       ` Dave Chinner
2010-09-05 18:56         ` Willy Tarreau
2010-09-05 23:36           ` Dave Chinner
2010-09-06  5:19             ` Willy Tarreau
2010-09-06  5:49         ` xfs mount/create options (was: XFS status update for August 2010) Michael Monnerie
2010-09-08  5:38           ` Michael Monnerie
2010-09-08 10:58             ` Dave Chinner
2010-09-08 13:38               ` Michael Monnerie
2010-09-08 14:51                 ` Dave Chinner
2010-09-08 15:24                   ` Emmanuel Florac
2010-09-08 23:34                     ` Michael Monnerie
2010-09-08 23:30                   ` Michael Monnerie
2010-09-09  7:27                     ` Dave Chinner
2010-09-09  8:29                       ` Michael Monnerie
2010-09-06  3:22     ` XFS status update for August 2010 Eric Sandeen
2010-09-06  5:10       ` Michael Monnerie
2010-09-06 22:55 xfs mount/create options (was: XFS status update for August 2010) Richard Scobie
2010-09-06 23:31 ` Michael Monnerie
