* Strange problems with xfs on SLES11 SP2
@ 2012-05-11 12:41 Hammer, Marcus
  2012-05-11 16:07 ` Ben Myers
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Hammer, Marcus @ 2012-05-11 12:41 UTC (permalink / raw)
  To: xfs

Hello,

We have upgraded from SLES11 SP1 to SLES11 SP2. We use an exotic ERP system, which stores its data in CISAM files; these live on several mounted xfs filesystems (/disk2, /disk3, /disk4, /disk5 and /disk6).
The machine is a Dell R910 with 256 GB RAM running SLES11 SP2 (before that we used SLES11 SP1), so we also got the new 3.0 kernel with the upgrade. The xfs mounts are LUNs on a NetApp storage system, mapped via fibre channel to the
Linux host. We also use multipathd to have several paths to the NetApp LUNs.

After the upgrade to SLES11 SP2 we noticed a strange change on the xfs filesystem /disk5:

/disk5 is an xfs filesystem that is accessed frequently by the ERP system. Its disk usage increased from 53% to 76-78%. But only the disk usage changed; the sizes of the files are exactly the same. The fragmentation factor increased to 96%:

linuxsrv1:/disk4/ifax/0000 # xfs_db -c frag -r /dev/mapper/360a98000486e59384b34497248694170
actual 56156, ideal 2014, fragmentation factor 96.41%

linuxsrv1:/disk4/ifax/0000 # xfs_info /dev/mapper/360a98000486e59384b34497248694170
meta-data=/dev/mapper/360a98000486e59384b34497248694170 isize=256    agcount=21, agsize=3276800 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=68157440, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=25600, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0


The xfs mounts in fstab use no special options or optimizations; here is the snippet from /etc/fstab:

/dev/mapper/360a98000486e59384b3449714a47336c   /disk2  xfs     defaults        0 2
/dev/mapper/360a98000486e59384b34497247514a56   /disk3  xfs     defaults        0 2
/dev/mapper/360a98000486e59384b34497248694170   /disk4  xfs     defaults        0 2
/dev/mapper/360a98000486e59384b344972486f6d4e   /disk5  xfs     defaults        0 2
/dev/mapper/360a98000486e59384b3449724e6f4266   /disk6  xfs     defaults        0 2
/dev/mapper/360a98000486e59384b3449724f326662   /opt/usr        xfs     defaults        0 2

But something must have changed in xfs, because the metadata overhead has increased massively; we never saw this with SLES11 SP1.

I did a defragmentation with xfs_fsr and the metadata overhead and disk usage dropped back to 53%. But after one hour in production we are again at 76-78% disk usage and the same fragmentation.

So my question is: what has changed from the 2.6 kernels to the 3.0 kernels that could explain this massive increase in metadata? (During the defrag I saw sometimes more than 140,000 extents on a single inode.)
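
For reference, per-file extent counts can be listed with xfs_bmap (from xfsprogs); a rough sketch, where the path is only an example and hole records are counted as well:

for f in /disk5/ifax/0000/*; do
    printf '%s: %s extent records\n' "$f" "$(xfs_bmap "$f" | tail -n +2 | wc -l)"
done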

I am completely confused and do not know how to handle this. Perhaps somebody can help me fix this problem or understand what is happening here.
I also talked to some NetApp engineers and they said I should ask at xfs.org.

On the filesystem there are about 727 CISAM files (IDX -> index files and DAT -> data files). There are ten 15 GB files in which small pieces of content are frequently changed by the ERP system; the rest of the files are smaller than 400 MB.
We have seen this problem since the upgrade to SLES11 SP2 and the new kernel 3.0. (By the way, we had to disable transparent hugepage support in kernel 3.0 because of kernel crashes ;) - but that is a different story...)

--
Mit freundlichen Grüßen/Kind regards

M.  Hammer
System administration
Information Technology

AUMA Riester GmbH & Co. KG
Aumastr. 1 • 79379 Muellheim/Germany
Tel/Phone +49 7631 809-1620 • Fax +49 7631 809-71620
HammerM@auma.com<mailto:hammerm@auma.com> • www.auma.com<http://www.auma.com/>

Sitz: Müllheim, Registergericht Freiburg HRA 300276
phG: AUMA Riester Verwaltungsgesellschaft mbH, Sitz: Müllheim, Registergericht Freiburg HRB 300424
Geschäftsführer: Matthias Dinse, Henrik Newerla

Registered Office: Muellheim, court of registration: Freiburg HRA 300276
phG: Riester Verwaltungsgesellschaft mbH Registered Office: Muellheim, court of registration: Freiburg HRB 300424
Managing Directors: Matthias Dinse, Henrik Newerla




_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: Strange problems with xfs on SLES11 SP2
  2012-05-11 12:41 Strange problems with xfs on SLES11 SP2 Hammer, Marcus
@ 2012-05-11 16:07 ` Ben Myers
  2012-05-11 16:36 ` Stefan Ring
  2012-05-12  4:09 ` Eric Sandeen
  2 siblings, 0 replies; 4+ messages in thread
From: Ben Myers @ 2012-05-11 16:07 UTC (permalink / raw)
  To: Hammer, Marcus; +Cc: xfs

Hey Marcus,

On Fri, May 11, 2012 at 02:41:41PM +0200, Hammer, Marcus wrote:
...
> So my question is: what has changed from the 2.6 kernels to the 3.0 kernels
> that could explain this massive increase in metadata? (During the defrag I
> saw sometimes more than 140,000 extents on a single inode.)
> 
> I am completely confused and do not know how to handle this. Perhaps somebody
> can help me fix this problem or understand what is happening here. I also
> talked to some NetApp engineers and they said I should ask at xfs.org.

You should also open a case with SuSE.

> On the filesystem there are about 727 CISAM files (IDX -> index files and
> DAT -> data files). There are ten 15 GB files in which small pieces of
> content are frequently changed by the ERP system; the rest of the files are
> smaller than 400 MB. We have seen this problem since the upgrade to SLES11
> SP2 and the new kernel 3.0. (By the way, we had to disable transparent
> hugepage support in kernel 3.0 because of kernel crashes ;) - but that is a
> different story...)

So when you upgraded you left the filesystems in place.  This is not a
dump/restore situation, and the increased fragmentation that you're seeing has
happened with subsequent runs of your application, correct?

-Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: Strange problems with xfs on SLES11 SP2
  2012-05-11 12:41 Strange problems with xfs on SLES11 SP2 Hammer, Marcus
  2012-05-11 16:07 ` Ben Myers
@ 2012-05-11 16:36 ` Stefan Ring
  2012-05-12  4:09 ` Eric Sandeen
  2 siblings, 0 replies; 4+ messages in thread
From: Stefan Ring @ 2012-05-11 16:36 UTC (permalink / raw)
  To: Hammer, Marcus; +Cc: xfs

> /disk5 is an xfs filesystem that is accessed frequently by the ERP system. Its disk usage increased from 53% to 76-78%. But only the disk usage changed; the sizes of the files are exactly the same.

Is it possible that this is caused by the aggressive pre-allocation?
There is a mount option (allocsize, I think) that allows you to
override this behavior.
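
For example (untested, and 64k is only a placeholder - pick a value that suits
the application's typical write size), the /disk5 line in /etc/fstab could
become:

/dev/mapper/360a98000486e59384b344972486f6d4e   /disk5  xfs     defaults,allocsize=64k  0 2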

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: Strange problems with xfs on SLES11 SP2
  2012-05-11 12:41 Strange problems with xfs on SLES11 SP2 Hammer, Marcus
  2012-05-11 16:07 ` Ben Myers
  2012-05-11 16:36 ` Stefan Ring
@ 2012-05-12  4:09 ` Eric Sandeen
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Sandeen @ 2012-05-12  4:09 UTC (permalink / raw)
  To: Hammer, Marcus; +Cc: xfs

On 5/11/12 7:41 AM, Hammer, Marcus wrote:
> Hello,
> 
> We have upgraded from SLES11 SP1 to SLES11 SP2. We use an exotic ERP
> system, which stores its data in CISAM files; these live on several
> mounted xfs filesystems (/disk2, /disk3, /disk4, /disk5 and /disk6).
> The machine is a Dell R910 with 256 GB RAM running SLES11 SP2 (before
> that we used SLES11 SP1), so we also got the new 3.0 kernel with the
> upgrade. The xfs mounts are LUNs on a NetApp storage system, mapped via
> fibre channel to the Linux host. We also use multipathd to have several
> paths to the NetApp LUNs.
> 
> After the upgrade to SLES11 SP2 we noticed a strange change on the xfs filesystem /disk5:
> 
> /disk5 is an xfs filesystem that is accessed frequently by the ERP
> system. Its disk usage increased from 53% to 76-78%.

as measured by df?  This probably is the somewhat aggressive preallocation, as
Stefan suggested in another email.
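
A quick way to check (just a sketch): compare allocated space with apparent
file sizes - with speculative preallocation, df and plain du will report
considerably more than the files' nominal sizes.

df -h /disk5
du -sh /disk5                    # allocated blocks
du -sh --apparent-size /disk5    # nominal file sizes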

> But only the disk usage changed; the sizes of the files are exactly
> the same. The fragmentation factor increased to 96%:
> 
> linuxsrv1:/disk4/ifax/0000 # xfs_db -c frag -r /dev/mapper/360a98000486e59384b34497248694170
> actual 56156, ideal 2014, fragmentation factor 96.41%

so on average, about 28 extents per file (56156 / 2014 ≈ 27.9).  And what was it before?

See also http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25.__Is_that_bad.3F

> linuxsrv1:/disk4/ifax/0000 # xfs_info /dev/mapper/360a98000486e59384b34497248694170
> meta-data=/dev/mapper/360a98000486e59384b34497248694170 isize=256    agcount=21, agsize=3276800 blks
>          =                       sectsz=512   attr=0
> data     =                       bsize=4096   blocks=68157440, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=25600, version=1
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Ben, with logv1 and 21 AGs it must be an older, migrated fs :)

> 
> The xfs mounts in fstab use no special options or optimizations; here is the snippet from /etc/fstab:
> 
> /dev/mapper/360a98000486e59384b3449714a47336c   /disk2  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b34497247514a56   /disk3  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b34497248694170   /disk4  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b344972486f6d4e   /disk5  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b3449724e6f4266   /disk6  xfs     defaults        0 2
> /dev/mapper/360a98000486e59384b3449724f326662   /opt/usr        xfs     defaults        0 2
> 
> But something must have changed in xfs, because the metadata overhead
> has increased massively; we never saw this with SLES11 SP1.

How are you measuring "the metadata increase"?  I'm not sure what you mean by this.

> I did a defragmentation with xfs_fsr and the metadata overhead and
> disk usage dropped back to 53%. But after one hour in production we
> are again at 76-78% disk usage and the same fragmentation.
> 
> So my question is: what has changed from the 2.6 kernels to the 3.0
> kernels that could explain this massive increase in metadata? (During
> the defrag I saw sometimes more than 140,000 extents on a single inode.)

How are the files being written?  Do they grow, are they sparse, direct IO
or buffered, etc?
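
If it isn't obvious from the application, a short strace of the ERP process
should show the write pattern (just a sketch; <PID> is whatever the ERP
server runs as):

strace -f -tt -e trace=desc -p <PID> -o /tmp/erp-io.log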

> I am completely confused and do not know how to handle this. Perhaps
> somebody can help me fix this problem or understand what is happening
> here.
> I also talked to some NetApp engineers and they said I should ask
> at xfs.org.
> 
> On the filesystem there are about 727 CISAM files (IDX -> index files
> and DAT -> data files). There are ten 15 GB files in which small
> pieces of content are frequently changed by the ERP system; the rest
> of the files are smaller than 400 MB.
> We have seen this problem since the upgrade to SLES11 SP2 and the new
> kernel 3.0. (By the way, we had to disable transparent hugepage
> support in kernel 3.0 because of kernel crashes ;) - but that is a
> different story...)

You can defeat the speculative preallocation by mounting with the
allocsize option, if you want to test that theory.
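
Roughly like this for a one-off test (64k is only an example value; the
device path is taken from your fstab):

umount /disk5
mount -o allocsize=64k /dev/mapper/360a98000486e59384b344972486f6d4e /disk5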

-Eric


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

