* Help on ext4/xattr linux kernel stability issue / ceph xattr use?
@ 2015-11-09  9:41 Laurent GUERBY
  2015-11-09 13:24 ` Sage Weil
From: Laurent GUERBY @ 2015-11-09  9:41 UTC
  To: ceph-devel

Hi,

Part of our ceph cluster is using ext4 and we recently hit major kernel
instability in the form of kernel lockups every few hours, issues
opened:

http://tracker.ceph.com/issues/13662
https://bugzilla.kernel.org/show_bug.cgi?id=107301

On kernel.org, kernel developers are asking about ceph usage of xattr,
in particular whether there are lots of common xattr key/value pairs or
whether they are all different.

I attached a file with various xattr -l outputs:

https://bugzilla.kernel.org/show_bug.cgi?id=107301#c8
https://bugzilla.kernel.org/attachment.cgi?id=192491

Looks like the "big" xattr "user.ceph._" is always different, same for
the intermediate size "user.ceph.hinfo_key".

"user.cephos.spill_out" and "user.ceph.snapset" seem to have small
values, and within a small value set.
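
For anyone wanting to repeat this kind of survey on their own OSDs, here
is a minimal Python sketch (not the script used for the attachment
above). It assumes Python 3 on Linux, where os.listxattr and os.getxattr
are available; read_xattrs and summarize are hypothetical helper names.
For each xattr name it reports how many files carry it, how many distinct
values were seen, and the largest value size.

```python
import os
from collections import defaultdict


def read_xattrs(path):
    """Return {name: value_bytes} for one file (Linux only)."""
    return {name: os.getxattr(path, name) for name in os.listxattr(path)}


def summarize(per_file):
    """Summarize xattr diversity.

    per_file: iterable of {name: bytes} dicts, one per object file.
    Returns {name: (files_seen, distinct_values, max_value_len)}.
    """
    counts = defaultdict(int)
    values = defaultdict(set)
    max_len = defaultdict(int)
    for attrs in per_file:
        for name, value in attrs.items():
            counts[name] += 1
            values[name].add(value)
            max_len[name] = max(max_len[name], len(value))
    return {n: (counts[n], len(values[n]), max_len[n]) for n in counts}
```

A name whose distinct-value count stays close to its file count (as
observed here for "user.ceph._") has essentially unique values; a count
close to 1 points to a small shared value set.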

Our cluster is used exclusively for virtual machine block devices with
rbd, on replicated (3) and erasure coded pools (4+1 and 8+2).

Could someone knowledgeable add some information on ceph use of xattr in
the kernel.org bugzilla above?

Also, I think it is necessary to warn ceph users to avoid ext4 at all
costs until this kernel/ceph issue is sorted out: we went from
relatively stable production for more than a year to crashes everywhere,
all the time, over the past two weeks, probably after hitting some magic
limit. We migrated our machines to Ubuntu Trusty and our SSD-based
filesystems to XFS, but our HDDs are still mostly on ext4 (60 TB
of data to move, so not that easy...).

Thanks in advance for your help,

Sincerely,

Laurent GUERBY
http://tetaneutral.net




* Re: Help on ext4/xattr linux kernel stability issue / ceph xattr use?
  2015-11-09  9:41 Help on ext4/xattr linux kernel stability issue / ceph xattr use? Laurent GUERBY
@ 2015-11-09 13:24 ` Sage Weil
  2015-11-09 14:19   ` Laurent GUERBY
From: Sage Weil @ 2015-11-09 13:24 UTC
  To: Laurent GUERBY; +Cc: ceph-devel

On Mon, 9 Nov 2015, Laurent GUERBY wrote:
> Hi,
> 
> Part of our ceph cluster is using ext4 and we recently hit major kernel
> instability in the form of kernel lockups every few hours, issues
> opened:
> 
> http://tracker.ceph.com/issues/13662
> https://bugzilla.kernel.org/show_bug.cgi?id=107301
> 
> On kernel.org, kernel developers are asking about ceph usage of xattr,
> in particular whether there are lots of common xattr key/value pairs or
> whether they are all different.
> 
> I attached a file with various xattr -l outputs:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=107301#c8
> https://bugzilla.kernel.org/attachment.cgi?id=192491
> 
> Looks like the "big" xattr "user.ceph._" is always different, same for
> the intermediate size "user.ceph.hinfo_key".
> 
> "user.cephos.spill_out" and "user.ceph.snapset" seem to have small
> values, and within a small value set.
> 
> Our cluster is used exclusively for virtual machine block devices with
> rbd, on replicated (3) and erasure coded pools (4+1 and 8+2).
> 
> Could someone knowledgeable add some information on ceph use of xattr in
> the kernel.org bugzilla above?

The above is all correct.  The mbcache (didn't know that existed!) is 
definitely not going to be useful here.
 
> Also, I think it is necessary to warn ceph users to avoid ext4 at all
> costs until this kernel/ceph issue is sorted out: we went from
> relatively stable production for more than a year to crashes everywhere,
> all the time, over the past two weeks, probably after hitting some magic
> limit. We migrated our machines to Ubuntu Trusty and our SSD-based
> filesystems to XFS, but our HDDs are still mostly on ext4 (60 TB
> of data to move, so not that easy...).

Was there a ceph upgrade in there somewhere?  The size of the user.ceph._ 
xattr has increased over time, and (somewhat) recently crossed the 255 
byte threshold (on average) which also triggered a performance regression 
on XFS...
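
One way to check where a given OSD sits relative to that threshold is
sketched below in Python. This is an illustration only, assuming Python 3
on Linux; attr_sizes and size_report are hypothetical names, and the root
path passed in would be whatever directory holds the OSD object files.

```python
import os

ATTR = "user.ceph._"  # xattr name taken from this thread
LIMIT = 255           # byte threshold mentioned in this thread


def attr_sizes(root, attr=ATTR):
    """Yield the size in bytes of `attr` for each file under root that has it."""
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            try:
                yield len(os.getxattr(os.path.join(dirpath, fname), attr))
            except OSError:
                continue  # attribute missing, or xattrs unsupported here


def size_report(sizes, limit=LIMIT):
    """Return count, mean size, and number of values larger than limit."""
    sizes = list(sizes)
    if not sizes:
        return {"count": 0, "mean": 0.0, "over_limit": 0}
    return {
        "count": len(sizes),
        "mean": sum(sizes) / len(sizes),
        "over_limit": sum(1 for s in sizes if s > limit),
    }
```

Calling size_report(attr_sizes(root)) on an object directory gives the
average size and how many objects exceed the limit.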

sage



* Re: Help on ext4/xattr linux kernel stability issue / ceph xattr use?
  2015-11-09 13:24 ` Sage Weil
@ 2015-11-09 14:19   ` Laurent GUERBY
From: Laurent GUERBY @ 2015-11-09 14:19 UTC
  To: Sage Weil; +Cc: ceph-devel

On Mon, 2015-11-09 at 05:24 -0800, Sage Weil wrote:
> The above is all correct.  The mbcache (didn't know that existed!) is 
> definitely not going to be useful here.
> > Also, I think it is necessary to warn ceph users to avoid ext4 at all
> > costs until this kernel/ceph issue is sorted out: we went from
> > relatively stable production for more than a year to crashes everywhere,
> > all the time, over the past two weeks, probably after hitting some magic
> > limit. We migrated our machines to Ubuntu Trusty and our SSD-based
> > filesystems to XFS, but our HDDs are still mostly on ext4 (60 TB
> > of data to move, so not that easy...).
> 
> Was there a ceph upgrade in there somewhere?  The size of the user.ceph._ 
> xattr has increased over time, and (somewhat) recently crossed the 255 
> byte threshold (on average) which also triggered a performance regression 
> on XFS...


Hi Sage,

Thanks for the confirmation.

The history of our cluster is:
- initial cluster on ceph 0.80.7 (September 2014), on Debian with ext4
  since xfs and btrfs were crashing on debian/ceph
- upgraded to 0.87 (December 2014)
- upgraded to 0.94.2 (June 2015)
- on October 26, 2015 we had two disk failures in one night; we replaced
  the disks but started to see random machine freezes during
  and after the recovery. We upgraded to 0.94.5 to be able to restart
  two of our OSDs because of:
  http://tracker.ceph.com/issues/13594
- after changing various hardware parts and adding new machines,
  we started to suspect ceph/ext4, so we migrated all
  our machines to Ubuntu Trusty and all SSDs to XFS, leaving
  60 TB of data on rotational ext4 (too long to migrate)

During the whole time the cluster and its data kept expanding,
from 4 machines and 2 TB to 11 machines and 60 TB of data now
(~75% full).

I have lightly tested a rebuild of the ubuntu trusty 3.19
kernel with the ext4 mbcache code removed, patch here:
https://bugzilla.kernel.org/show_bug.cgi?id=107301#c6

But now we have to decide whether to go live with it.

Sincerely,

Laurent


