* Filesystems for ceph
@ 2011-07-27 17:52 Christian Brunner
  2011-07-27 22:21 ` Fyodor Ustinov
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Christian Brunner @ 2011-07-27 17:52 UTC
  To: ceph-devel

We are having quite a few problems with the underlying filesystem for
the cosds and I would like to hear about other people's experiences. Here is
what we have gone through so far:

btrfs with 2.6.38:

- good performance
- frequently hitting various BUG_ON conditions

btrfs with 2.6.39:

- big performance problems after a few days uptime
- occasionally hitting BUG_ON conditions

btrfs with 3.0:

- big performance problems after a few days uptime
- occasionally hitting a deadlock in the btrfs filesystem (cosd is in D-state)

ext4 with a RHEL6.0 kernel (don't remember exactly):

- almost immediate blowup of the kernel (OOPS)

From what I read in Fyodor's emails, ext4 in 2.6.39 isn't much better.

So, what filesystem would you recommend?

Thanks,
Christian
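
A process stuck in D-state (uninterruptible sleep), as reported for cosd
above, shows up in the state field of /proc/<pid>/stat. The following is a
minimal Python sketch for listing such processes on Linux. The function
names are hypothetical; this is not part of Ceph or of any kernel tooling,
and it only reports what is in D-state at the instant of the scan.

```python
import os
from typing import List, Tuple


def parse_stat_state(stat_line: str) -> Tuple[str, str]:
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' in the line.
    left, _, right = stat_line.rpartition(")")
    comm = left.split("(", 1)[1]
    state = right.split()[0]
    return comm, state


def find_d_state(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, comm) for every process currently in state 'D'."""
    hung = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return hung  # no procfs available (assumption: non-Linux host)
    for entry in entries:
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                comm, state = parse_stat_state(f.read())
        except (OSError, IndexError):
            continue  # process exited mid-scan or is unreadable
        if state == "D":
            hung.append((int(entry), comm))
    return hung


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

A process that stays in this list across repeated scans is a candidate for
the kind of filesystem deadlock described above; a single hit can simply be
ordinary disk I/O in progress.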


* Re: Filesystems for ceph
  2011-07-27 17:52 Filesystems for ceph Christian Brunner
@ 2011-07-27 22:21 ` Fyodor Ustinov
  2011-07-27 23:33   ` Sage Weil
  2011-07-28  0:04 ` Gregory Farnum
  2011-07-28  4:39 ` Sage Weil
  2 siblings, 1 reply; 10+ messages in thread
From: Fyodor Ustinov @ 2011-07-27 22:21 UTC
  To: ceph-devel

Christian Brunner <chb <at> muc.de> writes:

> 
> From what I read in Fyodors emails ext4 in 2.6.39 isn't much better.
> 
> So, what filesystem would you recommend?
> 

Christian, IMHO ext3/ext4/btrfs are unusable for ceph. I tested xfs for some
time, and am thinking of using it again. But I think that the ceph developers
do not like xfs :)

WBR,
    Fyodor.





* Re: Filesystems for ceph
  2011-07-27 22:21 ` Fyodor Ustinov
@ 2011-07-27 23:33   ` Sage Weil
  2011-07-28  4:23     ` Fyodor Ustinov
  0 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2011-07-27 23:33 UTC
  To: Fyodor Ustinov; +Cc: ceph-devel

On Wed, 27 Jul 2011, Fyodor Ustinov wrote:
> Christian Brunner <chb <at> muc.de> writes:
> 
> > 
> > From what I read in Fyodors emails ext4 in 2.6.39 isn't much better.
> > 
> > So, what filesystem would you recommend?
> > 
> 
> Christian, IMHO ext3/ext4/btrfs unusable for ceph. I tested some time xfs, and 
> think to use it again. But I think that the ceph developers do not like xfs:)

What problems have you seen with ext4?

sage


* Re: Filesystems for ceph
  2011-07-27 17:52 Filesystems for ceph Christian Brunner
  2011-07-27 22:21 ` Fyodor Ustinov
@ 2011-07-28  0:04 ` Gregory Farnum
  2011-07-28  4:39 ` Sage Weil
  2 siblings, 0 replies; 10+ messages in thread
From: Gregory Farnum @ 2011-07-28  0:04 UTC
  To: chb; +Cc: ceph-devel

Our largest installation hasn't seen any problems with btrfs on
2.6.38.6, although it may not be as busy as the clusters you're
running on.

I would recommend trying a newish stable release though, since I think
the major releases are still introducing new features into btrfs, and
code churn always creates bugs.
-Greg

On Wed, Jul 27, 2011 at 10:52 AM, Christian Brunner <chb@muc.de> wrote:
> We are having quite some problems with the underlying filesystem for
> the cosd's and I would like to hear about other experiences. Here is
> what we have gone through so far:
>
> btrfs with 2.6.38:
>
> - good performance
> - frequently hitting of various BUG_ON conditions
>
> btrfs with 2.6.39:
>
> - big performance problems after a few days uptime
> - occasionally hitting BUG_ON conditions
>
> btrfs with 3.0:
>
> - big performance problems after a few days uptime
> - occasionally hitting a deadlock in the btrfs filesystem (cosd is in D-state)
>
> ext4 with a RHEL6.0 kernel (don't remember exactly):
>
> - almost immediate blowup of the kernel (OOPS)
>
> From what I read in Fyodors emails ext4 in 2.6.39 isn't much better.
>
> So, what filesystem would you recommend?
>
> Thanks,
> Christian
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Filesystems for ceph
  2011-07-27 23:33   ` Sage Weil
@ 2011-07-28  4:23     ` Fyodor Ustinov
  2011-07-28  4:35       ` Sage Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Fyodor Ustinov @ 2011-07-28  4:23 UTC
  To: Sage Weil; +Cc: ceph-devel

On 07/28/2011 02:33 AM, Sage Weil wrote:
> On Wed, 27 Jul 2011, Fyodor Ustinov wrote:
>> Christian Brunner<chb<at>  muc.de>  writes:
>>
>>>  From what I read in Fyodors emails ext4 in 2.6.39 isn't much better.
>>>
>>> So, what filesystem would you recommend?
>>>
>> Christian, IMHO ext3/ext4/btrfs unusable for ceph. I tested some time xfs, and
>> think to use it again. But I think that the ceph developers do not like xfs:)
> What problems have you seen with ext4?
See my message to Greg. In a nutshell: I again have a broken ext4 fs
on OSD servers with the 2.6.39 kernel.

WBR,
     Fyodor.



* Re: Filesystems for ceph
  2011-07-28  4:23     ` Fyodor Ustinov
@ 2011-07-28  4:35       ` Sage Weil
  2011-07-28 18:27         ` Fyodor Ustinov
  0 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2011-07-28  4:35 UTC
  To: Fyodor Ustinov; +Cc: ceph-devel

On Thu, 28 Jul 2011, Fyodor Ustinov wrote:
> On 07/28/2011 02:33 AM, Sage Weil wrote:
> > On Wed, 27 Jul 2011, Fyodor Ustinov wrote:
> > > Christian Brunner<chb<at>  muc.de>  writes:
> > > 
> > > >  From what I read in Fyodors emails ext4 in 2.6.39 isn't much better.
> > > > 
> > > > So, what filesystem would you recommend?
> > > > 
> > > Christian, IMHO ext3/ext4/btrfs unusable for ceph. I tested some time xfs,
> > > and
> > > think to use it again. But I think that the ceph developers do not like
> > > xfs:)
> > What problems have you seen with ext4?
>
> Look to my message to Greg. In a nutshell - I again have broken ext4 fs 
> on OSD servers with 2.6.39 kernel.

Oh, you mean the fsck errors?  Got it.  If you see it again (on another 
disk/machine, with a reasonably fresh fs) the ext4 guys will be interested 
in hearing about it.  It's probably in the xattr code, which ceph uses 
heavily but most people only set once per file (for selinux labels), if 
at all.

sage
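
The usage pattern described above (many xattr rewrites per file, rather than
one label set once) can be approximated with a small Python sketch. It
repeatedly overwrites a handful of user.* attributes with values of varying
length and then checks that the last value written to each name reads back
intact. The function name, attribute names, and parameters are all
hypothetical; this is only an illustration of the access pattern, not
Ceph's actual behaviour and not a reproducer for the reported ext4
corruption.

```python
import os
import random
import tempfile


def churn_xattrs(path, rounds=200, keys=8, max_len=256, seed=0):
    """Rewrite a few user.* xattrs many times, then verify them.

    Returns the number of attributes whose stored value differs from the
    last value written, or None if xattrs are unavailable on this
    platform or filesystem.
    """
    if not hasattr(os, "setxattr"):
        return None  # os.setxattr only exists on Linux
    rng = random.Random(seed)
    expected = {}
    try:
        for _ in range(rounds):
            # Hypothetical attribute names, chosen only for this sketch.
            name = "user.sketch.%d" % rng.randrange(keys)
            length = rng.randint(1, max_len)
            value = bytes(rng.getrandbits(8) for _ in range(length))
            os.setxattr(path, name, value)
            expected[name] = value
        return sum(
            1 for name, value in expected.items()
            if os.getxattr(path, name) != value
        )
    except OSError:
        return None  # filesystem rejects user xattrs or ran out of space


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(dir=".") as tmp:
        print(churn_xattrs(tmp.name))
```

A nonzero result would mean a value read back differently from what was
written. A result of zero says nothing about on-disk consistency, which only
fsck can check.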


* Re: Filesystems for ceph
  2011-07-27 17:52 Filesystems for ceph Christian Brunner
  2011-07-27 22:21 ` Fyodor Ustinov
  2011-07-28  0:04 ` Gregory Farnum
@ 2011-07-28  4:39 ` Sage Weil
  2011-07-28  7:35   ` srimugunthan dhandapani
  2 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2011-07-28  4:39 UTC
  To: Christian Brunner; +Cc: ceph-devel

On Wed, 27 Jul 2011, Christian Brunner wrote:
> We are having quite some problems with the underlying filesystem for
> the cosd's and I would like to hear about other experiences. Here is
> what we have gone through so far:
> 
> btrfs with 2.6.38:
> 
> - good performance
> - frequently hitting of various BUG_ON conditions
> 
> btrfs with 2.6.39:
> 
> - big performance problems after a few days uptime
> - occasionally hitting BUG_ON conditions
> 
> btrfs with 3.0:
> 
> - big performance problems after a few days uptime
> - occasionally hitting a deadlock in the btrfs filesystem (cosd is in D-state)

You might try 3.0 + the latest stuff Chris just sent to Linus (at least 
for the D-state problem).  I'm eager to see whether the latencytop info 
you sent is helpful for the rest.

FWIW we're running 2.6.38+ at the moment.  We get some warnings with the 
orphan code (fixed in .39), but it's otherwise been reasonably stable.  

sage


* Re: Filesystems for ceph
  2011-07-28  4:39 ` Sage Weil
@ 2011-07-28  7:35   ` srimugunthan dhandapani
  2011-07-28 16:00     ` Gregory Farnum
  0 siblings, 1 reply; 10+ messages in thread
From: srimugunthan dhandapani @ 2011-07-28  7:35 UTC
  To: Sage Weil; +Cc: Christian Brunner, ceph-devel

On Thu, Jul 28, 2011 at 10:09 AM, Sage Weil <sage@newdream.net> wrote:
> On Wed, 27 Jul 2011, Christian Brunner wrote:
>> We are having quite some problems with the underlying filesystem for
>> the cosd's and I would like to hear about other experiences. Here is
>> what we have gone through so far:
>>
>> btrfs with 2.6.38:
>>
>> - good performance
>> - frequently hitting of various BUG_ON conditions
>>
>> btrfs with 2.6.39:
>>
>> - big performance problems after a few days uptime
>> - occasionally hitting BUG_ON conditions
>>
>> btrfs with 3.0:
>>
>> - big performance problems after a few days uptime
>> - occasionally hitting a deadlock in the btrfs filesystem (cosd is in D-state)
>
> You might try 3.0 + the latest stuff Chris just sent to Linus (at least
> for the D-state problem).  I'm eager to see whether the latencytop info
> you sent is helpful for the rest.
>
> FWIW we're running 2.6.38+ at the moment.  We get some warnings with the
> orphan code (fixed in .39), but it's otherwise been reasonably stable.

I thought that ext3/4 and btrfs were the only recommended underlying
filesystems for ceph. I didn't know we could use xfs.
With Glusterfs, theoretically, we can use any filesystem with extended
attribute support as the backend filesystem. Is that the case with
ceph too?
Though it may not be optimal, can I theoretically use a "filesystem X"
with extended attribute support for ceph?
thanks,
Mugunthan

* Re: Filesystems for ceph
  2011-07-28  7:35   ` srimugunthan dhandapani
@ 2011-07-28 16:00     ` Gregory Farnum
  0 siblings, 0 replies; 10+ messages in thread
From: Gregory Farnum @ 2011-07-28 16:00 UTC
  To: srimugunthan dhandapani; +Cc: ceph-devel

On Thu, Jul 28, 2011 at 12:35 AM, srimugunthan dhandapani
<srimugunthan.dhandapani@gmail.com> wrote:
> I thought that ext3/4 and btrfs are the only recommended underlying
> filesystem for ceph. I didnt know we can use xfs.
> With Glusterfs, theoretically, we can use any filesystem with extended
> attributes support as backend filesystem. Is that the same case with
> ceph too?
> Though it may not be optimal, theoretically can i use a "filesystem X"
> with extended attributes support for ceph?
> thanks,
> Mugunthan

These days you can use any filesystem with xattr support as the
backing store for Ceph, yes. btrfs is our primary target for a few
reasons (unlimited xattrs, built-in snapshots make snapshots and
consistency easier, etc) but you can stick whatever you like back
there and Ceph will handle it with writeahead journaling, manual
copying for snapshots, etc etc. :)
-Greg
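
Since xattr support is the stated requirement for a backing filesystem, a
quick probe can tell whether a given mount point accepts user extended
attributes at all. The Python sketch below is one way to check this on
Linux. The function name and the attribute name "user.probe" are
hypothetical, and passing this check says nothing about xattr size limits
or about how well Ceph runs on that filesystem.

```python
import os
import tempfile


def supports_user_xattrs(directory: str) -> bool:
    """Return True if files under `directory` accept user.* xattrs."""
    if not hasattr(os, "setxattr"):
        return False  # os.setxattr only exists on Linux
    # The temporary file is removed automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        try:
            os.setxattr(tmp.name, "user.probe", b"1")
            return os.getxattr(tmp.name, "user.probe") == b"1"
        except OSError:
            # Typically "operation not supported" when the filesystem or
            # its mount options do not allow user xattrs.
            return False


if __name__ == "__main__":
    print(supports_user_xattrs("."))
```

Run it with the working directory set to the intended OSD data directory;
on some filesystems and kernels of this era, user xattrs also have to be
enabled with a mount option.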


* Re: Filesystems for ceph
  2011-07-28  4:35       ` Sage Weil
@ 2011-07-28 18:27         ` Fyodor Ustinov
  0 siblings, 0 replies; 10+ messages in thread
From: Fyodor Ustinov @ 2011-07-28 18:27 UTC
  To: Sage Weil; +Cc: ceph-devel

On 07/28/2011 07:35 AM, Sage Weil wrote:
> Oh, you mean the fsck errors?  Got it.  If you see it again (on another
> disk/machine, with a reasonably fresh fs) the ext4 guys will be interested
> in hearing about it.  It's probably in the xattr code, which ceph uses
> heavily but most people only set once per file (for selinux labels), if
> at all.
I now see it on _all_ 12 osds (8 physical servers, each osd has its own fs).

I plan to upgrade the kernel on the osd servers to 3.0, and if things do not
get better, then it is time to shake the guys from the ext4 team (with your help).

WBR,
     Fyodor.

Thread overview: 10+ messages
2011-07-27 17:52 Filesystems for ceph Christian Brunner
2011-07-27 22:21 ` Fyodor Ustinov
2011-07-27 23:33   ` Sage Weil
2011-07-28  4:23     ` Fyodor Ustinov
2011-07-28  4:35       ` Sage Weil
2011-07-28 18:27         ` Fyodor Ustinov
2011-07-28  0:04 ` Gregory Farnum
2011-07-28  4:39 ` Sage Weil
2011-07-28  7:35   ` srimugunthan dhandapani
2011-07-28 16:00     ` Gregory Farnum
