* cephfs: status of directory fragmentation and multiple filesystems
@ 2016-06-29 17:47 Radoslaw Zarzynski
From: Radoslaw Zarzynski @ 2016-06-29 17:47 UTC (permalink / raw)
  To: Ceph Development

Hello,

I saw the recent question about having more than one active
MDS in a cluster. I would like to ask similar questions, but
about 1) directory fragmentation and 2) running multiple
filesystems within the same cluster.

When are those features expected to be production-ready?
What do we need to do to reach that status?

Best regards,
Radoslaw Zarzynski


* Re: cephfs: status of directory fragmentation and multiple filesystems
From: Gregory Farnum @ 2016-06-29 17:53 UTC (permalink / raw)
  To: Radoslaw Zarzynski; +Cc: Ceph Development

On Wed, Jun 29, 2016 at 10:47 AM, Radoslaw Zarzynski
<rzarzynski@mirantis.com> wrote:
> Hello,
>
> I saw the recent question about having more than one active
> MDS in a cluster. I would like to ask similar questions, but
> about 1) directory fragmentation and 2) running multiple
> filesystems within the same cluster.
>
> When are those features expected to be production-ready?
> What do we need to do to reach that status?

For both of these it's mostly about the QA work: making sure they're
well-tested in the nightlies, and demonstrating that the functionality
isn't broken. We're expecting dirfrags to be enabled in Kraken; I
don't know if there's a target timeline around multi-fs.
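
If you want to experiment with dirfrags before then, the rough shape is
something like the following -- I'm going from memory on the exact
option and filesystem names, so treat them as assumptions and check the
docs for your release:

  # mark the filesystem as allowing directory fragmentation
  # (experimental switch on pre-Kraken releases; name from memory)
  ceph fs set cephfs allow_dirfrags true

  # and let the MDS actually split/merge fragments, in ceph.conf:
  [mds]
      mds bal frag = true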

There are available tickets in the tracker that basically come down to
"demonstrate dirfrags actually get exercised in the nightlies", etc., if
you're interested in contributing!
-Greg


* Re: cephfs: status of directory fragmentation and multiple filesystems
From: Xiaoxi Chen @ 2016-07-01  2:17 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Radoslaw Zarzynski, Ceph Development

Greg,
    Would you mind sharing your insight on the potential issues with
multiple FS?  It looks to me like we can separate every component of
the cluster -- i.e. each FS gets its own data pool and metadata pool,
and the pools can even be mapped to different OSDs. Separate MDS nodes
so they don't compete for memory.  The only shared component is the
monitor, so it seems simpler than dirfrags and very likely to work?
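
Concretely, something like this is what I have in mind (just a sketch --
I haven't verified every command against Jewel, and the pool names and
PG counts are made up):

  # allow more than one filesystem in the cluster
  ceph fs flag set enable_multiple true --yes-i-really-mean-it

  # dedicated pools for a second filesystem
  ceph osd pool create fs2_metadata 64
  ceph osd pool create fs2_data 256
  ceph fs new fs2 fs2_metadata fs2_data

plus its own MDS daemons assigned to fs2, and CRUSH rules mapping the
fs2_* pools to a separate set of OSDs if full isolation is wanted.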

     A single MDS can only drive 1~2 CPU cores and provides less
than 2000 TPS for most operations (tested with mds_log = false,
which is the upper bound on performance; ops include rename, utime and
open).  So multi-fs is something of a *must have* for anyone who wants
to provide an FS service.

Xiaoxi

2016-06-30 1:53 GMT+08:00 Gregory Farnum <gfarnum@redhat.com>:
> On Wed, Jun 29, 2016 at 10:47 AM, Radoslaw Zarzynski
> <rzarzynski@mirantis.com> wrote:
>> Hello,
>>
>> I saw the recent question about having more than one active
>> MDS in a cluster. I would like to ask similar questions, but
>> about 1) directory fragmentation and 2) running multiple
>> filesystems within the same cluster.
>>
>> When are those features expected to be production-ready?
>> What do we need to do to reach that status?
>
> For both of these it's mostly about the QA work: making sure they're
> well-tested in the nightlies, and demonstrating that the functionality
> isn't broken. We're expecting dirfrags to be enabled in Kraken; I
> don't know if there's a target timeline around multi-fs.
>
> There are available tickets in the tracker that basically come down to
> "demonstrate dirfrags actually get exercised in the nightlies", etc., if
> you're interested in contributing!
> -Greg


* Re: cephfs: status of directory fragmentation and multiple filesystems
From: Gregory Farnum @ 2016-07-01  3:19 UTC (permalink / raw)
  To: Xiaoxi Chen; +Cc: Radoslaw Zarzynski, Ceph Development

On Thu, Jun 30, 2016 at 7:17 PM, Xiaoxi Chen <superdebuger@gmail.com> wrote:
> Greg,
>     Would you mind sharing your insight on the potential issues with
> multiple FS?  It looks to me like we can separate every component of
> the cluster -- i.e. each FS gets its own data pool and metadata pool,
> and the pools can even be mapped to different OSDs. Separate MDS nodes
> so they don't compete for memory.  The only shared component is the
> monitor, so it seems simpler than dirfrags and very likely to work?

I actually expect it does work. John may have more thoughts since he
wrote it, but the code merged pretty late in the Jewel cycle and there
are a few things to think about:
1) comparative lack of testing, due to newness of code,
2) possibility that some interfaces may change,
3) lack of multi-fs awareness in many of the fsck systems.

That last point comes up especially if, for instance, you want to give each
tenant their own FS. Providing them an MDS daemon may be feasible, but
giving each user their own pool probably isn't. We'll be letting you (or
already let you? I forget) put each FS in a different RADOS namespace
rather than in separate pools, but none of the fsck tools are prepared to
see multiple inodes with the same number distinguished only by
namespace.
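
To make the namespace part concrete, the file layout vxattrs give the
flavor of what that looks like from the CephFS side -- a sketch only,
with the vxattr name from memory and a made-up path:

  # put everything under this directory into its own RADOS namespace
  # inside the shared data pool, instead of a dedicated pool
  setfattr -n ceph.dir.layout.pool_namespace -v tenant1 /mnt/cephfs/volumes/tenant1

It's that kind of namespace separation the fsck/scrub tooling isn't yet
prepared to reason about when inode numbers collide across filesystems.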
-Greg

>
>      A single MDS can only drive 1~2 CPU cores and provides less
> than 2000 TPS for most operations (tested with mds_log = false,
> which is the upper bound on performance; ops include rename, utime and
> open).  So multi-fs is something of a *must have* for anyone who wants
> to provide an FS service.
>
> Xiaoxi
>
> 2016-06-30 1:53 GMT+08:00 Gregory Farnum <gfarnum@redhat.com>:
>> On Wed, Jun 29, 2016 at 10:47 AM, Radoslaw Zarzynski
>> <rzarzynski@mirantis.com> wrote:
>>> Hello,
>>>
>>> I saw the recent question about having more than one active
>>> MDS in a cluster. I would like to ask similar questions, but
>>> about 1) directory fragmentation and 2) running multiple
>>> filesystems within the same cluster.
>>>
>>> When are those features expected to be production-ready?
>>> What do we need to do to reach that status?
>>
>> For both of these it's mostly about the QA work: making sure they're
>> well-tested in the nightlies, and demonstrating that the functionality
>> isn't broken. We're expecting dirfrags to be enabled in Kraken; I
>> don't know if there's a target timeline around multi-fs.
>>
>> There are available tickets in the tracker that basically come down to
>> "demonstrate dirfrags actually get exercised in the nightlies", etc., if
>> you're interested in contributing!
>> -Greg


* Re: cephfs: status of directory fragmentation and multiple filesystems
From: John Spray @ 2016-07-01 11:13 UTC (permalink / raw)
  To: Xiaoxi Chen; +Cc: Gregory Farnum, Radoslaw Zarzynski, Ceph Development

On Fri, Jul 1, 2016 at 3:17 AM, Xiaoxi Chen <superdebuger@gmail.com> wrote:
> Greg,
>     Would you mind sharing your insight on the potential issues with
> multiple FS?  It looks to me like we can separate every component of
> the cluster -- i.e. each FS gets its own data pool and metadata pool,
> and the pools can even be mapped to different OSDs. Separate MDS nodes
> so they don't compete for memory.  The only shared component is the
> monitor, so it seems simpler than dirfrags and very likely to work?
>
>      A single MDS can only drive 1~2 CPU cores and provides less
> than 2000 TPS for most operations (tested with mds_log = false,
> which is the upper bound on performance; ops include rename, utime and
> open).  So multi-fs is something of a *must have* for anyone who wants
> to provide an FS service.

One of the reasons for the multi-fs capability was exactly cases like Manila ;-)

The multi-filesystem support is indeed simpler than the other
experimental features, although that's not to say it's bug-free.  For
example, in the 10.2.0 release we had a bug where, if you were mounting
a non-default filesystem, the clients would fail to pick up on MDS
failover: http://tracker.ceph.com/issues/16022
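
For reference, mounting a non-default filesystem looks roughly like this
(option names from memory for the Jewel-era clients, so double-check;
newer releases may spell them differently, and the fs name/paths here are
placeholders):

  # ceph-fuse
  ceph-fuse --client_mds_namespace=fs2 /mnt/fs2

  # kernel client, on kernels new enough to support it
  mount -t ceph mon-host:/ /mnt/fs2 -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=fs2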

The main thing I'd say about the multi-fs support is that if it seems
like it's working for you, it's probably going to keep working (unlike
some other things which might work fine for weeks and then one day hit
an issue almost at random).

We are currently a bit behind on backporting fixes to jewel, so if
you're developing against things like multi-fs, I would suggest that
you develop your integration with master, and then make sure the right
bits have made it into a jewel release before deploying to production.

John

> Xiaoxi
>
> 2016-06-30 1:53 GMT+08:00 Gregory Farnum <gfarnum@redhat.com>:
>> On Wed, Jun 29, 2016 at 10:47 AM, Radoslaw Zarzynski
>> <rzarzynski@mirantis.com> wrote:
>>> Hello,
>>>
>>> I saw the recent question about having more than one active
>>> MDS in a cluster. I would like to ask similar questions, but
>>> about 1) directory fragmentation and 2) running multiple
>>> filesystems within the same cluster.
>>>
>>> When are those features expected to be production-ready?
>>> What do we need to do to reach that status?
>>
>> For both of these it's mostly about the QA work: making sure they're
>> well-tested in the nightlies, and demonstrating that the functionality
>> isn't broken. We're expecting dirfrags to be enabled in Kraken; I
>> don't know if there's a target timeline around multi-fs.
>>
>> There are available tickets in the tracker that basically come down to
>> "demonstrate dirfrags actually get exercised in the nightlies", etc., if
>> you're interested in contributing!
>> -Greg

