* Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
       [not found] <CAOu2ZZUT8hqUmatSWTX7RxzhS1oo_5R0NmZ49+k0D91QPHTOjQ@mail.gmail.com>
@ 2016-05-17 12:42 ` Chandan Kumar Singh
  2016-05-17 13:52   ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Chandan Kumar Singh @ 2016-05-17 12:42 UTC
  To: ceph-devel

Hi

While migrating to EC pools, I learned that they do not support omaps
but do allow thousands of xattrs (XFS). Are these xattrs stored in a
key-value store or in the XFS file system?

Regards
Chandan


* Re: Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
  2016-05-17 12:42 ` Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps? Chandan Kumar Singh
@ 2016-05-17 13:52   ` Sage Weil
  2016-05-17 13:58     ` Chandan Kumar Singh
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-05-17 13:52 UTC
  To: Chandan Kumar Singh; +Cc: ceph-devel

On Tue, 17 May 2016, Chandan Kumar Singh wrote:
> Hi
> 
> While migrating to EC pools, I came to know that it does not support
> omaps but it allows thousands of xattrs (XFS). Are these xattrs being
> stored in a key-value store or in XFS file system?

They are stored in XFS, until there are more than a handful, after which 
point they get stored in leveldb.  But they are *also* stored in every pg 
log event that modifies the object, so you should definitely not (ab)use 
xattrs the way you would use omap and expect the system to behave/perform!
They are meant to be small and few.

sage
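
A minimal sketch of the behaviour described above, as a toy model rather
than Ceph code. INLINE_LIMIT is a hypothetical stand-in for "a handful"
(the thread does not give the real cutoff), and ToyObject, set_xattr and
log_bytes are illustrative names. The log model is one simplified
reading: each modifying operation records the xattr it touched.

```python
# Toy model only: not Ceph code. INLINE_LIMIT is a hypothetical threshold
# standing in for "a handful"; the real cutoff is not stated in the thread.
INLINE_LIMIT = 4


class ToyObject:
    def __init__(self):
        self.inline_xattrs = {}  # stands in for XFS extended attributes
        self.kv_xattrs = {}      # stands in for the leveldb spill-over
        self.pg_log = []         # one entry per modifying operation

    def set_xattr(self, name, value):
        already_inline = name in self.inline_xattrs
        has_room = (name not in self.kv_xattrs
                    and len(self.inline_xattrs) < INLINE_LIMIT)
        if already_inline or has_room:
            self.inline_xattrs[name] = value
        else:
            self.kv_xattrs[name] = value
        # Assumed simplification: the log entry carries the xattr as well.
        self.pg_log.append({"op": "setxattr", "name": name, "value": value})

    def log_bytes(self):
        """Total xattr payload carried by the log entries."""
        return sum(len(e["name"]) + len(e["value"]) for e in self.pg_log)


obj = ToyObject()
for i in range(1000):
    obj.set_xattr(f"user.k{i}", b"x" * 100)

print(len(obj.inline_xattrs), len(obj.kv_xattrs), obj.log_bytes())
```

In this model only the first few xattrs stay inline, the rest spill to
the key-value store, and the log grows with every xattr write. That log
growth is the cost that makes xattrs a poor substitute for omap.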


* Re: Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
  2016-05-17 13:52   ` Sage Weil
@ 2016-05-17 13:58     ` Chandan Kumar Singh
  2016-05-17 20:41       ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Chandan Kumar Singh @ 2016-05-17 13:58 UTC
  To: Sage Weil; +Cc: ceph-devel

Thanks. Why are omaps not allowed for objects in EC pools?

On Tue, May 17, 2016 at 7:22 PM, Sage Weil <sage@newdream.net> wrote:
> On Tue, 17 May 2016, Chandan Kumar Singh wrote:
>> Hi
>>
>> While migrating to EC pools, I came to know that it does not support
>> omaps but it allows thousands of xattrs (XFS). Are these xattrs being
>> stored in a key-value store or in XFS file system?
>
> They are stored in XFS, until there are more than a handful, after which
> point they get stored in leveldb.  But they are *also* stored in every pg
> log event that modifies the object, so you should definitely not (ab)use
> xattrs the way you would use omap and expect the system to behave/perform!
> They are meant to be small and few.
>
> sage


* Re: Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
  2016-05-17 13:58     ` Chandan Kumar Singh
@ 2016-05-17 20:41       ` Sage Weil
  2016-05-18  3:12         ` Chandan Kumar Singh
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-05-17 20:41 UTC
  To: Chandan Kumar Singh; +Cc: ceph-devel

On Tue, 17 May 2016, Chandan Kumar Singh wrote:
> Thanks. Why are omaps not allowed for objects in EC pools?

Because it doesn't make much sense to erasure code small key=value pairs 
over lots of nodes.  Values are too small to be individually encoded 
sensibly, and packing them together would require a layer of complexity.  
It could presumably be done, but we didn't do it, and have yet to hear 
from someone who really needs it.

sage
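
A rough worked example of why encoding tiny values individually is
unattractive. All numbers are hypothetical: PER_SHARD_OVERHEAD is an
assumed fixed bookkeeping cost per stored piece, and k=4, m=2 and three
replicas are example parameters, not values from the thread.

```python
import math

# Hypothetical numbers for illustration; none come from the thread.
PER_SHARD_OVERHEAD = 32  # assumed fixed bookkeeping bytes per stored piece


def ec_bytes(value_len, k, m):
    """Bytes stored if one value is split into k data + m coding chunks."""
    chunk = math.ceil(value_len / k)
    return (k + m) * (chunk + PER_SHARD_OVERHEAD)


def replicated_bytes(value_len, copies):
    """Bytes stored if the whole value is copied to each replica."""
    return copies * (value_len + PER_SHARD_OVERHEAD)


small, large = 20, 4 * 1024 * 1024
for size in (small, large):
    print(size, ec_bytes(size, k=4, m=2), replicated_bytes(size, copies=3))
```

Under these assumptions the 20-byte value costs more when erasure coded
than when replicated, because the fixed per-piece cost is paid on six
shards instead of three. The 4 MiB value shows the usual space saving.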





* Re: Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
  2016-05-17 20:41       ` Sage Weil
@ 2016-05-18  3:12         ` Chandan Kumar Singh
  2016-05-18  6:24           ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Chandan Kumar Singh @ 2016-05-18  3:12 UTC
  To: Sage Weil; +Cc: ceph-devel

I can understand the complexities now. Still, it is a little
surprising that a large number of xattrs can be stored in leveldb
but omaps cannot. Are these xattrs not erasure coded and
distributed over nodes? Since EC pools are a space-efficient
alternative to replicated pools, not having omaps defeats the purpose
for anyone who uses omaps extensively. I guess some users store
key-value metadata in an external store.

On Wed, May 18, 2016 at 2:11 AM, Sage Weil <sage@newdream.net> wrote:
> On Tue, 17 May 2016, Chandan Kumar Singh wrote:
>> Thanks. Why are omaps not allowed for objects in EC pools?
>
> Because it doesn't make much sense to erasure code small key=value pairs
> over lots of nodes.  Values are too small to be individually encoded
> sensibly, and packing them together would require a layer of complexity.
> It could presumably be done, but we didn't do it, and have yet to hear
> from someone who really needs it.
>
> sage


* Re: Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
  2016-05-18  3:12         ` Chandan Kumar Singh
@ 2016-05-18  6:24           ` Sage Weil
  2016-05-18  9:19             ` Chandan Kumar Singh
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-05-18  6:24 UTC
  To: Chandan Kumar Singh; +Cc: ceph-devel

On Wed, 18 May 2016, Chandan Kumar Singh wrote:
> I can understand the complexities now. Still, there is a little bit of
> surprise element when large number of xattrs can be stored in leveldb
> but not  omaps. Are these xattrs not being erasure coded and
> distributed over nodes? When EC pools are space efficient alternative
> to replicated pools, not having omaps defeats the purpose for anyone
> who uses omaps extensively. I can guess that some users might be
> storing the key-value kind of metadata in some external store.

That's exactly the issue: xattrs are replicated across all OSDs in the PG 
(and also appear in pg log entries).  We didn't implement a way to 
erasure code key/value data (and it's not obvious how one should do so).

For now, heavy omap users should just stick to replication.

sage
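
The point above can be put in numbers with a small sketch, assuming an
EC profile of k=4 data and m=2 coding shards and a hypothetical object
with 4 MiB of data and 1000 xattrs of 100 bytes each. The function name
ec_object_footprint is illustrative, and the model ignores per-shard
metadata and pg log copies.

```python
import math


def ec_object_footprint(data_len, xattr_len, k, m):
    """Toy estimate: data is striped over k+m shards, xattrs copied to each."""
    data_stored = (k + m) * math.ceil(data_len / k)
    xattr_stored = (k + m) * xattr_len
    return data_stored, xattr_stored


# Hypothetical object: 4 MiB of data, 1000 xattrs of 100 bytes, k=4, m=2.
DATA_LEN = 4 * 1024 * 1024
XATTR_LEN = 1000 * 100
data_stored, xattr_stored = ec_object_footprint(DATA_LEN, XATTR_LEN, k=4, m=2)

print(data_stored / DATA_LEN)     # data overhead factor, (k + m) / k
print(xattr_stored / XATTR_LEN)   # xattr overhead factor, k + m
```

With these parameters the data costs 1.5 times its size while the
xattrs cost 6 times theirs, so key/value metadata kept in xattrs gets
none of the space efficiency of the EC pool.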




* Re: Fwd: How does EC pools support thousands of xattrs (XFS) but no omaps?
  2016-05-18  6:24           ` Sage Weil
@ 2016-05-18  9:19             ` Chandan Kumar Singh
  0 siblings, 0 replies; 7+ messages in thread
From: Chandan Kumar Singh @ 2016-05-18  9:19 UTC
  To: Sage Weil; +Cc: ceph-devel

Thank You.

On Wed, May 18, 2016 at 11:54 AM, Sage Weil <sage@newdream.net> wrote:
> On Wed, 18 May 2016, Chandan Kumar Singh wrote:
>> I can understand the complexities now. Still, there is a little bit of
>> surprise element when large number of xattrs can be stored in leveldb
>> but not  omaps. Are these xattrs not being erasure coded and
>> distributed over nodes? When EC pools are space efficient alternative
>> to replicated pools, not having omaps defeats the purpose for anyone
>> who uses omaps extensively. I can guess that some users might be
>> storing the key-value kind of metadata in some external store.
>
> That's exactly the issue: xattrs are replicated across all OSDs in the PG
> (and also appear in pg log entries).  We didn't implement a way to
> erasure code key/value data (and it's not obvious how one should do so).
>
> For now, heavy omap users should just stick to replication.
>
> sage

