All of lore.kernel.org
 help / color / mirror / Atom feed
* per-pool bluestore options
@ 2016-08-13 18:56 Sage Weil
  2016-08-15 13:48 ` Igor Fedotov
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-08-13 18:56 UTC (permalink / raw)
  To: ifedotov; +Cc: ceph-devel

Hi Igor,

I took another look at

	https://github.com/ceph/ceph/pull/10556

You define three settings:

 compress_hint - determines if pool contains compressible / incompressible data
 compress_algorithm - permits to specify different compression algorithm
 compress_ratio - specifies maximum compression ratio

I think we should extend this to include csum-related options.  And use a 
consistent naming scheme that aligns with the config options where we just 
strip off the bluestore_ prefix.  The relevant options are:

 bluestore_csum = {true, false}
 bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
 bluestore_csum_min_chunk_size = 4k      (*)
 bluestore_csum_max_chunk_size = 64k     (*)

 bluestore_compression = {force, aggressive, passive, none}
 bluestore_compression_algorithm = {snappy, zlib, ...}
 bluestore_compression_min_blob_size = 256k
 bluestore_compression_max_blob_size = 4M
 bluestore_compression_required_ratio = .875

(*) These currently have a different name but aren't used yet.  Working on 
a PR to change that.

What's missing is your 'compress_hint'.  We can call that 
'compression_hint' to align with the names above?

 compression_hint = {compressible, incompressible, ...}

The main changes from your PR that I think we need to make are:

* These options should be part of the pool_opts_t structure in pg_pool_t 
(which is a set of optional key/value-like parameters for the pool).

* We can add a new ObjectStore operation that passes down parameters for a 
collection, and have the OSD pass these all in for each PG collection when 
the pool properties change.  That way ObjectStore doesn't need to persist 
these options at all--just store the ones it understands in memory, and 
the OSD will always reset them on startup etc.

What do you think?

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: per-pool bluestore options
  2016-08-13 18:56 per-pool bluestore options Sage Weil
@ 2016-08-15 13:48 ` Igor Fedotov
  2016-08-17 16:27   ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Igor Fedotov @ 2016-08-15 13:48 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Sage,

see inline.


On 13.08.2016 21:56, Sage Weil wrote:
> Hi Igor,
>
> I took another look at
>
> 	https://github.com/ceph/ceph/pull/10556
>
> You define three settings:
>
>   compress_hint - determines if pool contains compressible / incompressible data
>   compress_algorithm - permits to specify different compression algorithm
>   compress_ratio - specifies maximum compression ratio
>
> I think we should extend this to include csum-related options.  And use a
> consistent naming scheme that aligns with the config options where we just
> strip off the bluestore_ prefix.  The relevant options are:
Sounds good!
The major questions here is how do these per-pool options correlate with 
corresponding per-store ones?
One might suggest per-pool options to have higher priority over 
per-store one. But I'm not sure that the best option.  Sometimes user 
might want to disable/alter corresponding option without enumerating all 
the pools by simple switching at per-store level. Hence we need to 
consider some means for that.
>   bluestore_csum = {true, false}
>   bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
>   bluestore_csum_min_chunk_size = 4k      (*)
>   bluestore_csum_max_chunk_size = 64k     (*)
>   bluestore_compression = {force, aggressive, passive, none}
Actually this option along with compression_hint result in a single 
flag: compress or not. Any rationale for not using that simple flag?

My original idea here was to have bluestore_compression on per-store 
basis and compression_hint of per pool one. This way one can receive 
pretty flexible control at both storage and pool level - see my question 
above.

>   bluestore_compression_algorithm = {snappy, zlib, ...}
>   bluestore_compression_min_blob_size = 256k
>   bluestore_compression_max_blob_size = 4M
>   bluestore_compression_required_ratio = .875
>
> (*) These currently have a different name but aren't used yet.  Working on
> a PR to change that.
>
> What's missing is your 'compress_hint'.  We can call that
> 'compression_hint' to align with the names above?
>
>   compression_hint = {compressible, incompressible, ...}
>
> The main changes from your PR that I think we need to make are:
>
> * These options should be part of the pool_opts_t structure in pg_pool_t
> (which is a set of optional key/value-like parameters for the pool).
>
> * We can add a new ObjectStore operation that passes down parameters for a
> collection, and have the OSD pass these all in for each PG collection when
> the pool properties change.  That way ObjectStore doesn't need to persist
> these options at all--just store the ones it understands in memory, and
> the OSD will always reset them on startup etc.
One, probably silly, question here - do pool and collection have 1 to 1 
relation? It seemed to me that they don't and hence we can't store 
per-pool settings at collection level without some additional mapping: 
pool -> setting. Also this requires some means to remove pool settings 
entry when pool goes away...

And another question - are there any means to receive notification when 
pool setting changes. Similar to the one we have for OSD settings. We 
need that to trigger that new Object Store operation to update pool 
settings.
>
> What do you think?
>
> sage


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: per-pool bluestore options
  2016-08-15 13:48 ` Igor Fedotov
@ 2016-08-17 16:27   ` Sage Weil
  2016-08-18 13:18     ` Igor Fedotov
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-08-17 16:27 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: ceph-devel

On Mon, 15 Aug 2016, Igor Fedotov wrote:
> On 13.08.2016 21:56, Sage Weil wrote:
> > Hi Igor,
> > 
> > I took another look at
> > 
> > 	https://github.com/ceph/ceph/pull/10556
> > 
> > You define three settings:
> > 
> >   compress_hint - determines if pool contains compressible / incompressible
> > data
> >   compress_algorithm - permits to specify different compression algorithm
> >   compress_ratio - specifies maximum compression ratio
> > 
> > I think we should extend this to include csum-related options.  And use a
> > consistent naming scheme that aligns with the config options where we just
> > strip off the bluestore_ prefix.  The relevant options are:
> Sounds good!
> The major questions here is how do these per-pool options correlate with
> corresponding per-store ones?
> One might suggest per-pool options to have higher priority over per-store one.
> But I'm not sure that the best option.  Sometimes user might want to
> disable/alter corresponding option without enumerating all the pools by simple
> switching at per-store level. Hence we need to consider some means for that.


> >   bluestore_csum = {true, false}
> >   bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
> >   bluestore_csum_min_chunk_size = 4k      (*)
> >   bluestore_csum_max_chunk_size = 64k     (*)
> >   bluestore_compression = {force, aggressive, passive, none}
> Actually this option along with compression_hint result in a single flag:
> compress or not. Any rationale for not using that simple flag?

There are currently two persistent hints: compressible and incompressible.  
Aggressive will compress unless incompressible, whereas passive will 
compress if compressible.  Either way it varies per object, though, so a 
single flag isn't sufficient.

> My original idea here was to have bluestore_compression on per-store basis and
> compression_hint of per pool one. This way one can receive pretty flexible
> control at both storage and pool level - see my question above.

Yeah, I'm not sure how complex it's worth getting.  To get complete 
control, we probably need a default *and* the min/max allowed range for 
each option in order to bound what you can choose per-pool, but I think 
that is probably overkill.

A bit easier would be to have

 bluestore_csum_override = {,true,false}
 bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
 bluestore_compression_override = {, force, aggressive, passive, none}

and have them blank by default (no override).  For the numeric options 
we'd need to use 0 to mean 'no override'.

What do you think?

> >   bluestore_compression_algorithm = {snappy, zlib, ...}
> >   bluestore_compression_min_blob_size = 256k
> >   bluestore_compression_max_blob_size = 4M
> >   bluestore_compression_required_ratio = .875
> > 
> > (*) These currently have a different name but aren't used yet.  Working on
> > a PR to change that.
> > 
> > What's missing is your 'compress_hint'.  We can call that
> > 'compression_hint' to align with the names above?
> > 
> >   compression_hint = {compressible, incompressible, ...}
> > 
> > The main changes from your PR that I think we need to make are:
> > 
> > * These options should be part of the pool_opts_t structure in pg_pool_t
> > (which is a set of optional key/value-like parameters for the pool).
> > 
> > * We can add a new ObjectStore operation that passes down parameters for a
> > collection, and have the OSD pass these all in for each PG collection when
> > the pool properties change.  That way ObjectStore doesn't need to persist
> > these options at all--just store the ones it understands in memory, and
> > the OSD will always reset them on startup etc.
> One, probably silly, question here - do pool and collection have 1 to 1
> relation? It seemed to me that they don't and hence we can't store per-pool
> settings at collection level without some additional mapping: pool -> setting.
> Also this requires some means to remove pool settings entry when pool goes
> away...

It's 1:1 mapping of PG to collection, but lots of PGs per pool.  So when 
the OSD gets a pool change, it'll set the same flags for all local PGs in 
that pool.  A bit of duplication, but it keeps the interface 
uncomplicated (no need to teach ObjectStore about pools).

> And another question - are there any means to receive notification when pool
> setting changes. Similar to the one we have for OSD settings. We need that to
> trigger that new Object Store operation to update pool settings.

There is already some machinery to do this (the pg_pool_t struct has an 
epoch field).  See void PGPool::update(OSDMapRef map) in PG.cc.

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: per-pool bluestore options
  2016-08-17 16:27   ` Sage Weil
@ 2016-08-18 13:18     ` Igor Fedotov
  2016-08-18 15:26       ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Igor Fedotov @ 2016-08-18 13:18 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel


On 17.08.2016 19:27, Sage Weil wrote:
> On Mon, 15 Aug 2016, Igor Fedotov wrote:
>> On 13.08.2016 21:56, Sage Weil wrote:
>>> Hi Igor,
>>>
>>> I took another look at
>>>
>>> 	https://github.com/ceph/ceph/pull/10556
>>>
>>> You define three settings:
>>>
>>>    compress_hint - determines if pool contains compressible / incompressible
>>> data
>>>    compress_algorithm - permits to specify different compression algorithm
>>>    compress_ratio - specifies maximum compression ratio
>>>
>>> I think we should extend this to include csum-related options.  And use a
>>> consistent naming scheme that aligns with the config options where we just
>>> strip off the bluestore_ prefix.  The relevant options are:
>> Sounds good!
>> The major questions here is how do these per-pool options correlate with
>> corresponding per-store ones?
>> One might suggest per-pool options to have higher priority over per-store one.
>> But I'm not sure that the best option.  Sometimes user might want to
>> disable/alter corresponding option without enumerating all the pools by simple
>> switching at per-store level. Hence we need to consider some means for that.
>
>>>    bluestore_csum = {true, false}
>>>    bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
>>>    bluestore_csum_min_chunk_size = 4k      (*)
>>>    bluestore_csum_max_chunk_size = 64k     (*)
>>>    bluestore_compression = {force, aggressive, passive, none}
>> Actually this option along with compression_hint result in a single flag:
>> compress or not. Any rationale for not using that simple flag?
> There are currently two persistent hints: compressible and incompressible.
> Aggressive will compress unless incompressible, whereas passive will
> compress if compressible.  Either way it varies per object, though, so a
> single flag isn't sufficient.
So final approach is to have bluestore_compression switch at pool level 
(and per-store too) and compression_hint at object one, right? Hence we 
don't need compression_hint per pool. In fact my original point was that 
the latter(along with bluestore_compression) is IMHO an overkill and can 
be substituted with simple enable_compression flag.

>> My original idea here was to have bluestore_compression on per-store basis and
>> compression_hint of per pool one. This way one can receive pretty flexible
>> control at both storage and pool level - see my question above.
> Yeah, I'm not sure how complex it's worth getting.  To get complete
> control, we probably need a default *and* the min/max allowed range for
> each option in order to bound what you can choose per-pool, but I think
> that is probably overkill.
>
> A bit easier would be to have
>
>   bluestore_csum_override = {,true,false}
>   bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
>   bluestore_compression_override = {, force, aggressive, passive, none}
>
> and have them blank by default (no override).  For the numeric options
> we'd need to use 0 to mean 'no override'.
>
> What do you think?
IMO per-pool settings (if set) have to override corresponding per-store 
one by default. But one should be able to turn off that behavior with 
'one-click' and force specific per-store setting to prevail.
Hence we don't need 'override' settings at per-pool level but should 
have one at per-store one. E.g.

Pool A has:
bluestore_csum = true
Pool B:
bluestore_csum=<not set>
OSD:
bluestore_csum=  false [/force]

if /'force' is omitted - actual settings are
Pool A:
bluestore_csum=true
Pool B:
bluestore_csum=false

else
Pool A:
bluestore_csum=false
Pool B:
bluestore_csum=false

An additional 'bluestore_force_all_settings=true' can be introduced to 
force all per-store settings to prevail too.
>>>    bluestore_compression_algorithm = {snappy, zlib, ...}
>>>    bluestore_compression_min_blob_size = 256k
>>>    bluestore_compression_max_blob_size = 4M
>>>    bluestore_compression_required_ratio = .875
>>>
>>> (*) These currently have a different name but aren't used yet.  Working on
>>> a PR to change that.
>>>
>>> What's missing is your 'compress_hint'.  We can call that
>>> 'compression_hint' to align with the names above?
>>>
>>>    compression_hint = {compressible, incompressible, ...}
>>>
>>> The main changes from your PR that I think we need to make are:
>>>
>>> * These options should be part of the pool_opts_t structure in pg_pool_t
>>> (which is a set of optional key/value-like parameters for the pool).
>>>
>>> * We can add a new ObjectStore operation that passes down parameters for a
>>> collection, and have the OSD pass these all in for each PG collection when
>>> the pool properties change.  That way ObjectStore doesn't need to persist
>>> these options at all--just store the ones it understands in memory, and
>>> the OSD will always reset them on startup etc.
>> One, probably silly, question here - do pool and collection have 1 to 1
>> relation? It seemed to me that they don't and hence we can't store per-pool
>> settings at collection level without some additional mapping: pool -> setting.
>> Also this requires some means to remove pool settings entry when pool goes
>> away...
> It's 1:1 mapping of PG to collection, but lots of PGs per pool.  So when
> the OSD gets a pool change, it'll set the same flags for all local PGs in
> that pool.  A bit of duplication, but it keeps the interface
> uncomplicated (no need to teach ObjectStore about pools).
And PG belongs to single pool only, right?
>
>> And another question - are there any means to receive notification when pool
>> setting changes. Similar to the one we have for OSD settings. We need that to
>> trigger that new Object Store operation to update pool settings.
> There is already some machinery to do this (the pg_pool_t struct has an
> epoch field).  See void PGPool::update(OSDMapRef map) in PG.cc.
Will do, thanks.
> sage
Thanks,
Igor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: per-pool bluestore options
  2016-08-18 13:18     ` Igor Fedotov
@ 2016-08-18 15:26       ` Sage Weil
  2016-08-18 16:16         ` Igor Fedotov
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2016-08-18 15:26 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: ceph-devel

On Thu, 18 Aug 2016, Igor Fedotov wrote:
> On 17.08.2016 19:27, Sage Weil wrote:
> > On Mon, 15 Aug 2016, Igor Fedotov wrote:
> > > On 13.08.2016 21:56, Sage Weil wrote:
> > > > Hi Igor,
> > > > 
> > > > I took another look at
> > > > 
> > > > 	https://github.com/ceph/ceph/pull/10556
> > > > 
> > > > You define three settings:
> > > > 
> > > >    compress_hint - determines if pool contains compressible /
> > > > incompressible
> > > > data
> > > >    compress_algorithm - permits to specify different compression
> > > > algorithm
> > > >    compress_ratio - specifies maximum compression ratio
> > > > 
> > > > I think we should extend this to include csum-related options.  And use
> > > > a
> > > > consistent naming scheme that aligns with the config options where we
> > > > just
> > > > strip off the bluestore_ prefix.  The relevant options are:
> > > Sounds good!
> > > The major questions here is how do these per-pool options correlate with
> > > corresponding per-store ones?
> > > One might suggest per-pool options to have higher priority over per-store
> > > one.
> > > But I'm not sure that the best option.  Sometimes user might want to
> > > disable/alter corresponding option without enumerating all the pools by
> > > simple
> > > switching at per-store level. Hence we need to consider some means for
> > > that.
> > 
> > > >    bluestore_csum = {true, false}
> > > >    bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
> > > >    bluestore_csum_min_chunk_size = 4k      (*)
> > > >    bluestore_csum_max_chunk_size = 64k     (*)
> > > >    bluestore_compression = {force, aggressive, passive, none}
> > > Actually this option along with compression_hint result in a single flag:
> > > compress or not. Any rationale for not using that simple flag?
> > There are currently two persistent hints: compressible and incompressible.
> > Aggressive will compress unless incompressible, whereas passive will
> > compress if compressible.  Either way it varies per object, though, so a
> > single flag isn't sufficient.
> So final approach is to have bluestore_compression switch at pool level (and
> per-store too) and compression_hint at object one, right? Hence we don't need
> compression_hint per pool. In fact my original point was that the latter(along
> with bluestore_compression) is IMHO an overkill and can be substituted with
> simple enable_compression flag.

I think so.  Having a pool compression_hint = compressible will accomplish 
the same thing as setting the pool compression = aggressive.

> > > My original idea here was to have bluestore_compression on per-store basis
> > > and
> > > compression_hint of per pool one. This way one can receive pretty flexible
> > > control at both storage and pool level - see my question above.
> > Yeah, I'm not sure how complex it's worth getting.  To get complete
> > control, we probably need a default *and* the min/max allowed range for
> > each option in order to bound what you can choose per-pool, but I think
> > that is probably overkill.
> > 
> > A bit easier would be to have
> > 
> >   bluestore_csum_override = {,true,false}
> >   bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
> >   bluestore_compression_override = {, force, aggressive, passive, none}
> > 
> > and have them blank by default (no override).  For the numeric options
> > we'd need to use 0 to mean 'no override'.
> > 
> > What do you think?
> IMO per-pool settings (if set) have to override corresponding per-store 
> one by default. But one should be able to turn off that behavior with 
> 'one-click' and force specific per-store setting to prevail.
> Hence we don't need 'override' settings at per-pool level but should have one
> at per-store one. E.g.
> 
> Pool A has:
> bluestore_csum = true
> Pool B:
> bluestore_csum=<not set>
> OSD:
> bluestore_csum=  false [/force]
> 
> if /'force' is omitted - actual settings are
> Pool A:
> bluestore_csum=true
> Pool B:
> bluestore_csum=false
> 
> else
> Pool A:
> bluestore_csum=false
> Pool B:
> bluestore_csum=false

Hmm, this makes parsing more complicated but is probably less confusing.  
But what about numeric values?  

 bluestore_compression_max_blob_size=1048576/force?

This is partly why I like the (implementation) simplicity of

 bluestore_compression_max_blob_size_force=1048576

(or _override, whatever).

> An additional 'bluestore_force_all_settings=true' can be introduced to force
> all per-store settings to prevail too.

Or maybe

 bluestore_ignore_per_collection_settings=true
or the converse
 bluestore_per_collection_settings=false

to be precise?

> > > >    bluestore_compression_algorithm = {snappy, zlib, ...}
> > > >    bluestore_compression_min_blob_size = 256k
> > > >    bluestore_compression_max_blob_size = 4M
> > > >    bluestore_compression_required_ratio = .875
> > > > 
> > > > (*) These currently have a different name but aren't used yet.  Working
> > > > on
> > > > a PR to change that.
> > > > 
> > > > What's missing is your 'compress_hint'.  We can call that
> > > > 'compression_hint' to align with the names above?
> > > > 
> > > >    compression_hint = {compressible, incompressible, ...}
> > > > 
> > > > The main changes from your PR that I think we need to make are:
> > > > 
> > > > * These options should be part of the pool_opts_t structure in pg_pool_t
> > > > (which is a set of optional key/value-like parameters for the pool).
> > > > 
> > > > * We can add a new ObjectStore operation that passes down parameters for
> > > > a
> > > > collection, and have the OSD pass these all in for each PG collection
> > > > when
> > > > the pool properties change.  That way ObjectStore doesn't need to
> > > > persist
> > > > these options at all--just store the ones it understands in memory, and
> > > > the OSD will always reset them on startup etc.
> > > One, probably silly, question here - do pool and collection have 1 to 1
> > > relation? It seemed to me that they don't and hence we can't store
> > > per-pool
> > > settings at collection level without some additional mapping: pool ->
> > > setting.
> > > Also this requires some means to remove pool settings entry when pool goes
> > > away...
> > It's 1:1 mapping of PG to collection, but lots of PGs per pool.  So when
> > the OSD gets a pool change, it'll set the same flags for all local PGs in
> > that pool.  A bit of duplication, but it keeps the interface
> > uncomplicated (no need to teach ObjectStore about pools).
> And PG belongs to single pool only, right?

Right.

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: per-pool bluestore options
  2016-08-18 15:26       ` Sage Weil
@ 2016-08-18 16:16         ` Igor Fedotov
  2016-08-18 16:18           ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Igor Fedotov @ 2016-08-18 16:16 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel


On 18.08.2016 18:26, Sage Weil wrote:
> On Thu, 18 Aug 2016, Igor Fedotov wrote:
>> On 17.08.2016 19:27, Sage Weil wrote:
>>> On Mon, 15 Aug 2016, Igor Fedotov wrote:
>>>> On 13.08.2016 21:56, Sage Weil wrote:
>>>>> Hi Igor,
>>>>>
>>>>> I took another look at
>>>>>
>>>>> 	https://github.com/ceph/ceph/pull/10556
>>>>>
>>>>> You define three settings:
>>>>>
>>>>>     compress_hint - determines if pool contains compressible /
>>>>> incompressible
>>>>> data
>>>>>     compress_algorithm - permits to specify different compression
>>>>> algorithm
>>>>>     compress_ratio - specifies maximum compression ratio
>>>>>
>>>>> I think we should extend this to include csum-related options.  And use
>>>>> a
>>>>> consistent naming scheme that aligns with the config options where we
>>>>> just
>>>>> strip off the bluestore_ prefix.  The relevant options are:
>>>> Sounds good!
>>>> The major questions here is how do these per-pool options correlate with
>>>> corresponding per-store ones?
>>>> One might suggest per-pool options to have higher priority over per-store
>>>> one.
>>>> But I'm not sure that the best option.  Sometimes user might want to
>>>> disable/alter corresponding option without enumerating all the pools by
>>>> simple
>>>> switching at per-store level. Hence we need to consider some means for
>>>> that.
>>>>>     bluestore_csum = {true, false}
>>>>>     bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
>>>>>     bluestore_csum_min_chunk_size = 4k      (*)
>>>>>     bluestore_csum_max_chunk_size = 64k     (*)
>>>>>     bluestore_compression = {force, aggressive, passive, none}
>>>> Actually this option along with compression_hint result in a single flag:
>>>> compress or not. Any rationale for not using that simple flag?
>>> There are currently two persistent hints: compressible and incompressible.
>>> Aggressive will compress unless incompressible, whereas passive will
>>> compress if compressible.  Either way it varies per object, though, so a
>>> single flag isn't sufficient.
>> So final approach is to have bluestore_compression switch at pool level (and
>> per-store too) and compression_hint at object one, right? Hence we don't need
>> compression_hint per pool. In fact my original point was that the latter(along
>> with bluestore_compression) is IMHO an overkill and can be substituted with
>> simple enable_compression flag.
> I think so.  Having a pool compression_hint = compressible will accomplish
> the same thing as setting the pool compression = aggressive.
OK, Sounds good!
Still there is an open question how to assign such a hint to objects 
from user perspective....
>
>>>> My original idea here was to have bluestore_compression on per-store basis
>>>> and
>>>> compression_hint of per pool one. This way one can receive pretty flexible
>>>> control at both storage and pool level - see my question above.
>>> Yeah, I'm not sure how complex it's worth getting.  To get complete
>>> control, we probably need a default *and* the min/max allowed range for
>>> each option in order to bound what you can choose per-pool, but I think
>>> that is probably overkill.
>>>
>>> A bit easier would be to have
>>>
>>>    bluestore_csum_override = {,true,false}
>>>    bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
>>>    bluestore_compression_override = {, force, aggressive, passive, none}
>>>
>>> and have them blank by default (no override).  For the numeric options
>>> we'd need to use 0 to mean 'no override'.
>>>
>>> What do you think?
>> IMO per-pool settings (if set) have to override corresponding per-store
>> one by default. But one should be able to turn off that behavior with
>> 'one-click' and force specific per-store setting to prevail.
>> Hence we don't need 'override' settings at per-pool level but should have one
>> at per-store one. E.g.
>>
>> Pool A has:
>> bluestore_csum = true
>> Pool B:
>> bluestore_csum=<not set>
>> OSD:
>> bluestore_csum=  false [/force]
>>
>> if /'force' is omitted - actual settings are
>> Pool A:
>> bluestore_csum=true
>> Pool B:
>> bluestore_csum=false
>>
>> else
>> Pool A:
>> bluestore_csum=false
>> Pool B:
>> bluestore_csum=false
> Hmm, this makes parsing more complicated but is probably less confusing.
> But what about numeric values?
>
>   bluestore_compression_max_blob_size=1048576/force?
>
> This is partly why I like the (implementation) simplicity of
>
>   bluestore_compression_max_blob_size_force=1048576
>
> (or _override, whatever).
This way one gets parameter name duplication for each per-store 
parameter that has per-pool clone. E.g.

  bluestore_compression_max_blob_size for default settings

and
  bluestore_compression_max_blob_size_force for override ones.

Can't say that's pretty elegant..
What's about a single generic parameter that denotes per-collection params to be banned, e.g.

bluestore_force_per_store_settings = *
bluestore_force_per_store_settings = bluestore_compression
bluestore_force_per_store_settings = bluestore_compression, bluestore_csum, bluestore_csum_type


or bluestore_disable_per_collection_settings = a,b,c ?

Alternative per-collection enable option seems a bit less convenient here IMHO.

>> An additional 'bluestore_force_all_settings=true' can be introduced to force
>> all per-store settings to prevail too.
> Or maybe
>
>   bluestore_ignore_per_collection_settings=true
> or the converse
>   bluestore_per_collection_settings=false
>
> to be precise?
>
>>>>>     bluestore_compression_algorithm = {snappy, zlib, ...}
>>>>>     bluestore_compression_min_blob_size = 256k
>>>>>     bluestore_compression_max_blob_size = 4M
>>>>>     bluestore_compression_required_ratio = .875
>>>>>
>>>>> (*) These currently have a different name but aren't used yet.  Working
>>>>> on
>>>>> a PR to change that.
>>>>>
>>>>> What's missing is your 'compress_hint'.  We can call that
>>>>> 'compression_hint' to align with the names above?
>>>>>
>>>>>     compression_hint = {compressible, incompressible, ...}
>>>>>
>>>>> The main changes from your PR that I think we need to make are:
>>>>>
>>>>> * These options should be part of the pool_opts_t structure in pg_pool_t
>>>>> (which is a set of optional key/value-like parameters for the pool).
>>>>>
>>>>> * We can add a new ObjectStore operation that passes down parameters for
>>>>> a
>>>>> collection, and have the OSD pass these all in for each PG collection
>>>>> when
>>>>> the pool properties change.  That way ObjectStore doesn't need to
>>>>> persist
>>>>> these options at all--just store the ones it understands in memory, and
>>>>> the OSD will always reset them on startup etc.
>>>> One, probably silly, question here - do pool and collection have 1 to 1
>>>> relation? It seemed to me that they don't and hence we can't store
>>>> per-pool
>>>> settings at collection level without some additional mapping: pool ->
>>>> setting.
>>>> Also this requires some means to remove pool settings entry when pool goes
>>>> away...
>>> It's 1:1 mapping of PG to collection, but lots of PGs per pool.  So when
>>> the OSD gets a pool change, it'll set the same flags for all local PGs in
>>> that pool.  A bit of duplication, but it keeps the interface
>>> uncomplicated (no need to teach ObjectStore about pools).
>> And PG belongs to single pool only, right?
> Right.
>
> sage


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: per-pool bluestore options
  2016-08-18 16:16         ` Igor Fedotov
@ 2016-08-18 16:18           ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2016-08-18 16:18 UTC (permalink / raw)
  To: Igor Fedotov; +Cc: ceph-devel

On Thu, 18 Aug 2016, Igor Fedotov wrote:
> On 18.08.2016 18:26, Sage Weil wrote:
> > On Thu, 18 Aug 2016, Igor Fedotov wrote:
> > > On 17.08.2016 19:27, Sage Weil wrote:
> > > > On Mon, 15 Aug 2016, Igor Fedotov wrote:
> > > > > On 13.08.2016 21:56, Sage Weil wrote:
> > > > > > Hi Igor,
> > > > > > 
> > > > > > I took another look at
> > > > > > 
> > > > > > 	https://github.com/ceph/ceph/pull/10556
> > > > > > 
> > > > > > You define three settings:
> > > > > > 
> > > > > >     compress_hint - determines if pool contains compressible /
> > > > > > incompressible
> > > > > > data
> > > > > >     compress_algorithm - permits to specify different compression
> > > > > > algorithm
> > > > > >     compress_ratio - specifies maximum compression ratio
> > > > > > 
> > > > > > I think we should extend this to include csum-related options.  And
> > > > > > use
> > > > > > a
> > > > > > consistent naming scheme that aligns with the config options where
> > > > > > we
> > > > > > just
> > > > > > strip off the bluestore_ prefix.  The relevant options are:
> > > > > Sounds good!
> > > > > The major questions here is how do these per-pool options correlate
> > > > > with
> > > > > corresponding per-store ones?
> > > > > One might suggest per-pool options to have higher priority over
> > > > > per-store
> > > > > one.
> > > > > But I'm not sure that the best option.  Sometimes user might want to
> > > > > disable/alter corresponding option without enumerating all the pools
> > > > > by
> > > > > simple
> > > > > switching at per-store level. Hence we need to consider some means for
> > > > > that.
> > > > > >     bluestore_csum = {true, false}
> > > > > >     bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
> > > > > >     bluestore_csum_min_chunk_size = 4k      (*)
> > > > > >     bluestore_csum_max_chunk_size = 64k     (*)
> > > > > >     bluestore_compression = {force, aggressive, passive, none}
> > > > > Actually this option along with compression_hint result in a single
> > > > > flag:
> > > > > compress or not. Any rationale for not using that simple flag?
> > > > There are currently two persistent hints: compressible and
> > > > incompressible.
> > > > Aggressive will compress unless incompressible, whereas passive will
> > > > compress if compressible.  Either way it varies per object, though, so a
> > > > single flag isn't sufficient.
> > > So final approach is to have bluestore_compression switch at pool level
> > > (and
> > > per-store too) and compression_hint at object one, right? Hence we don't
> > > need
> > > compression_hint per pool. In fact my original point was that the
> > > latter(along
> > > with bluestore_compression) is IMHO an overkill and can be substituted
> > > with
> > > simple enable_compression flag.
> > I think so.  Having a pool compression_hint = compressible will accomplish
> > the same thing as setting the pool compression = aggressive.
> OK, Sounds good!
> Still there is an open question how to assign such a hint to objects from user
> perspective....

https://github.com/ceph/ceph/blob/master/src/include/rados/librados.h#L145
and
https://github.com/ceph/ceph/blob/master/src/include/rados/librados.h#L2438

> > > > > My original idea here was to have bluestore_compression on per-store
> > > > > basis
> > > > > and
> > > > > compression_hint of per pool one. This way one can receive pretty
> > > > > flexible
> > > > > control at both storage and pool level - see my question above.
> > > > Yeah, I'm not sure how complex it's worth getting.  To get complete
> > > > control, we probably need a default *and* the min/max allowed range for
> > > > each option in order to bound what you can choose per-pool, but I think
> > > > that is probably overkill.
> > > > 
> > > > A bit easier would be to have
> > > > 
> > > >    bluestore_csum_override = {,true,false}
> > > >    bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
> > > >    bluestore_compression_override = {, force, aggressive, passive, none}
> > > > 
> > > > and have them blank by default (no override).  For the numeric options
> > > > we'd need to use 0 to mean 'no override'.
> > > > 
> > > > What do you think?
> > > IMO per-pool settings (if set) have to override corresponding per-store
> > > one by default. But one should be able to turn off that behavior with
> > > 'one-click' and force specific per-store setting to prevail.
> > > Hence we don't need 'override' settings at per-pool level but should have
> > > one
> > > at per-store one. E.g.
> > > 
> > > Pool A has:
> > > bluestore_csum = true
> > > Pool B:
> > > bluestore_csum=<not set>
> > > OSD:
> > > bluestore_csum=  false [/force]
> > > 
> > > if /'force' is omitted - actual settings are
> > > Pool A:
> > > bluestore_csum=true
> > > Pool B:
> > > bluestore_csum=false
> > > 
> > > else
> > > Pool A:
> > > bluestore_csum=false
> > > Pool B:
> > > bluestore_csum=false
> > Hmm, this makes parsing more complicated but is probably less confusing.
> > But what about numeric values?
> > 
> >   bluestore_compression_max_blob_size=1048576/force?
> > 
> > This is partly why I like the (implementation) simplicity of
> > 
> >   bluestore_compression_max_blob_size_force=1048576
> > 
> > (or _override, whatever).
> This way one gets parameter name duplication for each per-store parameter that
> has per-pool clone. E.g.
> 
>  bluestore_compression_max_blob_size for default settings
> 
> and
>  bluestore_compression_max_blob_size_force for override ones.
> 
> Can't say that's pretty elegant..
> What's about a single generic parameter that denotes per-collection params to
> be banned, e.g.
> 
> bluestore_force_per_store_settings = *
> bluestore_force_per_store_settings = bluestore_compression
> bluestore_force_per_store_settings = bluestore_compression, bluestore_csum,
> bluestore_csum_type
> 
> 
> or bluestore_disable_per_collection_settings = a,b,c ?

I like this.  Or rather the converse:

 bluestore_allow_per_collection_settings = bluestore_compression,bluestore_csum, ...

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-08-19  2:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-13 18:56 per-pool bluestore options Sage Weil
2016-08-15 13:48 ` Igor Fedotov
2016-08-17 16:27   ` Sage Weil
2016-08-18 13:18     ` Igor Fedotov
2016-08-18 15:26       ` Sage Weil
2016-08-18 16:16         ` Igor Fedotov
2016-08-18 16:18           ` Sage Weil

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.