All of lore.kernel.org
 help / color / mirror / Atom feed
* storing multiple weights in crush
@ 2017-03-27 16:45 Loic Dachary
  2017-03-27 23:16 ` Sage Weil
  2017-04-06  7:10 ` Loic Dachary
  0 siblings, 2 replies; 6+ messages in thread
From: Loic Dachary @ 2017-03-27 16:45 UTC (permalink / raw)
  To: Sage Weil, Ceph Development

Hi Sage,

I added a proposal[1] to the next CDM[2] to be able to store and use multiple weights for each item in the crushmap. 

I'm not sure about the naming (probabilities is a long word...) but it would be good to not have a third variable named weight with yet another meaning. Although we could modify the builder functions, I'm not sure this is a good idea. They are not complete enough to hide the crush data structures and it is probably enough to document the new data members. And the caller can allocate it and fill it after creating a bucket. In practice that is going to be used by CrushWrapper or python-crush which will define an API.

It is probably best if the crush modification is the bare minimum:

- allowing the choose function to pick a probability from a table indexed by the round instead of relying on a single weight
- storing the probability table in each item

Feel free to modify the pad if you have other ideas, this is a first draft done today and I won't be offended if it is discarded ;-)

Cheers

[1] http://pad.ceph.com/p/crush-multiweight
[2] http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: storing multiple weights in crush
  2017-03-27 16:45 storing multiple weights in crush Loic Dachary
@ 2017-03-27 23:16 ` Sage Weil
  2017-03-28 10:27   ` Loic Dachary
  2017-04-06  7:10 ` Loic Dachary
  1 sibling, 1 reply; 6+ messages in thread
From: Sage Weil @ 2017-03-27 23:16 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3935 bytes --]

On Mon, 27 Mar 2017, Loic Dachary wrote:
> Hi Sage,
> 
> I added a proposal[1] to the next CDM[2] to be able to store and use 
> multiple weights for each item in the crushmap.
> 
> I'm not sure about the naming (probabilities is a long word...) but it 
> would be good to not have a third variable named weight with yet another 
> meaning. Although we could modify the builder functions, I'm not sure 
> this is a good idea. They are not complete enough to hide the crush data 
> structures and it is probably enough to document the new data members. 
> And the caller can allocate it and fill it after creating a bucket. In 
> practice that is going to be used by CrushWrapper or python-crush which 
> will define an API.
> 
> It is probably best if the crush modification is the bare minimum:
> 
> - allowing the choose function to pick a probability from a table 
> indexed by the round instead of relying on a single weight
> - storing the probability table in each item
> 
> Feel free to modify the pad if you have other ideas, this is a first 
> draft done today and I won't be offended if it is discarded ;-)

What about a new concept called a "weight set" that represents an 
alternative set of weights for the entire map (or perhaps a subset of it, 
like a particular hierarchy of interest).  We allow do_rule to be called 
with no weight_set specified, in which case it uses the 
legacy/default/canonical weights.  Or if an alternative weight set is 
specified, those weights are used instead.

I think we have a couple interesting options for the representation:

1) It could be as simple as

      vector<int> bucket_weights, item_weights;

which would lose the current (mis?)feature where the weigth for an item 
may be different in different buckets.  Or,

2) it could mirror the bucket item_weights (let's call it vector<int> 
item_weights even though it's really a C array).  Something like

      vector<int,vector<int>> bucket_item_weights;

where it's indexed first by bucket id, then the position in the bucket.

I guess #2 is most general and mirrors what we have.  That would give us 
something like

struct weight_set {
  vector<int,vector<int>> bucket_item_weights;
};

for a simple replacement weight. But we actually want 
per-position weights, so instead we can do

struct weight_set {
  vector<int,vector<vector<int>>> bucket_item_position_weights;
};

where it's bucket id, item position, current position.  In reality, 
for each bucket, we probably want a flat 2-d array, something like

struct weight_set {
  struct bucket_weights {
    unt32_t num_positions, num_items;
    uint32_t *data;
  };
  vector<int,bucket_weights> bucket_weights;
};

or whatever.

The last thing is that the number of per-position weights we need is 
a property of the crush rule we're using, not the bucket itself.  So we 
should probably calculate it out for however many we think we need, and if 
we are doing a placement past that point, keep using the last weight.  
That means the degenerate case of just 1 position means we'll use it for 
all positions, which is what we'd want in the simple case anyway.

The last thing I think we should do while we're making this change is 
also let you specify an alternative id for the bucket when doing the hash.  
This is currently an annoying property of crush where if the bucket id 
changes all data shuffles.  But we could add an int32_t id to the 
bucket_weights above so that you could have a new weight set using any id 
you want.. that'll be useful for lots of stuff I suspect.

sage




> 
> Cheers
> 
> [1] http://pad.ceph.com/p/crush-multiweight
> [2] http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: storing multiple weights in crush
  2017-03-27 23:16 ` Sage Weil
@ 2017-03-28 10:27   ` Loic Dachary
  2017-04-04 10:10     ` Loic Dachary
  0 siblings, 1 reply; 6+ messages in thread
From: Loic Dachary @ 2017-03-28 10:27 UTC (permalink / raw)
  To: Sage Weil; +Cc: Ceph Development



On 03/28/2017 01:16 AM, Sage Weil wrote:
> On Mon, 27 Mar 2017, Loic Dachary wrote:
>> Hi Sage,
>>
>> I added a proposal[1] to the next CDM[2] to be able to store and use 
>> multiple weights for each item in the crushmap.
>>
>> I'm not sure about the naming (probabilities is a long word...) but it 
>> would be good to not have a third variable named weight with yet another 
>> meaning. Although we could modify the builder functions, I'm not sure 
>> this is a good idea. They are not complete enough to hide the crush data 
>> structures and it is probably enough to document the new data members. 
>> And the caller can allocate it and fill it after creating a bucket. In 
>> practice that is going to be used by CrushWrapper or python-crush which 
>> will define an API.
>>
>> It is probably best if the crush modification is the bare minimum:
>>
>> - allowing the choose function to pick a probability from a table 
>> indexed by the round instead of relying on a single weight
>> - storing the probability table in each item
>>
>> Feel free to modify the pad if you have other ideas, this is a first 
>> draft done today and I won't be offended if it is discarded ;-)
> 
> What about a new concept called a "weight set" that represents an 
> alternative set of weights for the entire map (or perhaps a subset of it, 
> like a particular hierarchy of interest).  We allow do_rule to be called 
> with no weight_set specified, in which case it uses the 
> legacy/default/canonical weights.  Or if an alternative weight set is 
> specified, those weights are used instead.
> 
> I think we have a couple interesting options for the representation:
> 
> 1) It could be as simple as
> 
>       vector<int> bucket_weights, item_weights;
> 
> which would lose the current (mis?)feature where the weigth for an item 
> may be different in different buckets.  Or,
> 
> 2) it could mirror the bucket item_weights (let's call it vector<int> 
> item_weights even though it's really a C array).  Something like
> 
>       vector<int,vector<int>> bucket_item_weights;
> 
> where it's indexed first by bucket id, then the position in the bucket.
> 
> I guess #2 is most general and mirrors what we have.  That would give us 
> something like
> 
> struct weight_set {
>   vector<int,vector<int>> bucket_item_weights;
> };
> 
> for a simple replacement weight. But we actually want 
> per-position weights, so instead we can do
> 
> struct weight_set {
>   vector<int,vector<vector<int>>> bucket_item_position_weights;
> };
> 
> where it's bucket id, item position, current position.  In reality, 
> for each bucket, we probably want a flat 2-d array, something like
> 
> struct weight_set {
>   struct bucket_weights {
>     unt32_t num_positions, num_items;
>     uint32_t *data;
>   };
>   vector<int,bucket_weights> bucket_weights;
> };
> 
> or whatever.
> 
> The last thing is that the number of per-position weights we need is 
> a property of the crush rule we're using, not the bucket itself.  So we 
> should probably calculate it out for however many we think we need, and if 
> we are doing a placement past that point, keep using the last weight.  
> That means the degenerate case of just 1 position means we'll use it for 
> all positions, which is what we'd want in the simple case anyway.

I like this. The crush_do_rule could have an additional argument, a pointer to the weight set. If NULL it falls back to what we currently have. It is then up to the caller to build such a weight set. It can be addressing only part of the bucket tree or all of it, with num_positions matching the outsize. It is better seen as an argument to crush_do_rule rather than an integral part of the crushmap.

Does that sound sensible ?

> 
> The last thing I think we should do while we're making this change is 
> also let you specify an alternative id for the bucket when doing the hash.  
> This is currently an annoying property of crush where if the bucket id 
> changes all data shuffles.  But we could add an int32_t id to the 
> bucket_weights above so that you could have a new weight set using any id 
> you want.. that'll be useful for lots of stuff I suspect.

Ok.

Cheers

> 
> sage
> 
> 
> 
> 
>>
>> Cheers
>>
>> [1] http://pad.ceph.com/p/crush-multiweight
>> [2] http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017
>>
>> -- 
>> Loïc Dachary, Artisan Logiciel Libre
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: storing multiple weights in crush
  2017-03-28 10:27   ` Loic Dachary
@ 2017-04-04 10:10     ` Loic Dachary
  0 siblings, 0 replies; 6+ messages in thread
From: Loic Dachary @ 2017-04-04 10:10 UTC (permalink / raw)
  To: Sage Weil; +Cc: Ceph Development

Hi Sage,

I drafted an implementation[0] based on the idea that the weight set is just an argument to crush_do_rule and updated the pad[1] with information about it. Looking forward to discussing this further tomorrow during CDM[2] :-)

Cheers

[0] draft implementation http://libcrush.org/main/libcrush/merge_requests/30/diffs
[1] http://pad.ceph.com/p/crush-multiweight
[2] CDM http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017

On 03/28/2017 12:27 PM, Loic Dachary wrote:
> 
> 
> On 03/28/2017 01:16 AM, Sage Weil wrote:
>> On Mon, 27 Mar 2017, Loic Dachary wrote:
>>> Hi Sage,
>>>
>>> I added a proposal[1] to the next CDM[2] to be able to store and use 
>>> multiple weights for each item in the crushmap.
>>>
>>> I'm not sure about the naming (probabilities is a long word...) but it 
>>> would be good to not have a third variable named weight with yet another 
>>> meaning. Although we could modify the builder functions, I'm not sure 
>>> this is a good idea. They are not complete enough to hide the crush data 
>>> structures and it is probably enough to document the new data members. 
>>> And the caller can allocate it and fill it after creating a bucket. In 
>>> practice that is going to be used by CrushWrapper or python-crush which 
>>> will define an API.
>>>
>>> It is probably best if the crush modification is the bare minimum:
>>>
>>> - allowing the choose function to pick a probability from a table 
>>> indexed by the round instead of relying on a single weight
>>> - storing the probability table in each item
>>>
>>> Feel free to modify the pad if you have other ideas, this is a first 
>>> draft done today and I won't be offended if it is discarded ;-)
>>
>> What about a new concept called a "weight set" that represents an 
>> alternative set of weights for the entire map (or perhaps a subset of it, 
>> like a particular hierarchy of interest).  We allow do_rule to be called 
>> with no weight_set specified, in which case it uses the 
>> legacy/default/canonical weights.  Or if an alternative weight set is 
>> specified, those weights are used instead.
>>
>> I think we have a couple interesting options for the representation:
>>
>> 1) It could be as simple as
>>
>>       vector<int> bucket_weights, item_weights;
>>
>> which would lose the current (mis?)feature where the weigth for an item 
>> may be different in different buckets.  Or,
>>
>> 2) it could mirror the bucket item_weights (let's call it vector<int> 
>> item_weights even though it's really a C array).  Something like
>>
>>       vector<int,vector<int>> bucket_item_weights;
>>
>> where it's indexed first by bucket id, then the position in the bucket.
>>
>> I guess #2 is most general and mirrors what we have.  That would give us 
>> something like
>>
>> struct weight_set {
>>   vector<int,vector<int>> bucket_item_weights;
>> };
>>
>> for a simple replacement weight. But we actually want 
>> per-position weights, so instead we can do
>>
>> struct weight_set {
>>   vector<int,vector<vector<int>>> bucket_item_position_weights;
>> };
>>
>> where it's bucket id, item position, current position.  In reality, 
>> for each bucket, we probably want a flat 2-d array, something like
>>
>> struct weight_set {
>>   struct bucket_weights {
>>     unt32_t num_positions, num_items;
>>     uint32_t *data;
>>   };
>>   vector<int,bucket_weights> bucket_weights;
>> };
>>
>> or whatever.
>>
>> The last thing is that the number of per-position weights we need is 
>> a property of the crush rule we're using, not the bucket itself.  So we 
>> should probably calculate it out for however many we think we need, and if 
>> we are doing a placement past that point, keep using the last weight.  
>> That means the degenerate case of just 1 position means we'll use it for 
>> all positions, which is what we'd want in the simple case anyway.
> 
> I like this. The crush_do_rule could have an additional argument, a pointer to the weight set. If NULL it falls back to what we currently have. It is then up to the caller to build such a weight set. It can be addressing only part of the bucket tree or all of it, with num_positions matching the outsize. It is better seen as an argument to crush_do_rule rather than an integral part of the crushmap.
> 
> Does that sound sensible ?
> 
>>
>> The last thing I think we should do while we're making this change is 
>> also let you specify an alternative id for the bucket when doing the hash.  
>> This is currently an annoying property of crush where if the bucket id 
>> changes all data shuffles.  But we could add an int32_t id to the 
>> bucket_weights above so that you could have a new weight set using any id 
>> you want.. that'll be useful for lots of stuff I suspect.
> 
> Ok.
> 
> Cheers
> 
>>
>> sage
>>
>>
>>
>>
>>>
>>> Cheers
>>>
>>> [1] http://pad.ceph.com/p/crush-multiweight
>>> [2] http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017
>>>
>>> -- 
>>> Loïc Dachary, Artisan Logiciel Libre
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: storing multiple weights in crush
  2017-03-27 16:45 storing multiple weights in crush Loic Dachary
  2017-03-27 23:16 ` Sage Weil
@ 2017-04-06  7:10 ` Loic Dachary
  2017-04-06 13:20   ` Sage Weil
  1 sibling, 1 reply; 6+ messages in thread
From: Loic Dachary @ 2017-04-06  7:10 UTC (permalink / raw)
  To: Sage Weil, Ceph Development

Hi Sage,

To summarize the next action item, the current draft 

http://libcrush.org/main/libcrush/merge_requests/30/diffs

should be updated with:

struct crush_bucket_weight_set {
  __u32 bucket_id;  /* used as input to hash in place of bucket id */
  __u32 num_positions, num_items;
  __u32 *data; // index like [item*num_items+pos]
};

struct crush_weight_set {
  struct crush_bucket_weights *bucket_weights;
  __u32 size;
};

and the logic updated so that bucket_id is used if it is != 0. If it is == 0 then the bucket id from the crushmap is used.

An effort should also be made to move to move the "if (weight_set == NULL) then use the original weights else use the weight set" logic in crush_do_rule instead of the choose function. The list,uniform and straw2 function all have different ways of use the weights (and therefore the weight set) and it is unclear if this would be beneficial.

Did I forget something ?

Cheers

On 03/27/2017 06:45 PM, Loic Dachary wrote:
> Hi Sage,
> 
> I added a proposal[1] to the next CDM[2] to be able to store and use multiple weights for each item in the crushmap. 
> 
> I'm not sure about the naming (probabilities is a long word...) but it would be good to not have a third variable named weight with yet another meaning. Although we could modify the builder functions, I'm not sure this is a good idea. They are not complete enough to hide the crush data structures and it is probably enough to document the new data members. And the caller can allocate it and fill it after creating a bucket. In practice that is going to be used by CrushWrapper or python-crush which will define an API.
> 
> It is probably best if the crush modification is the bare minimum:
> 
> - allowing the choose function to pick a probability from a table indexed by the round instead of relying on a single weight
> - storing the probability table in each item
> 
> Feel free to modify the pad if you have other ideas, this is a first draft done today and I won't be offended if it is discarded ;-)
> 
> Cheers
> 
> [1] http://pad.ceph.com/p/crush-multiweight
> [2] http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017
> 

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: storing multiple weights in crush
  2017-04-06  7:10 ` Loic Dachary
@ 2017-04-06 13:20   ` Sage Weil
  0 siblings, 0 replies; 6+ messages in thread
From: Sage Weil @ 2017-04-06 13:20 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2753 bytes --]

On Thu, 6 Apr 2017, Loic Dachary wrote:
> Hi Sage,
> 
> To summarize the next action item, the current draft 
> 
> http://libcrush.org/main/libcrush/merge_requests/30/diffs
> 
> should be updated with:
> 
> struct crush_bucket_weight_set {
>   __u32 bucket_id;  /* used as input to hash in place of bucket id */
>   __u32 num_positions, num_items;
>   __u32 *data; // index like [item*num_items+pos]
> };
> 
> struct crush_weight_set {
>   struct crush_bucket_weights *bucket_weights;
>   __u32 size;
> };
> 
> and the logic updated so that bucket_id is used if it is != 0. If it is == 0 then the bucket id from the crushmap is used.

Sounds good!
 
> An effort should also be made to move to move the "if (weight_set == 
> NULL) then use the original weights else use the weight set" logic in 
> crush_do_rule instead of the choose function. The list,uniform and 
> straw2 function all have different ways of use the weights (and 
> therefore the weight set) and it is unclear if this would be beneficial.

Only if it makes sense; just an idea.  :)

> Did I forget something ?

Next steps after that will be handling the encoding and such.  Probably 
a CrushWrapper::WeightSet class with encode/decode or somethign?

sage

> 
> Cheers
> 
> On 03/27/2017 06:45 PM, Loic Dachary wrote:
> > Hi Sage,
> > 
> > I added a proposal[1] to the next CDM[2] to be able to store and use multiple weights for each item in the crushmap. 
> > 
> > I'm not sure about the naming (probabilities is a long word...) but it would be good to not have a third variable named weight with yet another meaning. Although we could modify the builder functions, I'm not sure this is a good idea. They are not complete enough to hide the crush data structures and it is probably enough to document the new data members. And the caller can allocate it and fill it after creating a bucket. In practice that is going to be used by CrushWrapper or python-crush which will define an API.
> > 
> > It is probably best if the crush modification is the bare minimum:
> > 
> > - allowing the choose function to pick a probability from a table indexed by the round instead of relying on a single weight
> > - storing the probability table in each item
> > 
> > Feel free to modify the pad if you have other ideas, this is a first draft done today and I won't be offended if it is discarded ;-)
> > 
> > Cheers
> > 
> > [1] http://pad.ceph.com/p/crush-multiweight
> > [2] http://tracker.ceph.com/projects/ceph/wiki/CDM_05-APR-2017
> > 
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-04-06 13:20 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-27 16:45 storing multiple weights in crush Loic Dachary
2017-03-27 23:16 ` Sage Weil
2017-03-28 10:27   ` Loic Dachary
2017-04-04 10:10     ` Loic Dachary
2017-04-06  7:10 ` Loic Dachary
2017-04-06 13:20   ` Sage Weil

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.