* CRUSH. What wrong?
@ 2011-05-29 13:21 Fyodor Ustinov
  2011-05-30  3:54 ` Sage Weil
  2011-05-30  9:50 ` Jeff Wu
  0 siblings, 2 replies; 7+ messages in thread
From: Fyodor Ustinov @ 2011-05-29 13:21 UTC (permalink / raw)
  To: ceph-devel

Hi!

I made an attempt to create my own CRUSH map.

After I applied it to an existing cluster, everything stopped (rebalancing
did not start, and I was unable to mount the cluster).

"OK," I said, and created a new cluster with this CRUSH map.

The cluster seemed to come up fine, but it did not work (the initial
scrubbing did not start, and I was unable to mount it).

What did I do wrong?

My CRUSH map:

# begin crush map

# devices
device 0 device0
device 1 device1
device 2 device2
device 3 device3
device 4 device4
device 5 device5
device 6 device6
device 7 device7
device 8 device8
device 9 device9
device 10 device10

# types
type 0 device
type 1 host
type 2 rack
type 3 root

# buckets
host host0 {
         id -1           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device0 weight 3.842
}
host host1 {
         id -2           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device1 weight 3.842
}
host host2 {
         id -3           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device2 weight 3.842
}
host host3 {
         id -4           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device3 weight 3.842
}
host host4 {
         id -5           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device4 weight 3.842
}
host host5 {
         id -6           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device5 weight 3.842
}
host host6 {
         id -7           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
}
host host7 {
         id -8           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item device7 weight 3.842
         item device8 weight 3.905
         item device9 weight 3.905
         item device10 weight 3.905
}
rack rack0 {
         id -9           # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host0 weight 3.844
}
rack rack1 {
         id -10          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host1 weight 3.844
}
rack rack2 {
         id -11          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host2 weight 3.842
}
rack rack3 {
         id -12          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host3 weight 3.842
}
rack rack4 {
         id -13          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host4 weight 3.842
}
rack rack5 {
         id -14          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host5 weight 3.842
}
rack rack6 {
         id -15          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host6 weight 3.842
}
rack rack7 {
         id -16          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item host7 weight 15.557
}
root root {
         id -17          # do not change unnecessarily
         alg straw
         hash 0  # rjenkins1
         item rack0 weight 3.844
         item rack1 weight 3.844
         item rack2 weight 3.842
         item rack3 weight 3.842
         item rack4 weight 3.842
         item rack5 weight 3.842
         item rack6 weight 3.842
         item rack7 weight 15.557
}

# rules
rule data {
         ruleset 0
         type replicated
         min_size 1
         max_size 10
         step take root
         step choose firstn 0 type rack
         step emit
}
rule metadata {
         ruleset 1
         type replicated
         min_size 1
         max_size 10
         step take root
         step choose firstn 0 type rack
         step emit
}
rule rbd {
         ruleset 2
         type replicated
         min_size 1
         max_size 10
         step take root
         step choose firstn 0 type rack
         step emit
}

# end crush map

WBR,
     Fyodor.

* Re: CRUSH. What wrong?
  2011-05-29 13:21 CRUSH. What wrong? Fyodor Ustinov
@ 2011-05-30  3:54 ` Sage Weil
  2011-05-30  9:34   ` Fyodor Ustinov
  2011-05-30  9:44   ` Fyodor Ustinov
  2011-05-30  9:50 ` Jeff Wu
  1 sibling, 2 replies; 7+ messages in thread
From: Sage Weil @ 2011-05-30  3:54 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

Hi Fyodor,

The problem is your rules:

On Sun, 29 May 2011, Fyodor Ustinov wrote:
> # rules
> rule data {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take root
>         step choose firstn 0 type rack

This is giving you N _racks_.  You probably want

         step chooseleaf firstn 0 type rack

which will give you N devices from distinct racks.
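
With that one step changed, the data rule would look roughly like this (the
same change applies to the metadata and rbd rules):

rule data {
         ruleset 0
         type replicated
         min_size 1
         max_size 10
         step take root
         step chooseleaf firstn 0 type rack
         step emit
}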

BTW, you can verify that the map is behaving as expected with something like:

$ crushtool -i crushfile --test
devices weights (hex): 
[10000,10000,10000,10000,10000,10000,10000,10000,10000,10000,10000]
rule 0 (data), x = 0..9999
 device 0:      2391
 device 1:      2446
 device 2:      2338
 device 3:      2332
 device 4:      2360
 device 5:      2450
 device 6:      0
 device 7:      1368
 device 8:      1460
 device 9:      1446
 device 10:     1409
 num results 2: 10000
rule 1 (metadata), x = 0..9999
 device 0:      2391
 device 1:      2446
 device 2:      2338
 device 3:      2332
 device 4:      2360
 device 5:      2450
 device 6:      0
 device 7:      1368
 device 8:      1460
 device 9:      1446
 device 10:     1409
 num results 2: 10000
rule 2 (rbd), x = 0..9999
 device 0:      2391
 device 1:      2446
 device 2:      2338
 device 3:      2332
 device 4:      2360
 device 5:      2450
 device 6:      0
 device 7:      1368
 device 8:      1460
 device 9:      1446
 device 10:     1409
 num results 2: 10000

(That's with 'choose' changed to 'chooseleaf'.  Your current map gives 0 
objects on every device.)
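
To apply the fix to a live cluster, a round trip along these lines should
work (just a sketch; the file names are arbitrary, and it's worth
double-checking option spellings against your crushtool version):

$ ceph osd getcrushmap -o crush.bin
$ crushtool -d crush.bin -o crush.txt
  (edit crush.txt: change "step choose" to "step chooseleaf" in each rule)
$ crushtool -c crush.txt -o crush.new
$ crushtool -i crush.new --test
$ ceph osd setcrushmap -i crush.new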

Cheers-
sage

* Re: CRUSH. What wrong?
  2011-05-30  9:50 ` Jeff Wu
@ 2011-05-30  4:43   ` Sage Weil
  2011-05-30 12:56     ` Jeff Wu
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2011-05-30  4:43 UTC (permalink / raw)
  To: Jeff Wu; +Cc: Fyodor Ustinov, ceph-devel

Hi Jeff,

On Mon, 30 May 2011, Jeff Wu wrote:
> Hi,
> Could you attach detailed steps to reproduce this?
> Are you using the following steps to get crush.new.txt and then modifying
> it to create your new CRUSH map?
> $ ceph osd getcrushmap -o crush.new
> $ crushtool -d crush.new -o crush.new.txt
> 
> Alternatively, you could refer to the following commands to create the
> crush.new file, for instance:
> $ crushtool --num_osds 8 -o crush.new --build host straw 8 rack straw 8
> root straw 0
> $ crushtool -d crush.new -o crush.new.txt
> 
> I have filed two bugs at:
> http://tracker.newdream.net/issues/1016
> http://tracker.newdream.net/issues/1017

Sorry, these fell through the cracks somehow!

For both of these it looks like the same problem Fyodor had: you need to 
use chooseleaf instead of choose in your crush rule.  Currently you're 
getting back rack bucket ids instead of devices.
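
Assuming the rules look like the ones earlier in this thread, the fix is a
single keyword in the choose step of each rule:

         step choose firstn 0 type rack       # before: returns racks
         step chooseleaf firstn 0 type rack   # after: returns devices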

sage

* Re: CRUSH. What wrong?
  2011-05-30  3:54 ` Sage Weil
@ 2011-05-30  9:34   ` Fyodor Ustinov
  2011-05-30  9:44   ` Fyodor Ustinov
  1 sibling, 0 replies; 7+ messages in thread
From: Fyodor Ustinov @ 2011-05-30  9:34 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 05/30/2011 06:54 AM, Sage Weil wrote:
> Hi Fyodor,
>
> The problem is your rules:
>
> On Sun, 29 May 2011, Fyodor Ustinov wrote:
>> # rules
>> rule data {
>>          ruleset 0
>>          type replicated
>>          min_size 1
>>          max_size 10
>>          step take root
>>          step choose firstn 0 type rack
> This is giving you N _racks_.  You probably want
>
>           step chooseleaf firstn 0 type rack
>
> which will give you N devices from distinct racks.
Mea culpa. :(

Thanks!

WBR,
     Fyodor.

* Re: CRUSH. What wrong?
  2011-05-30  3:54 ` Sage Weil
  2011-05-30  9:34   ` Fyodor Ustinov
@ 2011-05-30  9:44   ` Fyodor Ustinov
  1 sibling, 0 replies; 7+ messages in thread
From: Fyodor Ustinov @ 2011-05-30  9:44 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Hi!

BTW, Sage, is the rest of the map correct?

WBR,
     Fyodor.

* Re: CRUSH. What wrong?
  2011-05-29 13:21 CRUSH. What wrong? Fyodor Ustinov
  2011-05-30  3:54 ` Sage Weil
@ 2011-05-30  9:50 ` Jeff Wu
  2011-05-30  4:43   ` Sage Weil
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Wu @ 2011-05-30  9:50 UTC (permalink / raw)
  To: Fyodor Ustinov; +Cc: ceph-devel

Hi,
Could you attach detailed steps to reproduce this?
Are you using the following steps to get crush.new.txt and then modifying
it to create your new CRUSH map?
$ ceph osd getcrushmap -o crush.new
$ crushtool -d crush.new -o crush.new.txt

Alternatively, you could refer to the following commands to create the
crush.new file, for instance:
$ crushtool --num_osds 8 -o crush.new --build host straw 8 rack straw 8
root straw 0
$ crushtool -d crush.new -o crush.new.txt

I have filed two bugs at:
http://tracker.newdream.net/issues/1016
http://tracker.newdream.net/issues/1017


Jeff Wu


On Sun, 2011-05-29 at 21:21 +0800, Fyodor Ustinov wrote:
> Hi!
> 
> I made an attempt to create my own CRUSH map.
> 
> After I applied it to an existing cluster, everything stopped (rebalancing
> did not start, and I was unable to mount the cluster).
> 
> "OK," I said, and created a new cluster with this CRUSH map.
> 
> The cluster seemed to come up fine, but it did not work (the initial
> scrubbing did not start, and I was unable to mount it).
> 
> What did I do wrong?
> 
> My CRUSH map:
> 
> [full crush map snipped]

* Re: CRUSH. What wrong?
  2011-05-30  4:43   ` Sage Weil
@ 2011-05-30 12:56     ` Jeff Wu
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Wu @ 2011-05-30 12:56 UTC (permalink / raw)
  To: Sage Weil; +Cc: Fyodor Ustinov, ceph-devel

Hi Sage,

Thanks!

Jeff Wu

On Mon, 2011-05-30 at 12:43 +0800, Sage Weil wrote:
> Hi Jeff,
> 
> On Mon, 30 May 2011, Jeff Wu wrote:
> > [...]
> 
> Sorry, these fell through the cracks somehow!
> 
> For both of these it looks like the same problem Fyodor had: you need to 
> use chooseleaf instead of choose in your crush rule.  Currently you're 
> getting back rack bucket ids instead of devices.
> 
> sage


end of thread

Thread overview: 7+ messages
2011-05-29 13:21 CRUSH. What wrong? Fyodor Ustinov
2011-05-30  3:54 ` Sage Weil
2011-05-30  9:34   ` Fyodor Ustinov
2011-05-30  9:44   ` Fyodor Ustinov
2011-05-30  9:50 ` Jeff Wu
2011-05-30  4:43   ` Sage Weil
2011-05-30 12:56     ` Jeff Wu
