All of lore.kernel.org
 help / color / mirror / Atom feed
* v0.38 released
@ 2011-11-11  5:14 Sage Weil
  2011-11-15 16:42 ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Sage Weil @ 2011-11-11  5:14 UTC (permalink / raw)
  To: ceph-devel

It's a week delayed, but v0.38 is ready.  The highlights:

 * osd: some peering refactoring
 * osd: 'replay' period is per-pool (now only affects fs data pool)
 * osd: clean up old osdmaps
 * osd: allow admin to revert lost objects to prior versions (or delete)
 * mkcephfs: generate reasonable crush map based on 'host' and 'rack' 
   fields in [osd.NN] sections of ceph.conf
 * radosgw: bucket index improvements
 * radosgw: improved swift support
 * rbd: misc command line tool fixes
 * debian: misc packaging fixes (including dependency breakage on upgrades)
 * ceph: query daemon perfcounters via command line tool

The big upcoming items for v0.39 are RBD layering (image cloning), further 
improvements to radosgw's Swift support, and some monitor failure recovery 
and bootstrapping improvements.  We're also continuing work on the 
automation bits that the Chef cookbooks and Juju charms will use, and a 
Crowbar barclamp was also just posted on github.  Several patches are 
still working their way into libvirt and qemu to improve support for RBD 
authentication.

You can get v0.38 from the usual places:

 * Git at git://github.com/NewDreamNetwork/ceph.git
 * Tarball at http://ceph.newdream.net/download/ceph-0.38.tar.gz
 * For Debian/Ubuntu packages see http://ceph.newdream.net/docs/latest/ops/install/mkcephfs/#installing-the-packages
 * For RPMs see https://build.opensuse.org/project/show?project=home%3Ahmacht%3Astorage

sage

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-11  5:14 v0.38 released Sage Weil
@ 2011-11-15 16:42 ` Andre Noll
  2011-11-15 19:53   ` Gregory Farnum
  0 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-15 16:42 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 2962 bytes --]

On Thu, Nov 10, 21:14, Sage Weil wrote:
>  * osd: some peering refactoring
>  * osd: 'replay' period is per-pool (now only affects fs data pool)
>  * osd: clean up old osdmaps
>  * osd: allow admin to revert lost objects to prior versions (or delete)
>  * mkcephfs: generate reasonable crush map based on 'host' and 'rack' 
>    fields in [osd.NN] sections of ceph.conf
>  * radosgw: bucket index improvements
>  * radosgw: improved swift support
>  * rbd: misc command line tool fixes
>  * debian: misc packaging fixes (including dependency breakage on upgrades)
>  * ceph: query daemon perfcounters via command line tool
> 
> The big upcoming items for v0.39 are RBD layering (image cloning), further 
> improvements to radosgw's Swift support, and some monitor failure recovery 
> and bootstrapping improvements.  We're also continuing work on the 
> automation bits that the Chef cookbooks and Juju charms will use, and a 
> Crowbar barclamp was also just posted on github.  Several patches are 
> still working their way into libvirt and qemu to improve support for RBD 
> authentication.

Any plans to address the ENOSPC issue? I gave v0.38 a try and the
file system behaves like the older (<= 0.36) versions I've tried
before when it fills up: The ceph mounts hang on all clients.

But there is progress: Sync is now interuptable (it used to block
in D state so that it could not be killed even with SIGKILL), and
umount works even if the file system is full. However, subsequent
mount attempts then fail with "mount error 5 = Input/output error".

Our test setup consists of one mds, one monitor and 8 osds. mds and
monitor are on the same node, and this node is not not an osd. All
nodes are running Linux-3.0.9 ATM, but I would be willing to upgrade
to 3.1.1 if this is expected to make a difference.

Here's some output of "ceph -w". Funny enough it reports 770G of free
disk space space although the writing process terminated with ENOSPC.

2011-11-15 12:12:45.388535    pg v38805: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
2011-11-15 12:12:45.589228   mds e4: 1/1/1 up {0=0=up:active}
2011-11-15 12:12:45.589326   osd e11: 8 osds: 8 up, 8 in full
2011-11-15 12:12:45.589908   log 2011-11-15 12:12:19.599894 osd.326 192.168.3.26:6800/1673 168 : [INF] 0.593 scrub ok
2011-11-15 12:12:45.590000   mon e1: 1 mons at {0=192.168.3.34:6789/0}
2011-11-15 12:12:49.554163    pg v38806: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
2011-11-15 12:12:54.526661    pg v38807: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
2011-11-15 12:12:56.309292    pg v38808: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-15 16:42 ` Andre Noll
@ 2011-11-15 19:53   ` Gregory Farnum
  2011-11-16  9:56     ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Gregory Farnum @ 2011-11-15 19:53 UTC (permalink / raw)
  To: Andre Noll; +Cc: Sage Weil, ceph-devel

On Tue, Nov 15, 2011 at 8:42 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Thu, Nov 10, 21:14, Sage Weil wrote:
>>  * osd: some peering refactoring
>>  * osd: 'replay' period is per-pool (now only affects fs data pool)
>>  * osd: clean up old osdmaps
>>  * osd: allow admin to revert lost objects to prior versions (or delete)
>>  * mkcephfs: generate reasonable crush map based on 'host' and 'rack'
>>    fields in [osd.NN] sections of ceph.conf
>>  * radosgw: bucket index improvements
>>  * radosgw: improved swift support
>>  * rbd: misc command line tool fixes
>>  * debian: misc packaging fixes (including dependency breakage on upgrades)
>>  * ceph: query daemon perfcounters via command line tool
>>
>> The big upcoming items for v0.39 are RBD layering (image cloning), further
>> improvements to radosgw's Swift support, and some monitor failure recovery
>> and bootstrapping improvements.  We're also continuing work on the
>> automation bits that the Chef cookbooks and Juju charms will use, and a
>> Crowbar barclamp was also just posted on github.  Several patches are
>> still working their way into libvirt and qemu to improve support for RBD
>> authentication.
>
> Any plans to address the ENOSPC issue? I gave v0.38 a try and the
> file system behaves like the older (<= 0.36) versions I've tried
> before when it fills up: The ceph mounts hang on all clients.

This is something we hope to address in the future, but we haven't
come up with a good solution yet. (I haven't seen a good solution in
other distributed systems either...)

> But there is progress: Sync is now interuptable (it used to block
> in D state so that it could not be killed even with SIGKILL), and
> umount works even if the file system is full. However, subsequent
> mount attempts then fail with "mount error 5 = Input/output error".
Yay!

> Our test setup consists of one mds, one monitor and 8 osds. mds and
> monitor are on the same node, and this node is not not an osd. All
> nodes are running Linux-3.0.9 ATM, but I would be willing to upgrade
> to 3.1.1 if this is expected to make a difference.
>
> Here's some output of "ceph -w". Funny enough it reports 770G of free
> disk space space although the writing process terminated with ENOSPC.
Right now RADOS (the object store under the Ceph FS) is pretty
conservative about reporting ENOSPC. Since btrfs is also pretty
unhappy when its disk fills up, an OSD marks itself as "full" once
it's reached 95% of its capacity, and once a single OSD goes full then
RADOS marks itself that way so you don't overfill a disk and have
really bad things happen. (Hung mounts suck but are a lot better than
mysterious data loss.)

Looking at your ceph -s I'm surprised by a few things, though...
1) Why do you have so many PGs? 8k/OSD is rather a lot
2) I wouldn't expect your OSDs to have become so unbalanced that one
of them hits 95% full when the cluster's only at 84% capacity.

What is this cluster used for? Are you running anything besides the
Ceph FS on it? (radosgw, maybe?)
-Greg

> 2011-11-15 12:12:45.388535    pg v38805: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
> 2011-11-15 12:12:45.589228   mds e4: 1/1/1 up {0=0=up:active}
> 2011-11-15 12:12:45.589326   osd e11: 8 osds: 8 up, 8 in full
> 2011-11-15 12:12:45.589908   log 2011-11-15 12:12:19.599894 osd.326 192.168.3.26:6800/1673 168 : [INF] 0.593 scrub ok
> 2011-11-15 12:12:45.590000   mon e1: 1 mons at {0=192.168.3.34:6789/0}
> 2011-11-15 12:12:49.554163    pg v38806: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
> 2011-11-15 12:12:54.526661    pg v38807: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
> 2011-11-15 12:12:56.309292    pg v38808: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-15 19:53   ` Gregory Farnum
@ 2011-11-16  9:56     ` Andre Noll
  2011-11-16 18:04       ` Tommi Virtanen
  0 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-16  9:56 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 5030 bytes --]

On Tue, Nov 15, 11:53, Gregory Farnum wrote:

> > Any plans to address the ENOSPC issue? I gave v0.38 a try and the
> > file system behaves like the older (<= 0.36) versions I've tried
> > before when it fills up: The ceph mounts hang on all clients.
> 
> This is something we hope to address in the future, but we haven't
> come up with a good solution yet. (I haven't seen a good solution in
> other distributed systems either...)

Glad to hear the problem is known and will be addressed. We'd love to
use ceph as a global tmp file system on our cluster, so users *will*
fill it up..

> > But there is progress: Sync is now interuptable (it used to block
> > in D state so that it could not be killed even with SIGKILL), and
> > umount works even if the file system is full. However, subsequent
> > mount attempts then fail with "mount error 5 = Input/output error".
> Yay!
> 
> > Our test setup consists of one mds, one monitor and 8 osds. mds and
> > monitor are on the same node, and this node is not not an osd. All
> > nodes are running Linux-3.0.9 ATM, but I would be willing to upgrade
> > to 3.1.1 if this is expected to make a difference.
> >
> > Here's some output of "ceph -w". Funny enough it reports 770G of free
> > disk space space although the writing process terminated with ENOSPC.
> Right now RADOS (the object store under the Ceph FS) is pretty
> conservative about reporting ENOSPC. Since btrfs is also pretty
> unhappy when its disk fills up, an OSD marks itself as "full" once
> it's reached 95% of its capacity, and once a single OSD goes full then
> RADOS marks itself that way so you don't overfill a disk and have
> really bad things happen. (Hung mounts suck but are a lot better than
> mysterious data loss.)

Six of the eight underlying btrfs for ceph are 500G large, the other
two are 800G. Used disk space varies between 459G and 476G. The peak
476G is on a 500G fs, so this one is 98% full.

The data was written by a single client using stress, which simply
created 5G files in an endless loop. All these files are in the top
level directory.

> Looking at your ceph -s I'm surprised by a few things, though...
> 1) Why do you have so many PGs? 8k/OSD is rather a lot

I can't answer this question, but please have a look at the ceph
config file below. Maybe you can spot something odd in it.

> 2) I wouldn't expect your OSDs to have become so unbalanced that one
> of them hits 95% full when the cluster's only at 84% capacity.

This seems to be due to the fact that roughly the same amount of data
was written to each file system despite of the different file system
sizes. Hence only 60% disk space is used on the two 800G file systems.

> What is this cluster used for? Are you running anything besides the
> Ceph FS on it? (radosgw, maybe?)

Besides the ceph daemons only sshd and sge_execd (for executing
cluster jobs) is running there. Job submission was disabled on these
nodes during the tests, so all systems were completely idle.

Thanks for your help
Andre
---
[global]
	; enable secure authentication
	;auth supported = cephx
	;osd journal size = 100    ; measured in MB 

[client] ; userspace client
	debug ms = 1
	debug client = 10

; You need at least one monitor. You need at least three if you want to
; tolerate any node failures. Always create an odd number.
[mon]
	mon data = /var/ceph/mon$id
	; some minimal logging (just message traffic) to aid debugging
	; debug ms = 1
	; debug auth = 20 ;authentication code

[mon.0]
	host = node334
	mon addr = 192.168.3.34:6789

; You need at least one mds. Define two to get a standby.
[mds]
	; where the mds keeps it's secret encryption keys
	keyring = /var/ceph/keyring.$name
	; debug mds = 20
[mds.0]
	host = node334

; osd
;  You need at least one.  Two if you want data to be replicated.
;  Define as many as you like.
[osd]
	; This is where the btrfs volume will be mounted.
	osd data = /var/ceph/osd$id

	keyring = /etc/ceph/keyring.$name

	; Ideally, make this a separate disk or partition.  A few GB
 	; is usually enough; more if you have fast disks.  You can use
 	; a file under the osd data dir if need be
 	; (e.g. /data/osd$id/journal), but it will be slower than a
 	; separate disk or partition.
	osd journal = /var/ceph/osd$id/journal
	; If the OSD journal is a file, you need to specify the size. This is specified in MB.
        osd journal size = 512

[osd.325]
	host = node325
	btrfs devs = /dev/ceph/data
[osd.326]
	host = node326
	btrfs devs = /dev/ceph/data
[osd.327]
	host = node327
	btrfs devs = /dev/ceph/data
[osd.328]
	host = node328
	btrfs devs = /dev/ceph/data
[osd.329]
	host = node329
	btrfs devs = /dev/ceph/data
[osd.330]
	host = node330
	btrfs devs = /dev/ceph/data
[osd.331]
	host = node331
	btrfs devs = /dev/ceph/data
[osd.333]
	host = node333
	btrfs devs = /dev/ceph/data
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-16  9:56     ` Andre Noll
@ 2011-11-16 18:04       ` Tommi Virtanen
  2011-11-17 10:35         ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Tommi Virtanen @ 2011-11-16 18:04 UTC (permalink / raw)
  To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Wed, Nov 16, 2011 at 01:56, Andre Noll <maan@systemlinux.org> wrote:
>> 2) I wouldn't expect your OSDs to have become so unbalanced that one
>> of them hits 95% full when the cluster's only at 84% capacity.
>
> This seems to be due to the fact that roughly the same amount of data
> was written to each file system despite of the different file system
> sizes. Hence only 60% disk space is used on the two 800G file systems.

That would be it. You probably want to set the weights of your OSDs
according to their storage capacity, otherwise the smaller ones will
get filled first.

http://ceph.newdream.net/wiki/Monitor_commands#reweight

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-16 18:04       ` Tommi Virtanen
@ 2011-11-17 10:35         ` Andre Noll
  2011-11-17 18:01           ` Tommi Virtanen
  0 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-17 10:35 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 1141 bytes --]

On Wed, Nov 16, 10:04, Tommi Virtanen wrote:
> On Wed, Nov 16, 2011 at 01:56, Andre Noll <maan@systemlinux.org> wrote:
> >> 2) I wouldn't expect your OSDs to have become so unbalanced that one
> >> of them hits 95% full when the cluster's only at 84% capacity.
> >
> > This seems to be due to the fact that roughly the same amount of data
> > was written to each file system despite of the different file system
> > sizes. Hence only 60% disk space is used on the two 800G file systems.
> 
> That would be it. You probably want to set the weights of your OSDs
> according to their storage capacity, otherwise the smaller ones will
> get filled first.
> 
> http://ceph.newdream.net/wiki/Monitor_commands#reweight

I was under the impression that equal weights on all osds means to
fill up all file systems by the same percentage, i.e. that file system
sizes are already taken care of.

But apparently this is not the case.  So one has to set the weights
manually according to the available disk space.

Thanks for enlightening me.
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-17 10:35         ` Andre Noll
@ 2011-11-17 18:01           ` Tommi Virtanen
  2011-11-18 15:01             ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Tommi Virtanen @ 2011-11-17 18:01 UTC (permalink / raw)
  To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Thu, Nov 17, 2011 at 02:35, Andre Noll <maan@systemlinux.org> wrote:
> I was under the impression that equal weights on all osds means to
> fill up all file systems by the same percentage, i.e. that file system
> sizes are already taken care of.
>
> But apparently this is not the case.  So one has to set the weights
> manually according to the available disk space.

The weight is actually a combination of all the factors that would go
in: storage size, disk IO speed, network link bandwidth, heat in that
part of the data center, future expansion plans, .. We could automate
more of it, but it really is a fundamentally holistic number, and
setting it based on just one aspect of reality will lead to someone
else being unhappy. So it goes something like this:

Step 1: improve documentation
Step 2: have a monitoring system be able to feed back information to
use as osd weights, with admin customazability
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-17 18:01           ` Tommi Virtanen
@ 2011-11-18 15:01             ` Andre Noll
  2011-11-18 18:47               ` Tommi Virtanen
  0 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-18 15:01 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 1851 bytes --]

On Thu, Nov 17, 10:01, Tommi Virtanen wrote:
> On Thu, Nov 17, 2011 at 02:35, Andre Noll <maan@systemlinux.org> wrote:
> > I was under the impression that equal weights on all osds means to
> > fill up all file systems by the same percentage, i.e. that file system
> > sizes are already taken care of.
> >
> > But apparently this is not the case.  So one has to set the weights
> > manually according to the available disk space.
> 
> The weight is actually a combination of all the factors that would go
> in: storage size, disk IO speed, network link bandwidth, heat in that
> part of the data center, future expansion plans, ..

True. But as we all know, perfect is the enemy of good ;)

> We could automate more of it, but it really is a fundamentally holistic
> number, and setting it based on just one aspect of reality will lead to
> someone else being unhappy. So it goes something like this:
> 
> Step 1: improve documentation

For starters, it would be nice to include the ceph osd subcommands
in the man pages. To my knowledge they are only documented on the
(old) wiki

	http://ceph.newdream.net/wiki/Monitor_commands

at the moment. Would a patch that adds the subcommands and descriptions
to the man pages be accepted?

If so, I'd be willing to do this work. However, the files in man/
of the ceph git repo seem to be generated by docutils, so I suspect
they are not meant to be edited directly. What's the preferred way
to patch the man pages?

> Step 2: have a monitoring system be able to feed back information to
> use as osd weights, with admin customazability

How could such a monitoring system be implemented? In particular if
abstract criteria like "future extension plans" have to be considered.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-18 15:01             ` Andre Noll
@ 2011-11-18 18:47               ` Tommi Virtanen
  2011-11-21 17:32                 ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Tommi Virtanen @ 2011-11-18 18:47 UTC (permalink / raw)
  To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Fri, Nov 18, 2011 at 07:01, Andre Noll <maan@systemlinux.org> wrote:
> For starters, it would be nice to include the ceph osd subcommands
> in the man pages. To my knowledge they are only documented on the
> (old) wiki
>
>        http://ceph.newdream.net/wiki/Monitor_commands
>
> at the moment. Would a patch that adds the subcommands and descriptions
> to the man pages be accepted?

I'm not sure if the man page are the best for that; there's a lot of
subcommands, and man forces it into a big list of things. I'd
personally go for putting a reference under
http://ceph.newdream.net/docs/latest/ops/ and using the structure for
separating osd/mon/mds etc into slightly more manageable chunks.

> If so, I'd be willing to do this work. However, the files in man/
> of the ceph git repo seem to be generated by docutils, so I suspect
> they are not meant to be edited directly. What's the preferred way
> to patch the man pages?

The content comes from doc/man/ and is built with ./admin/build-doc

That puts the whole html into build-doc/output/html/ and the *roff in
build-doc/output/man/ and from there it is migrated to man/ "by need"
(there's too much noise in the changes to keep doing that all the
time, and there's too many toolchain dependencies to generate docs on
every build).

>> Step 2: have a monitoring system be able to feed back information to
>> use as osd weights, with admin customazability
> How could such a monitoring system be implemented? In particular if
> abstract criteria like "future extension plans" have to be considered.

Going back to my initial list: storage size, disk IO speed, network
link bandwidth, heat in that
part of the data center, future expansion plans, ..

That divides into 3 groups:
- things that are more about the capability of the hardware (= change
very seldomly)
- things that are monitored outside of ceph
- plans

Hence, it seems to me that a sysadmin would do something like look at
the node data gathered by something like Ohai/Chef, combine that with
collectd/munin-style monitoring of the data center, optionally do
something like "increase weights of rack 7 by 40%", and then spit out
a mapping of osd id -> weight.

Our chef cookbooks will probably provide a skeleton for that in the
future, but that's not a short term need; most installations will
probably set the weights once when the hardware is new, and I'd expect
practically all clusters <6 months old to have fairly homogenous
hardware, and thus identical weights.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-18 18:47               ` Tommi Virtanen
@ 2011-11-21 17:32                 ` Andre Noll
  2011-11-21 17:36                   ` Tommi Virtanen
  0 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-21 17:32 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 3843 bytes --]

On Fri, Nov 18, 10:47, Tommi Virtanen wrote:
> On Fri, Nov 18, 2011 at 07:01, Andre Noll <maan@systemlinux.org> wrote:
> > For starters, it would be nice to include the ceph osd subcommands
> > in the man pages. To my knowledge they are only documented on the
> > (old) wiki
> >
> >        http://ceph.newdream.net/wiki/Monitor_commands
> >
> > at the moment. Would a patch that adds the subcommands and descriptions
> > to the man pages be accepted?
> 
> I'm not sure if the man page are the best for that; there's a lot of
> subcommands, and man forces it into a big list of things. I'd
> personally go for putting a reference under
> http://ceph.newdream.net/docs/latest/ops/ and using the structure for
> separating osd/mon/mds etc into slightly more manageable chunks.

I believe that code and documentation should be located as close as
possible, and I'd also prefer to edit and access the documentation
locally via command line tools rather than through a browser. But
I don't have a strong opinion on this, so let's go for the web
documentation.

Should I prepare something and post a request for inclusion to the
web pages on this mailing list, or do you want me to edit the web
documentation directly?

> > If so, I'd be willing to do this work. However, the files in man/
> > of the ceph git repo seem to be generated by docutils, so I suspect
> > they are not meant to be edited directly. What's the preferred way
> > to patch the man pages?
> 
> The content comes from doc/man/ and is built with ./admin/build-doc
> 
> That puts the whole html into build-doc/output/html/ and the *roff in
> build-doc/output/man/ and from there it is migrated to man/ "by need"
> (there's too much noise in the changes to keep doing that all the
> time, and there's too many toolchain dependencies to generate docs on
> every build).

I see, thanks for explaining. The ./admin/build-doc command worked
for me out of the box on an Ubuntu lucid system btw.

> >> Step 2: have a monitoring system be able to feed back information to
> >> use as osd weights, with admin customazability
> > How could such a monitoring system be implemented? In particular if
> > abstract criteria like "future extension plans" have to be considered.
> 
> Going back to my initial list: storage size, disk IO speed, network
> link bandwidth, heat in that
> part of the data center, future expansion plans, ..
> 
> That divides into 3 groups:
> - things that are more about the capability of the hardware (= change
> very seldomly)
> - things that are monitored outside of ceph
> - plans
> 
> Hence, it seems to me that a sysadmin would do something like look at
> the node data gathered by something like Ohai/Chef, combine that with
> collectd/munin-style monitoring of the data center, optionally do
> something like "increase weights of rack 7 by 40%", and then spit out
> a mapping of osd id -> weight.

OK, got the idea. However, in this example the difficult thing is
the decision "increase weights of rack 7 by 40%", which is made by a
human. Recomputing the osd weights accordingly should be fairly simple.

> Our chef cookbooks will probably provide a skeleton for that in the
> future, but that's not a short term need; most installations will
> probably set the weights once when the hardware is new, and I'd expect
> practically all clusters <6 months old to have fairly homogenous
> hardware, and thus identical weights.

Are you implying that ceph is only suitable for new clusters with
homogeneous hardware? I'm asking because our cluster is far from
homogeneous. There are 8 year old 2-core nodes with small SCSI disks
as well as 64-core boxes with much larger SATA disks.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-21 17:32                 ` Andre Noll
@ 2011-11-21 17:36                   ` Tommi Virtanen
  2011-11-21 18:06                     ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Tommi Virtanen @ 2011-11-21 17:36 UTC (permalink / raw)
  To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Mon, Nov 21, 2011 at 09:32, Andre Noll <maan@systemlinux.org> wrote:
> I believe that code and documentation should be located as close as
> possible, and I'd also prefer to edit and access the documentation
> locally via command line tools rather than through a browser. But
> I don't have a strong opinion on this, so let's go for the web
> documentation.

I agree, and that's a big part of the reasons for choosing the
toolchain I did! All the docs from http://ceph.newdream.net/docs are
in the doc/ directory of the source tree.

> Should I prepare something and post a request for inclusion to the
> web pages on this mailing list, or do you want me to edit the web
> documentation directly?

Submit it like you would submit a code change.

> Are you implying that ceph is only suitable for new clusters with
> homogeneous hardware? I'm asking because our cluster is far from
> homogeneous. There are 8 year old 2-core nodes with small SCSI disks
> as well as 64-core boxes with much larger SATA disks.

8 year old? Wow.

We do intend to fully support clusters with nodes of different
capacity and speed (as I strongly believe most clusters will go
through such a phase in their life). It's just not the default
configuration, and won't be needed by most setups in the beginning.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: v0.38 released
  2011-11-21 17:36                   ` Tommi Virtanen
@ 2011-11-21 18:06                     ` Andre Noll
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-21 18:06 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 954 bytes --]

On Mon, Nov 21, 09:36, Tommi Virtanen wrote
> I agree, and that's a big part of the reasons for choosing the
> toolchain I did! All the docs from http://ceph.newdream.net/docs are
> in the doc/ directory of the source tree.
> 
> > Should I prepare something and post a request for inclusion to the
> > web pages on this mailing list, or do you want me to edit the web
> > documentation directly?
> 
> Submit it like you would submit a code change.

OK, will do so. I'll start with the list of OSD subcommands from
the old wiki and try to improve on this. As soon I have something to
present I'll send an RFC-style patch series to the list. This will
likely contain questions on certain subcommands, and I'll include
the relevant parts of any replies in subsequent versions of the
patch series.

Thanks
Andre
-- 
Max Planck Institute for Developmental Biology
Spemannstrasse 35, 72076 Tübingen, Germany
Phone: (+049) 7071 601 829

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH/RFC 0/6]: Introduction
  2011-11-21 18:06                     ` Andre Noll
@ 2011-11-28 14:04                       ` Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll
                                           ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

Here is what I have so far. This patch set imports the documentation
of the OSD subcommands from the wiki to the preciously emtpy file
doc/ops/monitor.rst of the git repo. The first patch is just the
result of a cut & paste operation of the corresponding wiki page
while the other patches try to improve on this. The aim is to have
a complete and up to date documentation for all osd subcommands.

I don't believe the series is ready for inclusion yet as some
subcommands (cluster_snap, lost, in, out, ...) still lack useful
descriptions. It would be nice to add one sentence to each such command
that explains its purpose and the circumstances under which one might
want to use this particular command.

So please review and comment. I will fold in your suggestions and
follow up with a re-rolled series provided there is substantial
feedback.

Thanks
Andre


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki.
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
@ 2011-11-28 14:04                         ` Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 2/6] doc: Add documentation of missing osd commands Andre Noll
                                           ` (5 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll

This adds the content of the wiki page at

	http://ceph.newdream.net/wiki/Monitor_commands

to doc/ops/monitor.rst in order to make it available at the new
official location for the ceph documentation. This first patch is
just the result of a cut-and-paste operation. There are no changes
in content, but the text was converted to rst format.

Signed-Off-By: Andre Noll <maan@systemlinux.org>
---
 doc/ops/monitor.rst |  178 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 177 insertions(+), 1 deletions(-)

diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
index 98c75c3..626685e 100644
--- a/doc/ops/monitor.rst
+++ b/doc/ops/monitor.rst
@@ -4,4 +4,180 @@
  Monitoring Ceph
 =================
 
-.. todo:: write me
+Monitor commands
+----------------
+
+Monitor commands are issued using the ceph utility (in versions before
+Dec08 it was called cmonctl)::
+
+	$ ceph [-m monhost] command
+
+where the command is usually of the form::
+
+	$ ceph subsystem command
+
+System commands
+---------------
+
+::
+
+	$ ceph stop
+
+Cleanly shuts down the cluster.  ::
+
+	$ ceph -s
+
+Shows an overview of the current status of the cluster.  ::
+
+	$ ceph -w
+
+Shows a running summary of the status of the cluster, and major events.
+
+AUTH subsystem
+--------------
+::
+
+	$ ceph auth add <osd> <--in-file|-i> <path-to-osd-keyring>
+
+Add auth keyring for an osd.  ::
+
+	$ ceph auth list
+
+Show auth key OSD subsystem.
+
+OSD subsystem
+-------------
+::
+
+	$ ceph osd stat
+
+Query osd subsystem status. ::
+
+	$ ceph osd getmap -o file
+
+Write a copy of the most recent osd map to a file. See osdmaptool. ::
+
+	$ ceph osd getcrushmap -o file
+
+Write a copy of the crush map from the most recent osd map to
+file. This is functionally equivalent to ::
+
+	$ ceph osd getmap -o /tmp/osdmap
+	$ osdmaptool /tmp/osdmap --export-crush file
+
+::
+
+	$ ceph osd getmaxosd
+
+Query the current max_osd parameter in the osd map. ::
+
+	$ ceph osd setmap -i file
+
+Import the given osd map. Note that this can be a bit dangerous,
+since the osd map includes dynamic state about which OSDs are current
+on or offline; only do this if you've just modified a (very) recent
+copy of the map. ::
+
+	$ ceph osd setcrushmap -i file
+
+Import the given crush map. ::
+
+	$ ceph osd setmaxosd
+
+Set the max_osd parameter in the osd map. This is necessary when
+expanding the storage cluster. ::
+
+	$ ceph osd down N
+
+Mark osdN down. ::
+
+	$ ceph osd out N
+
+Mark osdN out of the distribution (i.e. allocated no data). ::
+
+	$ ceph osd in N
+
+Mark osdN in the distribution (i.e. allocated data). ::
+
+	$ ceph class list
+
+List classes that are loaded in the ceph cluster. ::
+
+	$ ceph osd pause
+	$ ceph osd unpause
+
+TODO ::
+
+	$ ceph osd reweight N W
+
+Sets the weight of osdN to W. ::
+
+	$ ceph osd reweight-by-utilization [threshold]
+
+Reweights all the OSDs by reducing the weight of OSDs which are
+heavily overused. By default it will adjust the weights downward on
+OSDs which have 120% of the average utilization, but if you include
+threshold it will use that percentage instead. ::
+
+	$ ceph osd blacklist add ADDRESS[:source_port] [TIME]
+	$ ceph osd blacklist rm ADDRESS[:source_port]
+
+Adds/removes the address to/from the blacklist. When adding an address,
+you can specify how long it should be blacklisted in seconds; otherwise
+it will default to 1 hour. A blacklisted address is prevented from
+connecting to any osd. Blacklisting is most often used to prevent a
+laggy mds making bad changes to data on the osds.
+
+These commands are mostly only useful for failure testing, as
+blacklists are normally maintained automatically and shouldn't need
+manual intervention. ::
+
+	$ ceph osd pool mksnap POOL SNAPNAME
+	$ ceph osd pool rmsnap POOL SNAPNAME
+
+Creates/deletes a snapshot of a pool. ::
+
+	$ ceph osd pool create POOL
+	$ ceph osd pool delete POOL
+
+Creates/deletes a storage pool. ::
+
+	$ ceph osd pool set POOL FIELD VALUE
+
+Changes a pool setting. Valid fields are:
+
+	* ``size``: Sets the number of copies of data in the pool.
+	* ``pg_num``: TODO
+	* ``pgp_num``: TODO
+
+::
+
+	$ ceph osd scrub N
+
+Sends a scrub command to osdN. To send the command to all osds, use ``*``.
+TODO: what does this actually do ::
+
+	$ ceph osd repair N
+
+Sends a repair command to osdN. To send the command to all osds, use ``*``.
+TODO: what does this actually do
+
+MDS subsystem
+-------------
+
+Change configuration parameters on a running mds. ::
+
+	$ ceph mds tell <mds-id> injectargs '--<switch> <value> [--<switch> <value>]'
+
+Example::
+
+	$ ceph mds tell 0 injectargs '--debug_ms 1 --debug_mds 10'
+
+Enables debug messages. ::
+
+	$ ceph mds stat
+
+Displays the status of all metadata servers.
+
+dump, getmap, stop, set_max_mds, setmap: TODO
+
-- 
1.7.8.rc1.14.g248db


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH/RFC 2/6] doc: Add documentation of missing osd commands.
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll
@ 2011-11-28 14:04                         ` Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 3/6] doc: Document pause and unpause " Andre Noll
                                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll

The set of OSD commands which added by the previous commit is
incomplete. This patch adds documentation for the following
OSD commands which were previously missing: dump, tree, crush,
cluster_snap, lost, create, rm.

Signed-Off-By: Andre Noll <maan@systemlinux.org>
---
 doc/ops/monitor.rst |   44 +++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
index 626685e..07d9c4f 100644
--- a/doc/ops/monitor.rst
+++ b/doc/ops/monitor.rst
@@ -64,9 +64,48 @@ file. This is functionally equivalent to ::
 
 	$ ceph osd getmap -o /tmp/osdmap
 	$ osdmaptool /tmp/osdmap --export-crush file
-
 ::
 
+	$ ceph osd dump [--format format>]
+
+Dump the osd map. Valid formats for -f are "plain" and "json". If no
+--format option is given, the osd map is dumped as plain text. ::
+
+	$ ceph osd tree [--format format]
+
+Dump the osd map as a tree with one line per osd containing weight
+and state. ::
+
+	$ ceph osd crush add <id> <name> <weight> [<loc1> [<loc2> ...]]
+
+Add a new item with the given id/name/weight at the specified
+location. ::
+
+	$ ceph osd crush remove <id>
+
+Remove an existing item from the crush map. ::
+
+	$ ceph osd crush reweight <name> <weight>
+
+Set the weight of the item given by ``<name>`` to ``<weight>``. ::
+
+	$ ceph osd cluster_snap <name>
+
+Create a cluster snapshot. ::
+
+	$ ceph osd lost [--yes-i-really-mean-it]
+
+Mark an OSD as lost. This may result in permanent data loss. Use with caution. ::
+
+	$ ceph osd create [<id>]
+
+Create a new OSD. If no ID is given, a new ID is automatically selected
+if possible. ::
+
+	$ ceph osd rm [<id>...]
+
+Remove the given OSD(s). ::
+
 	$ ceph osd getmaxosd
 
 Query the current max_osd parameter in the osd map. ::
@@ -179,5 +218,4 @@ Enables debug messages. ::
 
 Displays the status of all metadata servers.
 
-dump, getmap, stop, set_max_mds, setmap: TODO
-
+set_max_mds: TODO
-- 
1.7.8.rc1.14.g248db


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH/RFC 3/6] doc: Document pause and unpause osd commands.
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 2/6] doc: Add documentation of missing osd commands Andre Noll
@ 2011-11-28 14:04                         ` Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command Andre Noll
                                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll

These two commands were undocumented so far. This patch adds a short
description.

Signed-Off-By: Andre Noll <maan@systemlinux.org>
---
 doc/ops/monitor.rst |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
index 07d9c4f..e7314ae 100644
--- a/doc/ops/monitor.rst
+++ b/doc/ops/monitor.rst
@@ -145,7 +145,9 @@ List classes that are loaded in the ceph cluster. ::
 	$ ceph osd pause
 	$ ceph osd unpause
 
-TODO ::
+Set or clear the pause flags in the OSD map. If set, no IO requests
+will be sent to any OSD. Clearing the flags via unpause results in
+resending pending requests. ::
 
 	$ ceph osd reweight N W
 
-- 
1.7.8.rc1.14.g248db


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command.
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
                                           ` (2 preceding siblings ...)
  2011-11-28 14:04                         ` [PATCH/RFC 3/6] doc: Document pause and unpause " Andre Noll
@ 2011-11-28 14:04                         ` Andre Noll
  2011-11-28 14:04                         ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll
                                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll

This list was lacking a few fields: crash_replay_interval, pg_num,
pgp_num and crush_ruleset. Include these fields and add add short
descriptions.

Signed-Off-By: Andre Noll <maan@systemlinux.org>
---
 doc/ops/monitor.rst |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
index e7314ae..4de3c19 100644
--- a/doc/ops/monitor.rst
+++ b/doc/ops/monitor.rst
@@ -64,6 +64,7 @@ file. This is functionally equivalent to ::
 
 	$ ceph osd getmap -o /tmp/osdmap
 	$ osdmaptool /tmp/osdmap --export-crush file
+
 ::
 
 	$ ceph osd dump [--format format>]
@@ -188,8 +189,11 @@ Creates/deletes a storage pool. ::
 Changes a pool setting. Valid fields are:
 
 	* ``size``: Sets the number of copies of data in the pool.
-	* ``pg_num``: TODO
-	* ``pgp_num``: TODO
+	* ``crash_replay_interval``: The number of seconds to allow
+	  clients to replay acknowledged but uncommited requests.
+	* ``pg_num``: The placement group number.
+	* ``pgp_num``: Effective number when calculating pg placement.
+	* ``crush_ruleset``: rule number for mapping placement.
 
 ::
 
-- 
1.7.8.rc1.14.g248db


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get.
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
                                           ` (3 preceding siblings ...)
  2011-11-28 14:04                         ` [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command Andre Noll
@ 2011-11-28 14:04                         ` Andre Noll
  2011-11-28 18:37                           ` Gregory Farnum
  2011-11-28 14:04                         ` [PATCH/RFC 6/6] doc: Clarify documentation of reweight command Andre Noll
  2011-12-05 21:09                         ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen
  6 siblings, 1 reply; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll

"osd pool set" was already documented, but the corresponding "get"
command was not. This patch adds the list of valid fields for this
command, together with short descriptions.

Signed-Off-By: Andre Noll <maan@systemlinux.org>
---
 doc/ops/monitor.rst |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
index 4de3c19..076c8e1 100644
--- a/doc/ops/monitor.rst
+++ b/doc/ops/monitor.rst
@@ -197,6 +197,17 @@ Changes a pool setting. Valid fields are:
 
 ::
 
+	$ ceph osd pool get POOL FIELD
+
+Get the value of a pool setting. Valid fields are:
+
+	* ``pg_num``: See above.
+	* ``pgp_num``: See above.
+	* ``lpg_num``: The localized pg number.
+	* ``lpgp_num``: The number of localized pgs.
+
+::
+
 	$ ceph osd scrub N
 
 Sends a scrub command to osdN. To send the command to all osds, use ``*``.
-- 
1.7.8.rc1.14.g248db


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH/RFC 6/6] doc: Clarify documentation of reweight command.
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
                                           ` (4 preceding siblings ...)
  2011-11-28 14:04                         ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll
@ 2011-11-28 14:04                         ` Andre Noll
  2011-12-05 21:09                         ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen
  6 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll

This has led to some discussions on the mailing list, so let's try
to be clear about the meaning of an OSD weight.
---
 doc/ops/monitor.rst |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
index 076c8e1..5fcb0aa 100644
--- a/doc/ops/monitor.rst
+++ b/doc/ops/monitor.rst
@@ -152,7 +152,9 @@ resending pending requests. ::
 
 	$ ceph osd reweight N W
 
-Sets the weight of osdN to W. ::
+Set the weight of osdN to W. Two OSDs with the same weight will receive
+roughly the same number of I/O requests and store approximately the
+same amount of data. ::
 
 	$ ceph osd reweight-by-utilization [threshold]
 
-- 
1.7.8.rc1.14.g248db


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get.
  2011-11-28 14:04                         ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll
@ 2011-11-28 18:37                           ` Gregory Farnum
  2011-11-28 19:27                             ` Andre Noll
  0 siblings, 1 reply; 23+ messages in thread
From: Gregory Farnum @ 2011-11-28 18:37 UTC (permalink / raw)
  To: Andre Noll; +Cc: ceph-devel

On Mon, Nov 28, 2011 at 6:04 AM, Andre Noll <maan@systemlinux.org> wrote:
> "osd pool set" was already documented, but the corresponding "get"
> command was not. This patch adds the list of valid fields for this
> command, together with short descriptions.
>
> Signed-Off-By: Andre Noll <maan@systemlinux.org>
> ---
>  doc/ops/monitor.rst |   11 +++++++++++
>  1 files changed, 11 insertions(+), 0 deletions(-)
>
> diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst
> index 4de3c19..076c8e1 100644
> --- a/doc/ops/monitor.rst
> +++ b/doc/ops/monitor.rst
> @@ -197,6 +197,17 @@ Changes a pool setting. Valid fields are:
>
>  ::
>
> +       $ ceph osd pool get POOL FIELD
> +
> +Get the value of a pool setting. Valid fields are:
> +
> +       * ``pg_num``: See above.
> +       * ``pgp_num``: See above.
> +       * ``lpg_num``: The localized pg number.
> +       * ``lpgp_num``: The number of localized pgs.
The lpg_num and lpgp_num are analogous to the pg_num and the pgp_num —
the lpg_num is the number of local PGs, and the lpgp_num is the number
used for placing them. This matters less for the local PGs than the
regular PGs but it can still control where the replicas are placed.
-Greg

>        $ ceph osd scrub N
>
>  Sends a scrub command to osdN. To send the command to all osds, use ``*``.
> --
> 1.7.8.rc1.14.g248db
>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get.
  2011-11-28 18:37                           ` Gregory Farnum
@ 2011-11-28 19:27                             ` Andre Noll
  0 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-11-28 19:27 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 698 bytes --]

On Mon, Nov 28, 10:37, Gregory Farnum wrote:
> > +       * ``pg_num``: See above.
> > +       * ``pgp_num``: See above.
> > +       * ``lpg_num``: The localized pg number.
> > +       * ``lpgp_num``: The number of localized pgs.
> The lpg_num and lpgp_num are analogous to the pg_num and the pgp_num —
> the lpg_num is the number of local PGs, and the lpgp_num is the number
> used for placing them. This matters less for the local PGs than the
> regular PGs but it can still control where the replicas are placed.

Thanks for the clarification. I will update the patch accordingly.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH/RFC 0/6]: Introduction
  2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
                                           ` (5 preceding siblings ...)
  2011-11-28 14:04                         ` [PATCH/RFC 6/6] doc: Clarify documentation of reweight command Andre Noll
@ 2011-12-05 21:09                         ` Tommi Virtanen
  2011-12-06 17:01                           ` Andre Noll
  6 siblings, 1 reply; 23+ messages in thread
From: Tommi Virtanen @ 2011-12-05 21:09 UTC (permalink / raw)
  To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Mon, Nov 28, 2011 at 06:04, Andre Noll <maan@systemlinux.org> wrote:
> Here is what I have so far. This patch set imports the documentation
> of the OSD subcommands from the wiki to the preciously emtpy file
> doc/ops/monitor.rst of the git repo. The first patch is just the
> result of a cut & paste operation of the corresponding wiki page
> while the other patches try to improve on this. The aim is to have
> a complete and up to date documentation for all osd subcommands.
>
> I don't believe the series is ready for inclusion yet as some
> subcommands (cluster_snap, lost, in, out, ...) still lack useful
> descriptions. It would be nice to add one sentence to each such command
> that explains its purpose and the circumstances under which one might
> want to use this particular command.
>
> So please review and comment. I will fold in your suggestions and
> follow up with a re-rolled series provided there is substantial
> feedback.

Good work! I want to roll this in the docs asap, even if it is still partial.

For that to happen, we need to do two things:

1. get you to add Signed-off-by lines as per SubmittingPatches
2. figure out where this documentation belongs

For ops/monitor is meant for "how do I reassure myself my service
works the right way". Think nagios, collectd, munin, etc. Apologies
for not having much content there yet..

The ops/ hierarchy as a whole is meant to be "user/goal oriented".
That is, I don't want to put in a section "ceph monitor commands".
Instead, we need to ask the question "what is the admin trying to do",
and that's the guiding principle for ops/.

A reference-style document that exhaustively lists all possible
actions should go into some other top-level section. Right now, the
closest parallels we have are config/ ("Configuration reference") and
api/, both meant to be comprehensive references.

Let's do this: make your patch put things in doc/control.rst, with the
title "Control commands", and have doc/index.rst toctree have, below
config, an entry for control. I can take it from there if we want to
reorganize the document more.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH/RFC 0/6]: Introduction
  2011-12-05 21:09                         ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen
@ 2011-12-06 17:01                           ` Andre Noll
  0 siblings, 0 replies; 23+ messages in thread
From: Andre Noll @ 2011-12-06 17:01 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

[-- Attachment #1: Type: text/plain, Size: 746 bytes --]

On Mon, Dec 05, 13:09, Tommi Virtanen wrote:
> > So please review and comment. I will fold in your suggestions and
> > follow up with a re-rolled series provided there is substantial
> > feedback.
> 
> Good work! I want to roll this in the docs asap, even if it is still partial.

[...]

> Let's do this: make your patch put things in doc/control.rst, with the
> title "Control commands", and have doc/index.rst toctree have, below
> config, an entry for control. I can take it from there if we want to
> reorganize the document more.

OK. I will update the patch series according to your comments and send
an updated version soon.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2011-12-06 16:59 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-11  5:14 v0.38 released Sage Weil
2011-11-15 16:42 ` Andre Noll
2011-11-15 19:53   ` Gregory Farnum
2011-11-16  9:56     ` Andre Noll
2011-11-16 18:04       ` Tommi Virtanen
2011-11-17 10:35         ` Andre Noll
2011-11-17 18:01           ` Tommi Virtanen
2011-11-18 15:01             ` Andre Noll
2011-11-18 18:47               ` Tommi Virtanen
2011-11-21 17:32                 ` Andre Noll
2011-11-21 17:36                   ` Tommi Virtanen
2011-11-21 18:06                     ` Andre Noll
2011-11-28 14:04                       ` [PATCH/RFC 0/6]: Introduction Andre Noll
2011-11-28 14:04                         ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll
2011-11-28 14:04                         ` [PATCH/RFC 2/6] doc: Add documentation of missing osd commands Andre Noll
2011-11-28 14:04                         ` [PATCH/RFC 3/6] doc: Document pause and unpause " Andre Noll
2011-11-28 14:04                         ` [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command Andre Noll
2011-11-28 14:04                         ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll
2011-11-28 18:37                           ` Gregory Farnum
2011-11-28 19:27                             ` Andre Noll
2011-11-28 14:04                         ` [PATCH/RFC 6/6] doc: Clarify documentation of reweight command Andre Noll
2011-12-05 21:09                         ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen
2011-12-06 17:01                           ` Andre Noll

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.