* defaults paths
@ 2012-04-05  6:32 Sage Weil
  2012-04-05  6:57 ` Bernard Grymonpon
  0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2012-04-05  6:32 UTC (permalink / raw)
  To: ceph-devel

We want to standardize the locations for ceph data directories, configs, 
etc.  We'd also like to allow a single host to run OSDs that participate 
in multiple ceph clusters.  We'd like easy-to-deal-with names (i.e., avoid 
UUIDs if we can).

The metavariables are:
 cluster = ceph (by default)
 type = osd, mon, mds
 id = 1, foo, 
 name = $type.$id = osd.0, mds.a, etc.

The $cluster variable will come from the command line (--cluster foo) or, 
in the case of a udev hotplug tool or something, matching the uuid on the 
device with the 'fsid = <uuid>' line in the available config files found 
in /etc/ceph.
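
A rough sketch of that matching step, assuming the device's fsid is 
already known (how it is read off the device is left out here, and the 
config parsing is simplistic):

  #!/bin/sh
  # find_cluster <fsid>: print the cluster whose config declares this fsid
  find_cluster() {
      fsid="$1"
      for conf in /etc/ceph/*.conf; do
          [ -e "$conf" ] || continue
          if grep -q "^[[:space:]]*fsid[[:space:]]*=[[:space:]]*$fsid" "$conf"; then
              basename "$conf" .conf    # /etc/ceph/foo.conf -> cluster "foo"
              return 0
          fi
      done
      return 1
  }

  # e.g. cluster=$(find_cluster "$DEVICE_FSID")   # DEVICE_FSID supplied by the hotplug tool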

The locations could be:

 ceph config file:
  /etc/ceph/$cluster.conf     (default is thus ceph.conf)

 keyring:
  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)

 osd_data, mon_data:
  /var/lib/ceph/$cluster.$name
  /var/lib/ceph/$cluster/$name
  /var/lib/ceph/data/$cluster.$name
  /var/lib/ceph/$type-data/$cluster-$id

TV and I talked about this today, and one thing we want is for items of a 
given type to live together in a separate directory so that we don't have to 
do any filtering to, say, get all osd data directories.  This suggests the 
last option (/var/lib/ceph/osd-data/ceph-1, 
/var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
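
The payoff is that "all osd data dirs on this host" becomes a plain 
glob; a quick sketch assuming the $cluster-$id naming above:

  #!/bin/sh
  # list every osd data directory on this host, across all clusters
  for dir in /var/lib/ceph/osd-data/*; do
      [ -d "$dir" ] || continue
      name=$(basename "$dir")    # e.g. ceph-1
      cluster=${name%-*}         # ceph   (assumes the id has no '-' in it)
      id=${name##*-}             # 1
      echo "cluster=$cluster id=$id data=$dir"
  done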

Another option would be to make it

 /var/lib/ceph/$type-data/$id

(with no $cluster) and make users override the default with something that 
includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when 
they want multicluster nodes that don't interfere.  Then we'd get 
/var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
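
For the multicluster folks the override could then be a couple of conf 
lines, something like the sketch below (assuming $cluster and $id expand 
in config values):

  # /etc/ceph/foo.conf, only needed on nodes joining more than one cluster
  [osd]
      osd data = /var/lib/ceph/osd-data/$cluster-$id
  [mon]
      mon data = /var/lib/ceph/mon-data/$cluster-$id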

Any other suggestions?  Thoughts?
sage


* Re: defaults paths
  2012-04-05  6:32 defaults paths Sage Weil
@ 2012-04-05  6:57 ` Bernard Grymonpon
  2012-04-05  7:12   ` Andrey Korolyov
  0 siblings, 1 reply; 11+ messages in thread
From: Bernard Grymonpon @ 2012-04-05  6:57 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel


On 05 Apr 2012, at 08:32, Sage Weil wrote:

> We want to standardize the locations for ceph data directories, configs, 
> etc.  We'd also like to allow a single host to run OSDs that participate 
> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid 
> UUIDs if we can).
> 
> The metavariables are:
> cluster = ceph (by default)
> type = osd, mon, mds
> id = 1, foo, 
> name = $type.$id = osd.0, mds.a, etc.
> 
> The $cluster variable will come from the command line (--cluster foo) or, 
> in the case of a udev hotplug tool or something, matching the uuid on the 
> device with the 'fsid = <uuid>' line in the available config files found 
> in /etc/ceph.
> 
> The locations could be:
> 
> ceph config file:
>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
> 
> keyring:
>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
> 
> osd_data, mon_data:
>  /var/lib/ceph/$cluster.$name
>  /var/lib/ceph/$cluster/$name
>  /var/lib/ceph/data/$cluster.$name
>  /var/lib/ceph/$type-data/$cluster-$id
> 
> TV and I talked about this today, and one thing we want is for items of a 
> given type to live together in separate directory so that we don't have to 
> do any filtering to, say, get all osd data directories.  This suggests the 
> last option (/var/lib/ceph/osd-data/ceph-1, 
> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
> 
> Another option would be to make it
> 
> /var/lib/ceph/$type-data/$id
> 
> (with no $cluster) and make users override the default with something that 
> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when 
> they want multicluster nodes that don't interfere.  Then we'd get 
> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.

As an OSD consists of data and a journal, these should stay together, with all info for that one OSD in one place:

I would suggest 

/var/lib/ceph/osd/$id/data
and
/var/lib/ceph/osd/$id/journal

($id could be replaced by $uuid or $name, for which I would prefer $uuid)
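
Provisioning for that layout is then just a few lines per OSD; a rough 
sketch, with made-up device names:

  # one osd, data and journal kept under one directory tree
  id=0
  mkdir -p /var/lib/ceph/osd/$id/data
  mount /dev/sdb1 /var/lib/ceph/osd/$id/data        # the osd's data disk
  ln -sf /dev/sdc1 /var/lib/ceph/osd/$id/journal    # journal device (e.g. an ssd partition)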

Rgds,
Bernard

> 
> Any other suggestions?  Thoughts?
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



* Re: defaults paths
  2012-04-05  6:57 ` Bernard Grymonpon
@ 2012-04-05  7:12   ` Andrey Korolyov
  2012-04-05  7:28     ` Bernard Grymonpon
  0 siblings, 1 reply; 11+ messages in thread
From: Andrey Korolyov @ 2012-04-05  7:12 UTC (permalink / raw)
  To: Bernard Grymonpon; +Cc: Sage Weil, ceph-devel

Right, but we probably need journal separation at the directory level
by default: there are very few cases where the speed of the main
storage is sufficient for the journal, or where the resulting slowdown
is insignificant.  So by default the journal could go into
/var/lib/ceph/osd/journals/$i/journal, with osd/journals mounted on
the fast disk.
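
Something along these lines, that is (a sketch, device name made up):

  # fast disk mounted once, journals for all local osds live under it
  mkdir -p /var/lib/ceph/osd/journals
  mount /dev/sdb1 /var/lib/ceph/osd/journals        # the ssd
  mkdir -p /var/lib/ceph/osd/journals/0             # then $i/journal per osd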

On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
>
> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>
>> We want to standardize the locations for ceph data directories, configs,
>> etc.  We'd also like to allow a single host to run OSDs that participate
>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>> UUIDs if we can).
>>
>> The metavariables are:
>> cluster = ceph (by default)
>> type = osd, mon, mds
>> id = 1, foo,
>> name = $type.$id = osd.0, mds.a, etc.
>>
>> The $cluster variable will come from the command line (--cluster foo) or,
>> in the case of a udev hotplug tool or something, matching the uuid on the
>> device with the 'fsid = <uuid>' line in the available config files found
>> in /etc/ceph.
>>
>> The locations could be:
>>
>> ceph config file:
>>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>
>> keyring:
>>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>
>> osd_data, mon_data:
>>  /var/lib/ceph/$cluster.$name
>>  /var/lib/ceph/$cluster/$name
>>  /var/lib/ceph/data/$cluster.$name
>>  /var/lib/ceph/$type-data/$cluster-$id
>>
>> TV and I talked about this today, and one thing we want is for items of a
>> given type to live together in separate directory so that we don't have to
>> do any filtering to, say, get all osd data directories.  This suggests the
>> last option (/var/lib/ceph/osd-data/ceph-1,
>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>
>> Another option would be to make it
>>
>> /var/lib/ceph/$type-data/$id
>>
>> (with no $cluster) and make users override the default with something that
>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>> they want multicluster nodes that don't interfere.  Then we'd get
>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>
> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>
> I would suggest
>
> /var/lib/ceph/osd/$id/data
> and
> /var/lib/ceph/osd/$id/journal
>
> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>
> Rgds,
> Bernard
>
>>
>> Any other suggestions?  Thoughts?
>> sage
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: defaults paths
  2012-04-05  7:12   ` Andrey Korolyov
@ 2012-04-05  7:28     ` Bernard Grymonpon
  2012-04-05  7:37       ` Andrey Korolyov
  0 siblings, 1 reply; 11+ messages in thread
From: Bernard Grymonpon @ 2012-04-05  7:28 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.

Rgds,
Bernard

On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:

> Right, but probably we need journal separation at the directory level
> by default, because there is a very small amount of cases when speed
> of main storage is sufficient for journal or when resulting speed
> decrease is not significant, so journal by default may go into
> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
> the fast disk.
> 
> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
>> 
>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>> 
>>> We want to standardize the locations for ceph data directories, configs,
>>> etc.  We'd also like to allow a single host to run OSDs that participate
>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>>> UUIDs if we can).
>>> 
>>> The metavariables are:
>>> cluster = ceph (by default)
>>> type = osd, mon, mds
>>> id = 1, foo,
>>> name = $type.$id = osd.0, mds.a, etc.
>>> 
>>> The $cluster variable will come from the command line (--cluster foo) or,
>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>> device with the 'fsid = <uuid>' line in the available config files found
>>> in /etc/ceph.
>>> 
>>> The locations could be:
>>> 
>>> ceph config file:
>>>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>> 
>>> keyring:
>>>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>> 
>>> osd_data, mon_data:
>>>  /var/lib/ceph/$cluster.$name
>>>  /var/lib/ceph/$cluster/$name
>>>  /var/lib/ceph/data/$cluster.$name
>>>  /var/lib/ceph/$type-data/$cluster-$id
>>> 
>>> TV and I talked about this today, and one thing we want is for items of a
>>> given type to live together in separate directory so that we don't have to
>>> do any filtering to, say, get all osd data directories.  This suggests the
>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>> 
>>> Another option would be to make it
>>> 
>>> /var/lib/ceph/$type-data/$id
>>> 
>>> (with no $cluster) and make users override the default with something that
>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>> they want multicluster nodes that don't interfere.  Then we'd get
>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>> 
>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>> 
>> I would suggest
>> 
>> /var/lib/ceph/osd/$id/data
>> and
>> /var/lib/ceph/osd/$id/journal
>> 
>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>> 
>> Rgds,
>> Bernard
>> 
>>> 
>>> Any other suggestions?  Thoughts?
>>> sage
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



* Re: defaults paths
  2012-04-05  7:28     ` Bernard Grymonpon
@ 2012-04-05  7:37       ` Andrey Korolyov
  2012-04-05  8:38         ` Bernard Grymonpon
  0 siblings, 1 reply; 11+ messages in thread
From: Andrey Korolyov @ 2012-04-05  7:37 UTC (permalink / raw)
  To: Bernard Grymonpon; +Cc: Sage Weil, ceph-devel

In Ceph's case, such a layout split may be necessary in almost all
installations (except testing), whereas almost all general-purpose
server software needs a division like that only in very specific
setups.

On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
>
> Rgds,
> Bernard
>
> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
>
>> Right, but probably we need journal separation at the directory level
>> by default, because there is a very small amount of cases when speed
>> of main storage is sufficient for journal or when resulting speed
>> decrease is not significant, so journal by default may go into
>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
>> the fast disk.
>>
>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
>>>
>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>>>
>>>> We want to standardize the locations for ceph data directories, configs,
>>>> etc.  We'd also like to allow a single host to run OSDs that participate
>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>>>> UUIDs if we can).
>>>>
>>>> The metavariables are:
>>>> cluster = ceph (by default)
>>>> type = osd, mon, mds
>>>> id = 1, foo,
>>>> name = $type.$id = osd.0, mds.a, etc.
>>>>
>>>> The $cluster variable will come from the command line (--cluster foo) or,
>>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>>> device with the 'fsid = <uuid>' line in the available config files found
>>>> in /etc/ceph.
>>>>
>>>> The locations could be:
>>>>
>>>> ceph config file:
>>>>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>>>
>>>> keyring:
>>>>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>>>
>>>> osd_data, mon_data:
>>>>  /var/lib/ceph/$cluster.$name
>>>>  /var/lib/ceph/$cluster/$name
>>>>  /var/lib/ceph/data/$cluster.$name
>>>>  /var/lib/ceph/$type-data/$cluster-$id
>>>>
>>>> TV and I talked about this today, and one thing we want is for items of a
>>>> given type to live together in separate directory so that we don't have to
>>>> do any filtering to, say, get all osd data directories.  This suggests the
>>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>>>
>>>> Another option would be to make it
>>>>
>>>> /var/lib/ceph/$type-data/$id
>>>>
>>>> (with no $cluster) and make users override the default with something that
>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>>> they want multicluster nodes that don't interfere.  Then we'd get
>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>>>
>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>>>
>>> I would suggest
>>>
>>> /var/lib/ceph/osd/$id/data
>>> and
>>> /var/lib/ceph/osd/$id/journal
>>>
>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>>>
>>> Rgds,
>>> Bernard
>>>
>>>>
>>>> Any other suggestions?  Thoughts?
>>>> sage
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: defaults paths
  2012-04-05  7:37       ` Andrey Korolyov
@ 2012-04-05  8:38         ` Bernard Grymonpon
  2012-04-05 12:34           ` Wido den Hollander
  0 siblings, 1 reply; 11+ messages in thread
From: Bernard Grymonpon @ 2012-04-05  8:38 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.

Only in specific cases (where multiple OSDs run on a single node) would this come up, and those cases might even require the journals to be split over multiple devices (multiple SSD disks, ...).

In my case this doesn't really matter; it is up to the provisioning software to make the needed symlinks/mounts.

Rgds,
Bernard

On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:

> In ceph case, such layout breakage may be necessary in almost all
> installations(except testing), comparing to almost all general-purpose
> server software which need division like that only in very specific
> setups.
> 
> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
>> 
>> Rgds,
>> Bernard
>> 
>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
>> 
>>> Right, but probably we need journal separation at the directory level
>>> by default, because there is a very small amount of cases when speed
>>> of main storage is sufficient for journal or when resulting speed
>>> decrease is not significant, so journal by default may go into
>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
>>> the fast disk.
>>> 
>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
>>>> 
>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>>>> 
>>>>> We want to standardize the locations for ceph data directories, configs,
>>>>> etc.  We'd also like to allow a single host to run OSDs that participate
>>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>>>>> UUIDs if we can).
>>>>> 
>>>>> The metavariables are:
>>>>> cluster = ceph (by default)
>>>>> type = osd, mon, mds
>>>>> id = 1, foo,
>>>>> name = $type.$id = osd.0, mds.a, etc.
>>>>> 
>>>>> The $cluster variable will come from the command line (--cluster foo) or,
>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>>>> device with the 'fsid = <uuid>' line in the available config files found
>>>>> in /etc/ceph.
>>>>> 
>>>>> The locations could be:
>>>>> 
>>>>> ceph config file:
>>>>>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>>>> 
>>>>> keyring:
>>>>>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>>>> 
>>>>> osd_data, mon_data:
>>>>>  /var/lib/ceph/$cluster.$name
>>>>>  /var/lib/ceph/$cluster/$name
>>>>>  /var/lib/ceph/data/$cluster.$name
>>>>>  /var/lib/ceph/$type-data/$cluster-$id
>>>>> 
>>>>> TV and I talked about this today, and one thing we want is for items of a
>>>>> given type to live together in separate directory so that we don't have to
>>>>> do any filtering to, say, get all osd data directories.  This suggests the
>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>>>> 
>>>>> Another option would be to make it
>>>>> 
>>>>> /var/lib/ceph/$type-data/$id
>>>>> 
>>>>> (with no $cluster) and make users override the default with something that
>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>>>> they want multicluster nodes that don't interfere.  Then we'd get
>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>>>> 
>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>>>> 
>>>> I would suggest
>>>> 
>>>> /var/lib/ceph/osd/$id/data
>>>> and
>>>> /var/lib/ceph/osd/$id/journal
>>>> 
>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>>>> 
>>>> Rgds,
>>>> Bernard
>>>> 
>>>>> 
>>>>> Any other suggestions?  Thoughts?
>>>>> sage
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> 
>>>> 
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



* Re: defaults paths
  2012-04-05  8:38         ` Bernard Grymonpon
@ 2012-04-05 12:34           ` Wido den Hollander
  2012-04-05 13:00             ` Bernard Grymonpon
  0 siblings, 1 reply; 11+ messages in thread
From: Wido den Hollander @ 2012-04-05 12:34 UTC (permalink / raw)
  To: Bernard Grymonpon; +Cc: Sage Weil, ceph-devel

On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
>
> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)

I think that's a wrong assumption. On most systems I think multiple OSDs 
will exist; it's debatable whether one would run OSDs from different 
clusters very often.

I'm currently using: osd data = /var/lib/ceph/$name

To get back to what sage mentioned, why add the "-data" suffix to a 
directory name? Isn't it obvious that a directory will contain data?

As I think it is a very specific scenario for a machine to participate 
in multiple Ceph clusters, I'd vote for:

  /var/lib/ceph/$type/$id
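
So a node running a monitor and a couple of OSDs would look roughly 
like this (ids made up):

  /var/lib/ceph/mon/a
  /var/lib/ceph/osd/0
  /var/lib/ceph/osd/1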

Wido

>
> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
>
> Rgds,
> Bernard
>
> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
>
>> In ceph case, such layout breakage may be necessary in almost all
>> installations(except testing), comparing to almost all general-purpose
>> server software which need division like that only in very specific
>> setups.
>>
>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
>>>
>>> Rgds,
>>> Bernard
>>>
>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
>>>
>>>> Right, but probably we need journal separation at the directory level
>>>> by default, because there is a very small amount of cases when speed
>>>> of main storage is sufficient for journal or when resulting speed
>>>> decrease is not significant, so journal by default may go into
>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
>>>> the fast disk.
>>>>
>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
>>>>>
>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>>>>>
>>>>>> We want to standardize the locations for ceph data directories, configs,
>>>>>> etc.  We'd also like to allow a single host to run OSDs that participate
>>>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>>>>>> UUIDs if we can).
>>>>>>
>>>>>> The metavariables are:
>>>>>> cluster = ceph (by default)
>>>>>> type = osd, mon, mds
>>>>>> id = 1, foo,
>>>>>> name = $type.$id = osd.0, mds.a, etc.
>>>>>>
>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>>>>> device with the 'fsid =<uuid>' line in the available config files found
>>>>>> in /etc/ceph.
>>>>>>
>>>>>> The locations could be:
>>>>>>
>>>>>> ceph config file:
>>>>>>   /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>>>>>
>>>>>> keyring:
>>>>>>   /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>>>>>
>>>>>> osd_data, mon_data:
>>>>>>   /var/lib/ceph/$cluster.$name
>>>>>>   /var/lib/ceph/$cluster/$name
>>>>>>   /var/lib/ceph/data/$cluster.$name
>>>>>>   /var/lib/ceph/$type-data/$cluster-$id
>>>>>>
>>>>>> TV and I talked about this today, and one thing we want is for items of a
>>>>>> given type to live together in separate directory so that we don't have to
>>>>>> do any filtering to, say, get all osd data directories.  This suggests the
>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>>>>>
>>>>>> Another option would be to make it
>>>>>>
>>>>>> /var/lib/ceph/$type-data/$id
>>>>>>
>>>>>> (with no $cluster) and make users override the default with something that
>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>>>>> they want multicluster nodes that don't interfere.  Then we'd get
>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>>>>>
>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>>>>>
>>>>> I would suggest
>>>>>
>>>>> /var/lib/ceph/osd/$id/data
>>>>> and
>>>>> /var/lib/ceph/osd/$id/journal
>>>>>
>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>>>>>
>>>>> Rgds,
>>>>> Bernard
>>>>>
>>>>>>
>>>>>> Any other suggestions?  Thoughts?
>>>>>> sage
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: defaults paths
  2012-04-05 12:34           ` Wido den Hollander
@ 2012-04-05 13:00             ` Bernard Grymonpon
  2012-04-05 15:17               ` Sage Weil
  0 siblings, 1 reply; 11+ messages in thread
From: Bernard Grymonpon @ 2012-04-05 13:00 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: Sage Weil, ceph-devel

On 05 Apr 2012, at 14:34, Wido den Hollander wrote:

> On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
>> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
>> 
>> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
> 
> I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.

If the recommended setup is to have multiple OSDs per node (like one OSD per physical drive), then we need to take that into account - but don't assume that a node has only one SSD disk for journals, shared between all OSDs...

> 
> I'm currently using: osd data = /var/lib/ceph/$name
> 
> To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?

Each OSD has data and a journal... there should be some way to identify both...

Rgds,
-bg

> 
> As I think it is a very specific scenario where a machine would be participating in multiple Ceph clusters I'd vote for:
> 
> /var/lib/ceph/$type/$id
> 
> Wido
> 
>> 
>> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
>> 
>> Rgds,
>> Bernard
>> 
>> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
>> 
>>> In ceph case, such layout breakage may be necessary in almost all
>>> installations(except testing), comparing to almost all general-purpose
>>> server software which need division like that only in very specific
>>> setups.
>>> 
>>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
>>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
>>>> 
>>>> Rgds,
>>>> Bernard
>>>> 
>>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
>>>> 
>>>>> Right, but probably we need journal separation at the directory level
>>>>> by default, because there is a very small amount of cases when speed
>>>>> of main storage is sufficient for journal or when resulting speed
>>>>> decrease is not significant, so journal by default may go into
>>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
>>>>> the fast disk.
>>>>> 
>>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
>>>>>> 
>>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>>>>>> 
>>>>>>> We want to standardize the locations for ceph data directories, configs,
>>>>>>> etc.  We'd also like to allow a single host to run OSDs that participate
>>>>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>>>>>>> UUIDs if we can).
>>>>>>> 
>>>>>>> The metavariables are:
>>>>>>> cluster = ceph (by default)
>>>>>>> type = osd, mon, mds
>>>>>>> id = 1, foo,
>>>>>>> name = $type.$id = osd.0, mds.a, etc.
>>>>>>> 
>>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
>>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>>>>>> device with the 'fsid =<uuid>' line in the available config files found
>>>>>>> in /etc/ceph.
>>>>>>> 
>>>>>>> The locations could be:
>>>>>>> 
>>>>>>> ceph config file:
>>>>>>>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>>>>>> 
>>>>>>> keyring:
>>>>>>>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>>>>>> 
>>>>>>> osd_data, mon_data:
>>>>>>>  /var/lib/ceph/$cluster.$name
>>>>>>>  /var/lib/ceph/$cluster/$name
>>>>>>>  /var/lib/ceph/data/$cluster.$name
>>>>>>>  /var/lib/ceph/$type-data/$cluster-$id
>>>>>>> 
>>>>>>> TV and I talked about this today, and one thing we want is for items of a
>>>>>>> given type to live together in separate directory so that we don't have to
>>>>>>> do any filtering to, say, get all osd data directories.  This suggests the
>>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>>>>>> 
>>>>>>> Another option would be to make it
>>>>>>> 
>>>>>>> /var/lib/ceph/$type-data/$id
>>>>>>> 
>>>>>>> (with no $cluster) and make users override the default with something that
>>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>>>>>> they want multicluster nodes that don't interfere.  Then we'd get
>>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>>>>>> 
>>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>>>>>> 
>>>>>> I would suggest
>>>>>> 
>>>>>> /var/lib/ceph/osd/$id/data
>>>>>> and
>>>>>> /var/lib/ceph/osd/$id/journal
>>>>>> 
>>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>>>>>> 
>>>>>> Rgds,
>>>>>> Bernard
>>>>>> 
>>>>>>> 
>>>>>>> Any other suggestions?  Thoughts?
>>>>>>> sage
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> 
>>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



* Re: defaults paths
  2012-04-05 13:00             ` Bernard Grymonpon
@ 2012-04-05 15:17               ` Sage Weil
  2012-04-05 16:27                 ` Bernard Grymonpon
  0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2012-04-05 15:17 UTC (permalink / raw)
  To: Bernard Grymonpon; +Cc: Wido den Hollander, ceph-devel

On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
> On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
> 
> > On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
> >> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
> >> 
> >> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
> > 
> > I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
> 
> If it is recommended setup to have multiple OSDs per node (like, one OSD 
> per physical drive), then we need to take that in account - but don't 
> assume that one node only has one SSD disk for journals, which would be 
> shared between all OSDs...
> 
> > 
> > I'm currently using: osd data = /var/lib/ceph/$name
> > 
> > To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
> 
> Each osd has data and a journal... there should be some way to identify 
> both...

Yes.  The plan is for the chef/juju/whatever bits to do that part.  For 
example, the scripts triggered by udev/chef/juju would look at the GPT 
labels to identify OSD disks and mount them in place.  They would similarly 
identify journals by matching the OSD uuids and start up the daemon with 
the correct journal.
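
Very roughly, the udev-triggered piece might look like the sketch below 
(the GPT label naming and the blkid incantation are just assumptions to 
illustrate the flow):

  #!/bin/sh
  # mount an osd data partition in place, keyed off its GPT partition name
  dev="$1"                                          # e.g. /dev/sdb1, handed to us by udev
  label=$(blkid -p -o value -s PART_ENTRY_NAME "$dev")
  case "$label" in
      ceph-osd-*)                                   # hypothetical label scheme
          id=${label#ceph-osd-}
          mkdir -p /var/lib/ceph/osd-data/$id
          mount "$dev" /var/lib/ceph/osd-data/$id
          ;;
  esac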

The current plan is that if /var/lib/ceph/osd-data/$id/journal doesn't 
exist (e.g., because we put it on another device), it will look for/wait 
until a journal appears.  If one is present, ceph-osd can start using it.

> > /var/lib/ceph/$type/$id

I like this.  We were originally thinking

 /var/lib/ceph/osd-data/
 /var/lib/ceph/osd-journal/
 /var/lib/ceph/mon-data/

but managing bind mounts or symlinks for journals seems error-prone.  TV's 
now thinking we should just start ceph-osd with

  ceph-osd --osd-journal /somewhere/else -i $id

from upstart/whatever if we have a matching journal elsewhere.
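
In other words, the wrapper would effectively end up running something 
like this per OSD (a sketch; how the matching journal is located is the 
udev/chef part and is glossed over here):

  id=1
  journal=/somewhere/else/journal-$id     # wherever the matching journal was found (made up)
  if [ -e "$journal" ]; then
      exec ceph-osd -i "$id" --osd-journal "$journal"
  else
      exec ceph-osd -i "$id"              # journal in its default spot under the data dir
  fi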

sage



> > 
> > Wido
> > 
> >> 
> >> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
> >> 
> >> Rgds,
> >> Bernard
> >> 
> >> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
> >> 
> >>> In ceph case, such layout breakage may be necessary in almost all
> >>> installations(except testing), comparing to almost all general-purpose
> >>> server software which need division like that only in very specific
> >>> setups.
> >>> 
> >>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
> >>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
> >>>> 
> >>>> Rgds,
> >>>> Bernard
> >>>> 
> >>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
> >>>> 
> >>>>> Right, but probably we need journal separation at the directory level
> >>>>> by default, because there is a very small amount of cases when speed
> >>>>> of main storage is sufficient for journal or when resulting speed
> >>>>> decrease is not significant, so journal by default may go into
> >>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
> >>>>> the fast disk.
> >>>>> 
> >>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
> >>>>>> 
> >>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
> >>>>>> 
> >>>>>>> We want to standardize the locations for ceph data directories, configs,
> >>>>>>> etc.  We'd also like to allow a single host to run OSDs that participate
> >>>>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
> >>>>>>> UUIDs if we can).
> >>>>>>> 
> >>>>>>> The metavariables are:
> >>>>>>> cluster = ceph (by default)
> >>>>>>> type = osd, mon, mds
> >>>>>>> id = 1, foo,
> >>>>>>> name = $type.$id = osd.0, mds.a, etc.
> >>>>>>> 
> >>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
> >>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
> >>>>>>> device with the 'fsid =<uuid>' line in the available config files found
> >>>>>>> in /etc/ceph.
> >>>>>>> 
> >>>>>>> The locations could be:
> >>>>>>> 
> >>>>>>> ceph config file:
> >>>>>>>  /etc/ceph/$cluster.conf     (default is thus ceph.conf)
> >>>>>>> 
> >>>>>>> keyring:
> >>>>>>>  /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
> >>>>>>> 
> >>>>>>> osd_data, mon_data:
> >>>>>>>  /var/lib/ceph/$cluster.$name
> >>>>>>>  /var/lib/ceph/$cluster/$name
> >>>>>>>  /var/lib/ceph/data/$cluster.$name
> >>>>>>>  /var/lib/ceph/$type-data/$cluster-$id
> >>>>>>> 
> >>>>>>> TV and I talked about this today, and one thing we want is for items of a
> >>>>>>> given type to live together in separate directory so that we don't have to
> >>>>>>> do any filtering to, say, get all osd data directories.  This suggests the
> >>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
> >>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
> >>>>>>> 
> >>>>>>> Another option would be to make it
> >>>>>>> 
> >>>>>>> /var/lib/ceph/$type-data/$id
> >>>>>>> 
> >>>>>>> (with no $cluster) and make users override the default with something that
> >>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
> >>>>>>> they want multicluster nodes that don't interfere.  Then we'd get
> >>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
> >>>>>> 
> >>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
> >>>>>> 
> >>>>>> I would suggest
> >>>>>> 
> >>>>>> /var/lib/ceph/osd/$id/data
> >>>>>> and
> >>>>>> /var/lib/ceph/osd/$id/journal
> >>>>>> 
> >>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
> >>>>>> 
> >>>>>> Rgds,
> >>>>>> Bernard
> >>>>>> 
> >>>>>>> 
> >>>>>>> Any other suggestions?  Thoughts?
> >>>>>>> sage
> >>>>>>> --
> >>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>>>> 
> >>>>>> 
> >>>>>> --
> >>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>> --
> >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>> the body of a message to majordomo@vger.kernel.org
> >>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>> 
> >>>> 
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>> 
> >> 
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


* Re: defaults paths
  2012-04-05 15:17               ` Sage Weil
@ 2012-04-05 16:27                 ` Bernard Grymonpon
  2012-04-05 16:33                   ` Sage Weil
  0 siblings, 1 reply; 11+ messages in thread
From: Bernard Grymonpon @ 2012-04-05 16:27 UTC (permalink / raw)
  To: Sage Weil; +Cc: Wido den Hollander, ceph-devel


On 05 Apr 2012, at 17:17, Sage Weil wrote:

> On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
>> On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
>> 
>>> On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
>>>> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
>>>> 
>>>> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
>>> 
>>> I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
>> 
>> If it is recommended setup to have multiple OSDs per node (like, one OSD 
>> per physical drive), then we need to take that in account - but don't 
>> assume that one node only has one SSD disk for journals, which would be 
>> shared between all OSDs...
>> 
>>> 
>>> I'm currently using: osd data = /var/lib/ceph/$name
>>> 
>>> To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
>> 
>> Each osd has data and a journal... there should be some way to identify 
>> both...
> 
> Yes.  The plan is for the chef/juju/whatever bits to that part.  For 
> example, the scripts triggered by udev/chef/juju would look at the GPT 
> labesl to identify OSD disks and mount them in place.  It will similarly 
> identify journals by matching the osd uuids and start up the daemon with 
> the correct journal.
> 
> The current plan is that if /var/lib/ceph/osd-data/$id/journal doesn't 
> exist (e.g., because we put it on another device), it will look/wait until 
> a journal appears.  If it is present, ceph-osd can start using that.

I would suggest you fail the startup of the daemon, as it doesn't have all the needed parts - I personally don't like these "autodiscover" thingies; you never know what they are waiting/searching for...

> 
>>> /var/lib/ceph/$type/$id
> 
> I like this.  We were originally thinking
> 
> /var/lib/ceph/osd-data/
> /var/lib/ceph/osd-journal/
> /var/lib/ceph/mon-data/
> 
> but managing bind mounts or symlinks for journals seems error prone.  TV's 
> now thinking we should just start ceph-osd with
> 
>  ceph-osd --osd-journal /somewhere/else -i $id

... I like this more, and I would even suggest allowing the daemon to be started just like

ceph-osd --osd-journal /somewhere --osd-data /somewhereelse --conf /etc/ceph/clustername.conf 

(config file is for the monitors)
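
(i.e. the deployment tool renders something like the following and it 
either starts or it fails, no waiting around - a sketch with made-up 
paths:)

  #!/bin/sh
  # refuse to start unless every piece the osd needs is actually there
  data=/var/lib/ceph/osd/0/data
  journal=/var/lib/ceph/osd/0/journal
  conf=/etc/ceph/clustername.conf
  for p in "$data" "$journal" "$conf"; do
      [ -e "$p" ] || { echo "missing $p, not starting osd.0" >&2; exit 1; }
  done
  exec ceph-osd -i 0 --osd-data "$data" --osd-journal "$journal" --conf "$conf"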

Configuration and determining which one(s) to start is up to our deployment tools (chef in our case).

Say that we duplicate a node for some testing/failover/... I would not want the daemon to automatically start just because the data is there...

Rgds,
Bernard
Openminds BVBA


> 
> from upstart/whatever if we have a matching journal elsewhere.
> 
> sage
> 
> 
> 
>>> 
>>> Wido
>>> 
>>>> 
>>>> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
>>>> 
>>>> Rgds,
>>>> Bernard
>>>> 
>>>> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
>>>> 
>>>>> In ceph case, such layout breakage may be necessary in almost all
>>>>> installations(except testing), comparing to almost all general-purpose
>>>>> server software which need division like that only in very specific
>>>>> setups.
>>>>> 
>>>>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
>>>>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
>>>>>> 
>>>>>> Rgds,
>>>>>> Bernard
>>>>>> 
>>>>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
>>>>>> 
>>>>>>> Right, but probably we need journal separation at the directory level
>>>>>>> by default, because there is a very small amount of cases when speed
>>>>>>> of main storage is sufficient for journal or when resulting speed
>>>>>>> decrease is not significant, so journal by default may go into
>>>>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
>>>>>>> the fast disk.
>>>>>>> 
>>>>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
>>>>>>>> 
>>>>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>>>>>>>> 
>>>>>>>>> We want to standardize the locations for ceph data directories, configs,
>>>>>>>>> etc.  We'd also like to allow a single host to run OSDs that participate
>>>>>>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
>>>>>>>>> UUIDs if we can).
>>>>>>>>> 
>>>>>>>>> The metavariables are:
>>>>>>>>> cluster = ceph (by default)
>>>>>>>>> type = osd, mon, mds
>>>>>>>>> id = 1, foo,
>>>>>>>>> name = $type.$id = osd.0, mds.a, etc.
>>>>>>>>> 
>>>>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
>>>>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>>>>>>>> device with the 'fsid =<uuid>' line in the available config files found
>>>>>>>>> in /etc/ceph.
>>>>>>>>> 
>>>>>>>>> The locations could be:
>>>>>>>>> 
>>>>>>>>> ceph config file:
>>>>>>>>> /etc/ceph/$cluster.conf     (default is thus ceph.conf)
>>>>>>>>> 
>>>>>>>>> keyring:
>>>>>>>>> /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
>>>>>>>>> 
>>>>>>>>> osd_data, mon_data:
>>>>>>>>> /var/lib/ceph/$cluster.$name
>>>>>>>>> /var/lib/ceph/$cluster/$name
>>>>>>>>> /var/lib/ceph/data/$cluster.$name
>>>>>>>>> /var/lib/ceph/$type-data/$cluster-$id
>>>>>>>>> 
>>>>>>>>> TV and I talked about this today, and one thing we want is for items of a
>>>>>>>>> given type to live together in separate directory so that we don't have to
>>>>>>>>> do any filtering to, say, get all osd data directories.  This suggests the
>>>>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>>>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>>>>>>>> 
>>>>>>>>> Another option would be to make it
>>>>>>>>> 
>>>>>>>>> /var/lib/ceph/$type-data/$id
>>>>>>>>> 
>>>>>>>>> (with no $cluster) and make users override the default with something that
>>>>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>>>>>>>> they want multicluster nodes that don't interfere.  Then we'd get
>>>>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>>>>>>>> 
>>>>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>>>>>>>> 
>>>>>>>> I would suggest
>>>>>>>> 
>>>>>>>> /var/lib/ceph/osd/$id/data
>>>>>>>> and
>>>>>>>> /var/lib/ceph/osd/$id/journal
>>>>>>>> 
>>>>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>>>>>>>> 
>>>>>>>> Rgds,
>>>>>>>> Bernard
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Any other suggestions?  Thoughts?
>>>>>>>>> sage
>>>>>>>>> --
>>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>> 
>>>>>> 
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>> 
>>>> 
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> 
> 



* Re: defaults paths
  2012-04-05 16:27                 ` Bernard Grymonpon
@ 2012-04-05 16:33                   ` Sage Weil
  0 siblings, 0 replies; 11+ messages in thread
From: Sage Weil @ 2012-04-05 16:33 UTC (permalink / raw)
  To: Bernard Grymonpon; +Cc: Wido den Hollander, ceph-devel

On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
> 
> On 05 Apr 2012, at 17:17, Sage Weil wrote:
> 
> > On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
> >> On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
> >> 
> >>> On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
> >>>> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
> >>>> 
> >>>> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
> >>> 
> >>> I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
> >> 
> >> If it is recommended setup to have multiple OSDs per node (like, one OSD 
> >> per physical drive), then we need to take that in account - but don't 
> >> assume that one node only has one SSD disk for journals, which would be 
> >> shared between all OSDs...
> >> 
> >>> 
> >>> I'm currently using: osd data = /var/lib/ceph/$name
> >>> 
> >>> To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
> >> 
> >> Each osd has data and a journal... there should be some way to identify 
> >> both...
> > 
> > Yes.  The plan is for the chef/juju/whatever bits to that part.  For 
> > example, the scripts triggered by udev/chef/juju would look at the GPT 
> > labesl to identify OSD disks and mount them in place.  It will similarly 
> > identify journals by matching the osd uuids and start up the daemon with 
> > the correct journal.
> > 
> > The current plan is that if /var/lib/ceph/osd-data/$id/journal doesn't 
> > exist (e.g., because we put it on another device), it will look/wait until 
> > a journal appears.  If it is present, ceph-osd can start using that.
> 
> I would suggest you fail the startup of the daemon, as it doesn't have 
> all the needed parts - I personally don't like these "autodiscover" 
> thingies, you never know why they are waiting/searching for,...

Agreed.  The udev rule would not try to start ceph-osd when the journal 
isn't present; ceph-osd won't be started unless the journal is there.

> > 
> >>> /var/lib/ceph/$type/$id
> > 
> > I like this.  We were originally thinking
> > 
> > /var/lib/ceph/osd-data/
> > /var/lib/ceph/osd-journal/
> > /var/lib/ceph/mon-data/
> > 
> > but managing bind mounts or symlinks for journals seems error prone.  TV's 
> > now thinking we should just start ceph-osd with
> > 
> >  ceph-osd --osd-journal /somewhere/else -i $id
> 
> ... I like this more, and i would even suggest to allow to start the 
> daemon just like
> 
> ceph-osd --osd-journal /somehwere --osd-data /somewhereelse --conf 
> /etc/ceph/clustername.conf
> 
> (config file is for the monitors)
> 
> Configuration and determining which one(s) to start is up to our 
> deployment tools (chef in our case).

Yeah.  Explicitly specifying osd_data isn't strictly necessary if it 
matches the default, but the deployment tool could do so anyway.
 
> Say that we duplicate a node, for some testing/failover/... I would not 
> want to daemon to automatically start, just because the data is there...

I'm not sure if this is something we've looked at yet... TV?

sage


> 
> Rgds,
> Bernard
> Openminds BVBA
> 
> 
> > 
> > from upstart/whatever if we have a matching journal elsewhere.
> > 
> > sage
> > 
> > 
> > 
> >>> 
> >>> Wido
> >>> 
> >>>> 
> >>>> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
> >>>> 
> >>>> Rgds,
> >>>> Bernard
> >>>> 
> >>>> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
> >>>> 
> >>>>> In ceph case, such layout breakage may be necessary in almost all
> >>>>> installations(except testing), comparing to almost all general-purpose
> >>>>> server software which need division like that only in very specific
> >>>>> setups.
> >>>>> 
> >>>>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
> >>>>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
> >>>>>> 
> >>>>>> Rgds,
> >>>>>> Bernard
> >>>>>> 
> >>>>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
> >>>>>> 
> >>>>>>> Right, but probably we need journal separation at the directory level
> >>>>>>> by default, because there is a very small amount of cases when speed
> >>>>>>> of main storage is sufficient for journal or when resulting speed
> >>>>>>> decrease is not significant, so journal by default may go into
> >>>>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
> >>>>>>> the fast disk.
> >>>>>>> 
> >>>>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be>  wrote:
> >>>>>>>> 
> >>>>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
> >>>>>>>> 
> >>>>>>>>> We want to standardize the locations for ceph data directories, configs,
> >>>>>>>>> etc.  We'd also like to allow a single host to run OSDs that participate
> >>>>>>>>> in multiple ceph clusters.  We'd like easy to deal with names (i.e., avoid
> >>>>>>>>> UUIDs if we can).
> >>>>>>>>> 
> >>>>>>>>> The metavariables are:
> >>>>>>>>> cluster = ceph (by default)
> >>>>>>>>> type = osd, mon, mds
> >>>>>>>>> id = 1, foo,
> >>>>>>>>> name = $type.$id = osd.0, mds.a, etc.
> >>>>>>>>> 
> >>>>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
> >>>>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
> >>>>>>>>> device with the 'fsid =<uuid>' line in the available config files found
> >>>>>>>>> in /etc/ceph.
> >>>>>>>>> 
> >>>>>>>>> The locations could be:
> >>>>>>>>> 
> >>>>>>>>> ceph config file:
> >>>>>>>>> /etc/ceph/$cluster.conf     (default is thus ceph.conf)
> >>>>>>>>> 
> >>>>>>>>> keyring:
> >>>>>>>>> /etc/ceph/$cluster.keyring  (fallback to /etc/ceph/keyring)
> >>>>>>>>> 
> >>>>>>>>> osd_data, mon_data:
> >>>>>>>>> /var/lib/ceph/$cluster.$name
> >>>>>>>>> /var/lib/ceph/$cluster/$name
> >>>>>>>>> /var/lib/ceph/data/$cluster.$name
> >>>>>>>>> /var/lib/ceph/$type-data/$cluster-$id
> >>>>>>>>> 
> >>>>>>>>> TV and I talked about this today, and one thing we want is for items of a
> >>>>>>>>> given type to live together in separate directory so that we don't have to
> >>>>>>>>> do any filtering to, say, get all osd data directories.  This suggests the
> >>>>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
> >>>>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
> >>>>>>>>> 
> >>>>>>>>> Another option would be to make it
> >>>>>>>>> 
> >>>>>>>>> /var/lib/ceph/$type-data/$id
> >>>>>>>>> 
> >>>>>>>>> (with no $cluster) and make users override the default with something that
> >>>>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
> >>>>>>>>> they want multicluster nodes that don't interfere.  Then we'd get
> >>>>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
> >>>>>>>> 
> >>>>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
> >>>>>>>> 
> >>>>>>>> I would suggest
> >>>>>>>> 
> >>>>>>>> /var/lib/ceph/osd/$id/data
> >>>>>>>> and
> >>>>>>>> /var/lib/ceph/osd/$id/journal
> >>>>>>>> 
> >>>>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
> >>>>>>>> 
> >>>>>>>> Rgds,
> >>>>>>>> Bernard
> >>>>>>>> 
> >>>>>>>>> 
> >>>>>>>>> Any other suggestions?  Thoughts?
> >>>>>>>>> sage
> >>>>>>>>> --
> >>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>>>>>> 
> >>>>>>>> 
> >>>>>>>> --
> >>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>>>> --
> >>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>>>> 
> >>>>>> 
> >>>>> --
> >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>> the body of a message to majordomo@vger.kernel.org
> >>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>> 
> >>>> 
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>> the body of a message to majordomo@vger.kernel.org
> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>> 
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>> 
> >> 
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> 
> >> 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

