* defaults paths
From: Sage Weil @ 2012-04-05 6:32 UTC (permalink / raw)
To: ceph-devel
We want to standardize the locations for ceph data directories, configs,
etc. We'd also like to allow a single host to run OSDs that participate
in multiple ceph clusters. We'd like names that are easy to deal with
(i.e., avoid UUIDs if we can).
The metavariables are:
cluster = ceph (by default)
type = osd, mon, mds
id = 1, foo, etc.
name = $type.$id = osd.0, mds.a, etc.
The $cluster variable will come from the command line (--cluster foo) or,
in the case of a udev hotplug tool or something, matching the uuid on the
device with the 'fsid = <uuid>' line in the available config files found
in /etc/ceph.
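For the udev/hotplug case, the fsid-to-cluster matching could be sketched roughly as below (a sketch only; the helper name and the exact conf parsing are assumptions, not an existing tool):

```python
import configparser
import glob
import os

def cluster_for_fsid(fsid, confdir="/etc/ceph"):
    """Return the $cluster whose $cluster.conf declares this fsid,
    or None if no config file in confdir matches."""
    for path in glob.glob(os.path.join(confdir, "*.conf")):
        conf = configparser.ConfigParser(strict=False)
        try:
            conf.read(path)
        except configparser.Error:
            continue  # skip unparseable files
        if conf.get("global", "fsid", fallback=None) == fsid:
            # /etc/ceph/foo.conf -> cluster name "foo"
            return os.path.basename(path)[:-len(".conf")]
    return None
```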
The locations could be:
ceph config file:
/etc/ceph/$cluster.conf (default is thus ceph.conf)
keyring:
/etc/ceph/$cluster.keyring (fallback to /etc/ceph/keyring)
osd_data, mon_data:
/var/lib/ceph/$cluster.$name
/var/lib/ceph/$cluster/$name
/var/lib/ceph/data/$cluster.$name
/var/lib/ceph/$type-data/$cluster-$id
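To make the comparison concrete, here is how the candidate templates expand for a given daemon (a throwaway sketch; the variable names simply mirror the metavariables above):

```python
# The four candidate layouts from the list above, as format templates.
CANDIDATES = [
    "/var/lib/ceph/{cluster}.{name}",
    "/var/lib/ceph/{cluster}/{name}",
    "/var/lib/ceph/data/{cluster}.{name}",
    "/var/lib/ceph/{type}-data/{cluster}-{id}",
]

def expand(template, cluster, type_, id_):
    """Fill one candidate template for a daemon; name = $type.$id."""
    name = "%s.%s" % (type_, id_)
    return template.format(cluster=cluster, type=type_, id=id_, name=name)

for t in CANDIDATES:
    print(expand(t, "ceph", "osd", 0))
```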
TV and I talked about this today, and one thing we want is for items of a
given type to live together in a separate directory so that we don't have to
do any filtering to, say, get all osd data directories. This suggests the
last option (/var/lib/ceph/osd-data/ceph-1,
/var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
Another option would be to make it
/var/lib/ceph/$type-data/$id
(with no $cluster) and make users override the default with something that
includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
they want multicluster nodes that don't interfere. Then we'd get
/var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
Any other suggestions? Thoughts?
sage
* Re: defaults paths
From: Bernard Grymonpon @ 2012-04-05 6:57 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 05 Apr 2012, at 08:32, Sage Weil wrote:
> Another option would be to make it
>
> /var/lib/ceph/$type-data/$id
>
> (with no $cluster) and make users override the default with something that
> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
> they want multicluster nodes that don't interfere. Then we'd get
> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
As an OSD consists of data and a journal, these should stay together, with all info for that one OSD in one place:
I would suggest
/var/lib/ceph/osd/$id/data
and
/var/lib/ceph/osd/$id/journal
($id could be replaced by $uuid or $name, for which I would prefer $uuid)
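Under that layout, enumerating OSDs and locating each one's data and journal is a single directory walk (a sketch assuming the proposed paths above; not an existing Ceph tool):

```python
import os

def osd_layout(root="/var/lib/ceph/osd"):
    """Map each OSD id under root to its data and journal paths."""
    layout = {}
    for osd_id in sorted(os.listdir(root)):
        base = os.path.join(root, osd_id)
        layout[osd_id] = {
            "data": os.path.join(base, "data"),
            "journal": os.path.join(base, "journal"),
        }
    return layout
```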
Rgds,
Bernard
* Re: defaults paths
From: Andrey Korolyov @ 2012-04-05 7:12 UTC (permalink / raw)
To: Bernard Grymonpon; +Cc: Sage Weil, ceph-devel
Right, but we probably want journal separation at the directory level
by default: there are very few cases where the main storage is fast
enough for the journal, or where the resulting slowdown is not
significant. So by default the journal could go into
/var/lib/ceph/osd/journals/$id/journal, with osd/journals mounted on
the fast disk.
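One way to realize that default is to point each OSD's journal path into the journals directory on the fast disk (a sketch of the idea only, combining the paths discussed in this thread; not an existing Ceph tool):

```python
import os

def link_journal(osd_id, osd_root="/var/lib/ceph/osd",
                 journals_root="/var/lib/ceph/osd/journals"):
    """Make $osd_root/$id/journal a symlink into journals_root,
    which would be a mount point on the fast disk."""
    target = os.path.join(journals_root, str(osd_id), "journal")
    link = os.path.join(osd_root, str(osd_id), "journal")
    os.makedirs(os.path.dirname(target), exist_ok=True)
    os.makedirs(os.path.dirname(link), exist_ok=True)
    if not os.path.islink(link):
        os.symlink(target, link)
    return link
```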
On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>
> I would suggest
>
> /var/lib/ceph/osd/$id/data
> and
> /var/lib/ceph/osd/$id/journal
>
> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>
> Rgds,
> Bernard
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: defaults paths
From: Bernard Grymonpon @ 2012-04-05 7:28 UTC (permalink / raw)
To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel
I feel it's up to the sysadmin to mount or symlink the correct storage devices at the correct paths; Ceph should not be concerned that some volumes might need to sit together.
Rgds,
Bernard
* Re: defaults paths
From: Andrey Korolyov @ 2012-04-05 7:37 UTC (permalink / raw)
To: Bernard Grymonpon; +Cc: Sage Weil, ceph-devel
In Ceph's case, splitting the layout like this may be necessary in
almost all installations (except testing), whereas almost all
general-purpose server software needs such a division only in very
specific setups.
* Re: defaults paths
From: Bernard Grymonpon @ 2012-04-05 8:38 UTC (permalink / raw)
To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel
I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
Only in specific cases (where multiple OSDs run on a single node) would this come up, and those cases might even require the journals to be split over multiple devices (multiple SSDs, ...).
In my case, this doesn't really matter; it is up to the provisioning software to make the needed symlinks/mounts.
Rgds,
Bernard
* Re: defaults paths
From: Wido den Hollander @ 2012-04-05 12:34 UTC (permalink / raw)
To: Bernard Grymonpon; +Cc: Sage Weil, ceph-devel
On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
>
> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
I think that's a wrong assumption: on most systems, multiple OSDs
will exist. It's debatable whether one would often run OSDs from
different clusters.
I'm currently using: osd data = /var/lib/ceph/$name
To get back to what sage mentioned, why add the "-data" suffix to a
directory name? Isn't it obvious that a directory will contain data?
As I think a machine participating in multiple Ceph clusters is a
very specific scenario, I'd vote for:
/var/lib/ceph/$type/$id
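With a per-type layout like that, getting all data directories of one daemon type needs no name filtering at all (a quick sketch assuming the scheme proposed just above):

```python
import glob
import os

def daemon_dirs(daemon_type, root="/var/lib/ceph"):
    """All data dirs for one daemon type, e.g. every OSD on this host."""
    return sorted(glob.glob(os.path.join(root, daemon_type, "*")))
```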
Wido
* Re: defaults paths
From: Bernard Grymonpon @ 2012-04-05 13:00 UTC (permalink / raw)
To: Wido den Hollander; +Cc: Sage Weil, ceph-devel
On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
> On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
>> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
>>
>> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
>
> I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
If the recommended setup is multiple OSDs per node (say, one OSD per physical drive), then we need to take that into account; but don't assume that a node has only one SSD for journals, shared between all OSDs...
>
> I'm currently using: osd data = /var/lib/ceph/$name
>
> To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
Each OSD has data and a journal; there should be some way to identify both.
Rgds,
-bg
>
> As I think it is a very specific scenario where a machine would be participating in multiple Ceph clusters I'd vote for:
>
> /var/lib/ceph/$type/$id
>
> Wido
>
>>
>> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
>>
>> Rgds,
>> Bernard
>>
>> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
>>
>>> In ceph case, such layout breakage may be necessary in almost all
>>> installations(except testing), comparing to almost all general-purpose
>>> server software which need division like that only in very specific
>>> setups.
>>>
>>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be> wrote:
>>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
>>>>
>>>> Rgds,
>>>> Bernard
>>>>
>>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
>>>>
>>>>> Right, but probably we need journal separation at the directory level
>>>>> by default, because there is a very small amount of cases when speed
>>>>> of main storage is sufficient for journal or when resulting speed
>>>>> decrease is not significant, so journal by default may go into
>>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
>>>>> the fast disk.
>>>>>
>>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be> wrote:
>>>>>>
>>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
>>>>>>
>>>>>>> We want to standardize the locations for ceph data directories, configs,
>>>>>>> etc. We'd also like to allow a single host to run OSDs that participate
>>>>>>> in multiple ceph clusters. We'd like easy to deal with names (i.e., avoid
>>>>>>> UUIDs if we can).
>>>>>>>
>>>>>>> The metavariables are:
>>>>>>> cluster = ceph (by default)
>>>>>>> type = osd, mon, mds
>>>>>>> id = 1, foo,
>>>>>>> name = $type.$id = osd.0, mds.a, etc.
>>>>>>>
>>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
>>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
>>>>>>> device with the 'fsid =<uuid>' line in the available config files found
>>>>>>> in /etc/ceph.
>>>>>>>
>>>>>>> The locations could be:
>>>>>>>
>>>>>>> ceph config file:
>>>>>>> /etc/ceph/$cluster.conf (default is thus ceph.conf)
>>>>>>>
>>>>>>> keyring:
>>>>>>> /etc/ceph/$cluster.keyring (fallback to /etc/ceph/keyring)
>>>>>>>
>>>>>>> osd_data, mon_data:
>>>>>>> /var/lib/ceph/$cluster.$name
>>>>>>> /var/lib/ceph/$cluster/$name
>>>>>>> /var/lib/ceph/data/$cluster.$name
>>>>>>> /var/lib/ceph/$type-data/$cluster-$id
>>>>>>>
>>>>>>> TV and I talked about this today, and one thing we want is for items of a
>>>>>>> given type to live together in separate directory so that we don't have to
>>>>>>> do any filtering to, say, get all osd data directories. This suggests the
>>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
>>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
>>>>>>>
>>>>>>> Another option would be to make it
>>>>>>>
>>>>>>> /var/lib/ceph/$type-data/$id
>>>>>>>
>>>>>>> (with no $cluster) and make users override the default with something that
>>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
>>>>>>> they want multicluster nodes that don't interfere. Then we'd get
>>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
>>>>>>
>>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
>>>>>>
>>>>>> I would suggest
>>>>>>
>>>>>> /var/lib/ceph/osd/$id/data
>>>>>> and
>>>>>> /var/lib/ceph/osd/$id/journal
>>>>>>
>>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
>>>>>>
>>>>>> Rgds,
>>>>>> Bernard
>>>>>>
>>>>>>>
>>>>>>> Any other suggestions? Thoughts?
>>>>>>> sage
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>>>>
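[Editorial note: to make the expansion concrete, here is a minimal sketch of how the metavariables above ($cluster, $type, $id) compose into the proposed default paths. This is not Ceph code; the function and field names are hypothetical, and the "data" entry uses the last layout option from the list.]

```python
# Hypothetical sketch: expand the $cluster/$type/$id metavariables into
# the candidate default paths discussed above. Illustrative only.

def default_paths(cluster="ceph", type_="osd", id_="0"):
    name = f"{type_}.{id_}"  # $name = $type.$id, e.g. osd.0
    return {
        "name": name,
        "conf": f"/etc/ceph/{cluster}.conf",          # default: ceph.conf
        "keyring": f"/etc/ceph/{cluster}.keyring",    # fallback: /etc/ceph/keyring
        # last layout option: /var/lib/ceph/$type-data/$cluster-$id
        "data": f"/var/lib/ceph/{type_}-data/{cluster}-{id_}",
    }

paths = default_paths()
print(paths["conf"])   # /etc/ceph/ceph.conf
print(paths["data"])   # /var/lib/ceph/osd-data/ceph-0
```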
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: defaults paths
2012-04-05 13:00 ` Bernard Grymonpon
@ 2012-04-05 15:17 ` Sage Weil
2012-04-05 16:27 ` Bernard Grymonpon
0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2012-04-05 15:17 UTC (permalink / raw)
To: Bernard Grymonpon; +Cc: Wido den Hollander, ceph-devel
On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
> On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
>
> > On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
> >> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
> >>
> >> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
> >
> > I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
>
> If it is recommended setup to have multiple OSDs per node (like, one OSD
> per physical drive), then we need to take that in account - but don't
> assume that one node only has one SSD disk for journals, which would be
> shared between all OSDs...
>
> >
> > I'm currently using: osd data = /var/lib/ceph/$name
> >
> > To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
>
> Each osd has data and a journal... there should be some way to identify
> both...
Yes. The plan is for the chef/juju/whatever bits to do that part. For
example, the scripts triggered by udev/chef/juju would look at the GPT
labels to identify OSD disks and mount them in place. They will similarly
identify journals by matching the osd uuids and start up the daemon with
the correct journal.
The current plan is that if /var/lib/ceph/osd-data/$id/journal doesn't
exist (e.g., because we put it on another device), it will look/wait until
a journal appears. If it is present, ceph-osd can start using that.
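[Editorial note: a minimal sketch of the uuid-matching idea described above, not the actual udev/chef tooling. The data structures and helper name are assumptions; in reality the uuids would come from GPT labels and on-disk metadata.]

```python
# Illustrative sketch: pair journal devices with OSDs by comparing a
# stored osd uuid. An OSD with no matching journal is left unstarted.

def match_journals(osd_uuids, journal_uuids):
    """osd_uuids: {osd_id: uuid} read from each osd data directory;
    journal_uuids: {device: uuid} read from each journal's header.
    Returns {osd_id: device} for every uuid that matches."""
    by_uuid = {uuid: dev for dev, uuid in journal_uuids.items()}
    return {osd: by_uuid[u] for osd, u in osd_uuids.items() if u in by_uuid}

pairs = match_journals(
    {"1": "aaa-111", "2": "bbb-222"},
    {"/dev/sdb1": "aaa-111", "/dev/sdc1": "ccc-333"},
)
# osd 1 is paired with /dev/sdb1; osd 2 has no journal yet, so it
# would wait (or, per the later discussion, simply not be started).
```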
> > /var/lib/ceph/$type/$id
I like this. We were originally thinking
/var/lib/ceph/osd-data/
/var/lib/ceph/osd-journal/
/var/lib/ceph/mon-data/
but managing bind mounts or symlinks for journals seems error prone. TV's
now thinking we should just start ceph-osd with
ceph-osd --osd-journal /somewhere/else -i $id
from upstart/whatever if we have a matching journal elsewhere.
sage
> >
> > Wido
> >
> >>
> >> In my case, this doesn't really matter; it is up to the provisioning software to make the needed symlinks/mounts.
> >>
> >> Rgds,
> >> Bernard
> >>
> >> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
> >>
> >>> In ceph case, such layout breakage may be necessary in almost all
> >>> installations(except testing), comparing to almost all general-purpose
> >>> server software which need division like that only in very specific
> >>> setups.
> >>>
> >>>>>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon <bernard@openminds.be> wrote:
> >>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
> >>>>
> >>>> Rgds,
> >>>> Bernard
> >>>>
> >>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
> >>>>
> >>>>> Right, but probably we need journal separation at the directory level
> >>>>> by default, because there is a very small amount of cases when speed
> >>>>> of main storage is sufficient for journal or when resulting speed
> >>>>> decrease is not significant, so journal by default may go into
> >>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
> >>>>> the fast disk.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: defaults paths
2012-04-05 15:17 ` Sage Weil
@ 2012-04-05 16:27 ` Bernard Grymonpon
2012-04-05 16:33 ` Sage Weil
0 siblings, 1 reply; 11+ messages in thread
From: Bernard Grymonpon @ 2012-04-05 16:27 UTC (permalink / raw)
To: Sage Weil; +Cc: Wido den Hollander, ceph-devel
On 05 Apr 2012, at 17:17, Sage Weil wrote:
> On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
>> On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
>>
>>> On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
>>>> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
>>>>
>>>> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
>>>
>>> I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
>>
>> If it is recommended setup to have multiple OSDs per node (like, one OSD
>> per physical drive), then we need to take that in account - but don't
>> assume that one node only has one SSD disk for journals, which would be
>> shared between all OSDs...
>>
>>>
>>> I'm currently using: osd data = /var/lib/ceph/$name
>>>
>>> To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
>>
>> Each osd has data and a journal... there should be some way to identify
>> both...
>
> Yes. The plan is for the chef/juju/whatever bits to that part. For
> example, the scripts triggered by udev/chef/juju would look at the GPT
> labesl to identify OSD disks and mount them in place. It will similarly
> identify journals by matching the osd uuids and start up the daemon with
> the correct journal.
>
> The current plan is that if /var/lib/ceph/osd-data/$id/journal doesn't
> exist (e.g., because we put it on another device), it will look/wait until
> a journal appears. If it is present, ceph-osd can start using that.
I would suggest you fail the startup of the daemon, as it doesn't have all the needed parts - I personally don't like these "autodiscover" thingies; you never know what they are waiting/searching for...
>
>>> /var/lib/ceph/$type/$id
>
> I like this. We were originally thinking
>
> /var/lib/ceph/osd-data/
> /var/lib/ceph/osd-journal/
> /var/lib/ceph/mon-data/
>
> but managing bind mounts or symlinks for journals seems error prone. TV's
> now thinking we should just start ceph-osd with
>
> ceph-osd --osd-journal /somewhere/else -i $id
... I like this more, and I would even suggest allowing the daemon to be started just like
ceph-osd --osd-journal /somewhere --osd-data /somewhereelse --conf /etc/ceph/clustername.conf
(the config file is for the monitors)
Configuration and determining which one(s) to start is up to our deployment tools (chef in our case).
Say that we duplicate a node, for some testing/failover/... I would not want the daemon to automatically start just because the data is there...
Rgds,
Bernard
Openminds BVBA
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: defaults paths
2012-04-05 16:27 ` Bernard Grymonpon
@ 2012-04-05 16:33 ` Sage Weil
0 siblings, 0 replies; 11+ messages in thread
From: Sage Weil @ 2012-04-05 16:33 UTC (permalink / raw)
To: Bernard Grymonpon; +Cc: Wido den Hollander, ceph-devel
On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
>
> On 05 Apr 2012, at 17:17, Sage Weil wrote:
>
> > On Thu, 5 Apr 2012, Bernard Grymonpon wrote:
> >> On 05 Apr 2012, at 14:34, Wido den Hollander wrote:
> >>
> >>> On 04/05/2012 10:38 AM, Bernard Grymonpon wrote:
> >>>> I assume most OSD nodes will normally run a single OSD, so this would not apply to most nodes.
> >>>>
> >>>> Only in specific cases (where multiple OSDs run on a single node) this would come up, and these specific cases might even require to have the journals split over multiple devices (multiple ssd-disks ...)
> >>>
> >>> I think that's a wrong assumption. On most systems I think multiple OSDs will exist, it's debatable if one would run OSDs from different clusters very often.
> >>
> >> If it is recommended setup to have multiple OSDs per node (like, one OSD
> >> per physical drive), then we need to take that in account - but don't
> >> assume that one node only has one SSD disk for journals, which would be
> >> shared between all OSDs...
> >>
> >>>
> >>> I'm currently using: osd data = /var/lib/ceph/$name
> >>>
> >>> To get back to what sage mentioned, why add the "-data" suffix to a directory name? Isn't it obvious that a directory will contain data?
> >>
> >> Each osd has data and a journal... there should be some way to identify
> >> both...
> >
> > Yes. The plan is for the chef/juju/whatever bits to that part. For
> > example, the scripts triggered by udev/chef/juju would look at the GPT
> > labesl to identify OSD disks and mount them in place. It will similarly
> > identify journals by matching the osd uuids and start up the daemon with
> > the correct journal.
> >
> > The current plan is that if /var/lib/ceph/osd-data/$id/journal doesn't
> > exist (e.g., because we put it on another device), it will look/wait until
> > a journal appears. If it is present, ceph-osd can start using that.
>
> I would suggest you fail the startup of the daemon, as it doesn't have
> all the needed parts - I personally don't like these "autodiscover"
> thingies; you never know what they are waiting/searching for...
Agreed. The udev rule would not try to start ceph-osd if the journal
isn't present; ceph-osd won't be started until the journal is there.
> >
> >>> /var/lib/ceph/$type/$id
> >
> > I like this. We were originally thinking
> >
> > /var/lib/ceph/osd-data/
> > /var/lib/ceph/osd-journal/
> > /var/lib/ceph/mon-data/
> >
> > but managing bind mounts or symlinks for journals seems error prone. TV's
> > now thinking we should just start ceph-osd with
> >
> > ceph-osd --osd-journal /somewhere/else -i $id
>
> ... I like this more, and i would even suggest to allow to start the
> daemon just like
>
> ceph-osd --osd-journal /somehwere --osd-data /somewhereelse --conf
> /etc/ceph/clustername.conf
>
> (config file is for the monitors)
>
> Configuration and determining which one(s) to start is up to our
> deployment tools (chef in our case).
Yeah. Explicitly specifying osd_data isn't strictly necessary if it
matches the default, but the deployment tool could do so anyway.
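[Editorial note: a small sketch of that idea - the deployment tool always passes --osd-journal, and adds --osd-data only when it differs from the default. The flags follow the ceph-osd command line quoted earlier in the thread; the helper itself and the default path are assumptions.]

```python
# Compose the ceph-osd start command, omitting --osd-data when the
# data directory matches the assumed default layout.

def osd_start_cmd(id_, journal, data=None, default_root="/var/lib/ceph/osd-data"):
    cmd = ["ceph-osd", "--osd-journal", journal, "-i", str(id_)]
    if data and data != f"{default_root}/{id_}":
        cmd += ["--osd-data", data]  # only needed for non-default layouts
    return cmd

print(osd_start_cmd(1, "/dev/sdb1"))
# ['ceph-osd', '--osd-journal', '/dev/sdb1', '-i', '1']
```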
> Say that we duplicate a node, for some testing/failover/... I would not
> want to daemon to automatically start, just because the data is there...
I'm not sure if this is something we've looked at yet... TV?
sage
>
> Rgds,
> Bernard
> Openminds BVBA
>
>
> >
> > from upstart/whatever if we have a matching journal elsewhere.
> >
> > sage
> >
> >
> >
> >>>
> >>> Wido
> >>>
> >>>>
> >>>> In my case, this doesn't really matter, it is up to the provision software to make the needed symlinks/mounts.
> >>>>
> >>>> Rgds,
> >>>> Bernard
> >>>>
> >>>> On 05 Apr 2012, at 09:37, Andrey Korolyov wrote:
> >>>>
> >>>>> In ceph case, such layout breakage may be necessary in almost all
> >>>>> installations(except testing), comparing to almost all general-purpose
> >>>>> server software which need division like that only in very specific
> >>>>> setups.
> >>>>>
> >>>>> On Thu, Apr 5, 2012 at 11:28 AM, Bernard Grymonpon<bernard@openminds.be> wrote:
> >>>>>> I feel it's up to the sysadmin to mount / symlink the correct storage devices on the correct paths - ceph should not be concerned that some volumes might need to sit together.
> >>>>>>
> >>>>>> Rgds,
> >>>>>> Bernard
> >>>>>>
> >>>>>> On 05 Apr 2012, at 09:12, Andrey Korolyov wrote:
> >>>>>>
> >>>>>>> Right, but probably we need journal separation at the directory level
> >>>>>>> by default, because there is a very small amount of cases when speed
> >>>>>>> of main storage is sufficient for journal or when resulting speed
> >>>>>>> decrease is not significant, so journal by default may go into
> >>>>>>> /var/lib/ceph/osd/journals/$i/journal where osd/journals mounted on
> >>>>>>> the fast disk.
> >>>>>>>
> >>>>>>> On Thu, Apr 5, 2012 at 10:57 AM, Bernard Grymonpon<bernard@openminds.be> wrote:
> >>>>>>>>
> >>>>>>>> On 05 Apr 2012, at 08:32, Sage Weil wrote:
> >>>>>>>>
> >>>>>>>>> We want to standardize the locations for ceph data directories, configs,
> >>>>>>>>> etc. We'd also like to allow a single host to run OSDs that participate
> >>>>>>>>> in multiple ceph clusters. We'd like easy to deal with names (i.e., avoid
> >>>>>>>>> UUIDs if we can).
> >>>>>>>>>
> >>>>>>>>> The metavariables are:
> >>>>>>>>> cluster = ceph (by default)
> >>>>>>>>> type = osd, mon, mds
> >>>>>>>>> id = 1, foo,
> >>>>>>>>> name = $type.$id = osd.0, mds.a, etc.
> >>>>>>>>>
> >>>>>>>>> The $cluster variable will come from the command line (--cluster foo) or,
> >>>>>>>>> in the case of a udev hotplug tool or something, matching the uuid on the
> >>>>>>>>> device with the 'fsid =<uuid>' line in the available config files found
> >>>>>>>>> in /etc/ceph.
> >>>>>>>>>
> >>>>>>>>> The locations could be:
> >>>>>>>>>
> >>>>>>>>> ceph config file:
> >>>>>>>>> /etc/ceph/$cluster.conf (default is thus ceph.conf)
> >>>>>>>>>
> >>>>>>>>> keyring:
> >>>>>>>>> /etc/ceph/$cluster.keyring (fallback to /etc/ceph/keyring)
> >>>>>>>>>
> >>>>>>>>> osd_data, mon_data:
> >>>>>>>>> /var/lib/ceph/$cluster.$name
> >>>>>>>>> /var/lib/ceph/$cluster/$name
> >>>>>>>>> /var/lib/ceph/data/$cluster.$name
> >>>>>>>>> /var/lib/ceph/$type-data/$cluster-$id
> >>>>>>>>>
> >>>>>>>>> TV and I talked about this today, and one thing we want is for items of a
> >>>>>>>>> given type to live together in separate directory so that we don't have to
> >>>>>>>>> do any filtering to, say, get all osd data directories. This suggests the
> >>>>>>>>> last option (/var/lib/ceph/osd-data/ceph-1,
> >>>>>>>>> /var/lib/ceph/mon-data/ceph-foo, etc.), but it's kind of fugly.
> >>>>>>>>>
> >>>>>>>>> Another option would be to make it
> >>>>>>>>>
> >>>>>>>>> /var/lib/ceph/$type-data/$id
> >>>>>>>>>
> >>>>>>>>> (with no $cluster) and make users override the default with something that
> >>>>>>>>> includes $cluster (or $fsid, or whatever) in their $cluster.conf if/when
> >>>>>>>>> they want multicluster nodes that don't interfere. Then we'd get
> >>>>>>>>> /var/lib/ceph/osd-data/1 for non-crazy people, which is pretty easy.
> >>>>>>>>
> >>>>>>>> As a osd consists of data and the journal, it should stay together, with all info for that one osd in one place:
> >>>>>>>>
> >>>>>>>> I would suggest
> >>>>>>>>
> >>>>>>>> /var/lib/ceph/osd/$id/data
> >>>>>>>> and
> >>>>>>>> /var/lib/ceph/osd/$id/journal
> >>>>>>>>
> >>>>>>>> ($id could be replaced by $uuid or $name, for which I would prefer $uuid)
> >>>>>>>>
> >>>>>>>> Rgds,
> >>>>>>>> Bernard
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Any other suggestions? Thoughts?
> >>>>>>>>> sage
> >>>>>>>>> --
> >>>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>>> --
> >>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>>>
> >>>>>>
> >>>>> --
> >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>>> the body of a message to majordomo@vger.kernel.org
> >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>>>
> >>>>
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>> the body of a message to majordomo@vger.kernel.org
> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>
> >>
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2012-04-05 16:34 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-05 6:32 defaults paths Sage Weil
2012-04-05 6:57 ` Bernard Grymonpon
2012-04-05 7:12 ` Andrey Korolyov
2012-04-05 7:28 ` Bernard Grymonpon
2012-04-05 7:37 ` Andrey Korolyov
2012-04-05 8:38 ` Bernard Grymonpon
2012-04-05 12:34 ` Wido den Hollander
2012-04-05 13:00 ` Bernard Grymonpon
2012-04-05 15:17 ` Sage Weil
2012-04-05 16:27 ` Bernard Grymonpon
2012-04-05 16:33 ` Sage Weil