* Re: EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes
@ 2017-06-16 18:11 Warren Wang - ISD
       [not found] ` <A5E80C1D-74F4-455B-8257-5B9E2FF6AB39-dFwxUrggiyBBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Warren Wang - ISD @ 2017-06-16 18:11 UTC (permalink / raw)
  To: Alfredo Deza, ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-devel

I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.

Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.

2 cents from one of the LVM for Ceph users,
Warren Wang
Walmart ✻

On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:

    Hello,
    
    At the last CDM [0] we talked about `ceph-lvm` and the ability to
    deploy OSDs from logical volumes. We have now an initial draft for the
    documentation [1] and would like some feedback.
    
    The important features for this new tool are:
    
    * parting ways with udev (new approach will rely on LVM functionality
    for discovery)
    * compatibility/migration for existing LVM volumes deployed as directories
    * dmcache support
    
    By documenting the API and workflows first we are making sure that
    those look fine before starting on actual development.
    
    It would be great to get some feedback, specially if you are currently
    using LVM with ceph (or planning to!).
    
    Please note that the documentation is not complete and is missing
    content on some parts.
    
    [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
    [1] http://docs.ceph.com/ceph-lvm/
    _______________________________________________
    ceph-users mailing list
    ceph-users@lists.ceph.com
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
    

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes
       [not found] ` <A5E80C1D-74F4-455B-8257-5B9E2FF6AB39-dFwxUrggiyBBDgjK7y7TUQ@public.gmane.org>
@ 2017-06-16 18:23   ` Alfredo Deza
  2017-06-16 18:42     ` EXT: [ceph-users] " Sage Weil
       [not found]     ` <CAC-Np1yqCZO7CzEjTV+hrNve857BtM_rZ+LxAi6Vf9UJkPM04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 2 replies; 23+ messages in thread
From: Alfredo Deza @ 2017-06-16 18:23 UTC (permalink / raw)
  To: Warren Wang - ISD; +Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-devel

On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
<Warren.Wang@walmart.com> wrote:
> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>
> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.

Sage, you had mentioned the need for "composable" tools for this, and
I think that if we go with `ceph-volume` we could allow plugins for
each strategy. We are starting with `lvm` support so that would look
like: `ceph-volume lvm`

The `lvm` functionality could be implemented as a plugin itself, and
when we start working on supporting regular disks, `ceph-volume
disk` can come along, etc.

It would also open the door for anyone to be able to write a plugin to
`ceph-volume` to implement their own logic, while at the same time
re-using most of what we are implementing today: logging, reporting,
systemd support, OSD metadata, etc...

If we were to separate these into single-purpose tools, all those
would need to be re-done.
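
To make that concrete, here is a rough sketch (purely illustrative:
the entry-point group, names and behaviour below are assumptions on
my part, not an existing ceph-volume API) of how a single
`ceph-volume` entrypoint could dispatch subcommands to registered
plugins:

    # Hypothetical sketch: a single `ceph-volume` executable that looks up
    # device-type handlers ("lvm" now, perhaps "disk" or "zfs" later) as
    # plugins, so external packages could register their own logic while
    # reusing the shared pieces (logging, reporting, systemd handling,
    # OSD metadata). The entry-point group name is invented.
    import sys
    from pkg_resources import iter_entry_points  # setuptools

    def load_plugins(group='ceph_volume.handlers'):
        # Each plugin exposes a callable that takes the remaining argv.
        return {ep.name: ep.load() for ep in iter_entry_points(group)}

    def main(argv=None):
        argv = sys.argv[1:] if argv is None else argv
        plugins = load_plugins()
        if not argv or argv[0] not in plugins:
            print('usage: ceph-volume {%s} ...' % ','.join(sorted(plugins)))
            return 1
        # e.g. `ceph-volume lvm prepare ...` dispatches to the "lvm" plugin
        return plugins[argv[0]](argv[1:])

    if __name__ == '__main__':
        sys.exit(main())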


>
> 2 cents from one of the LVM for Ceph users,
> Warren Wang
> Walmart ✻
>
> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>
>     Hello,
>
>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>     deploy OSDs from logical volumes. We have now an initial draft for the
>     documentation [1] and would like some feedback.
>
>     The important features for this new tool are:
>
>     * parting ways with udev (new approach will rely on LVM functionality
>     for discovery)
>     * compatibility/migration for existing LVM volumes deployed as directories
>     * dmcache support
>
>     By documenting the API and workflows first we are making sure that
>     those look fine before starting on actual development.
>
>     It would be great to get some feedback, specially if you are currently
>     using LVM with ceph (or planning to!).
>
>     Please note that the documentation is not complete and is missing
>     content on some parts.
>
>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>     [1] http://docs.ceph.com/ceph-lvm/
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@lists.ceph.com
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-16 18:23   ` Alfredo Deza
@ 2017-06-16 18:42     ` Sage Weil
       [not found]     ` <CAC-Np1yqCZO7CzEjTV+hrNve857BtM_rZ+LxAi6Vf9UJkPM04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 0 replies; 23+ messages in thread
From: Sage Weil @ 2017-06-16 18:42 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: Warren Wang - ISD, ceph-users, ceph-devel


On Fri, 16 Jun 2017, Alfredo Deza wrote:
> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
> <Warren.Wang@walmart.com> wrote:
> > I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
> >
> > Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
> 
> Sage, you had mentioned the need for "composable" tools for this, and
> I think that if we go with `ceph-volume` we could allow plugins for
> each strategy. We are starting with `lvm` support so that would look
> like: `ceph-volume lvm`
> 
> The `lvm` functionality could be implemented as a plugin itself, and
> when we start working with supporting regular disks, then `ceph-volume
> disk` can come along, etc...
> 
> It would also open the door for anyone to be able to write a plugin to
> `ceph-volume` to implement their own logic, while at the same time
> re-using most of what we are implementing today: logging, reporting,
> systemd support, OSD metadata, etc...
> 
> If we were to separate these into single-purpose tools, all those
> would need to be re-done.

Yeah, this sounds great to me!

sage


> 
> 
> >
> > 2 cents from one of the LVM for Ceph users,
> > Warren Wang
> > Walmart ✻
> >
> > On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
> >
> >     Hello,
> >
> >     At the last CDM [0] we talked about `ceph-lvm` and the ability to
> >     deploy OSDs from logical volumes. We have now an initial draft for the
> >     documentation [1] and would like some feedback.
> >
> >     The important features for this new tool are:
> >
> >     * parting ways with udev (new approach will rely on LVM functionality
> >     for discovery)
> >     * compatibility/migration for existing LVM volumes deployed as directories
> >     * dmcache support
> >
> >     By documenting the API and workflows first we are making sure that
> >     those look fine before starting on actual development.
> >
> >     It would be great to get some feedback, specially if you are currently
> >     using LVM with ceph (or planning to!).
> >
> >     Please note that the documentation is not complete and is missing
> >     content on some parts.
> >
> >     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
> >     [1] http://docs.ceph.com/ceph-lvm/
> >     _______________________________________________
> >     ceph-users mailing list
> >     ceph-users@lists.ceph.com
> >     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


* Re: EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes
       [not found]     ` <CAC-Np1yqCZO7CzEjTV+hrNve857BtM_rZ+LxAi6Vf9UJkPM04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-16 19:37       ` Willem Jan Withagen
  2017-06-19 13:27       ` John Spray
  1 sibling, 0 replies; 23+ messages in thread
From: Willem Jan Withagen @ 2017-06-16 19:37 UTC (permalink / raw)
  To: Alfredo Deza, Warren Wang - ISD
  Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-devel

On 16-6-2017 20:23, Alfredo Deza wrote:
> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
> <Warren.Wang@walmart.com> wrote:
>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>
>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
> 
> Sage, you had mentioned the need for "composable" tools for this, and
> I think that if we go with `ceph-volume` we could allow plugins for
> each strategy. We are starting with `lvm` support so that would look
> like: `ceph-volume lvm`
> 
> The `lvm` functionality could be implemented as a plugin itself, and
> when we start working with supporting regular disks, then `ceph-volume
> disk` can come along, etc...
> 
> It would also open the door for anyone to be able to write a plugin to
> `ceph-volume` to implement their own logic, while at the same time
> re-using most of what we are implementing today: logging, reporting,
> systemd support, OSD metadata, etc...
> 
> If we were to separate these into single-purpose tools, all those
> would need to be re-done.

Looking at the work I have already done on ceph-disk for FreeBSD, it
starts to look like something like `ceph-volume zfs`. That would make
porting a lot more manageable. But the composable decomposition needs
to sit at a high enough level that as few Linux-isms as possible seep
through. The one that springs to mind here is encryption.
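
As a purely illustrative sketch of that (every name below is invented,
not anything that exists today), the shared layer could define a small
backend interface, with platform-bound details such as encryption kept
inside each backend:

    # Sketch only: a tiny backend interface so that lvm/zfs/disk handlers
    # share the same surface while Linux-isms (dm-crypt) or FreeBSD-isms
    # (geli) stay inside the backend that needs them.
    from abc import ABC, abstractmethod

    class DeviceBackend(ABC):
        name = 'abstract'

        @abstractmethod
        def prepare(self, device, osd_id=None):
            """Create whatever the backend needs for a new OSD on `device`."""

        @abstractmethod
        def activate(self, device):
            """Bring an already-prepared OSD up (mount it, start the service)."""

        def encrypt(self, device, key):
            """Optional hook; backends without encryption simply refuse."""
            raise NotImplementedError('%s: encryption not supported' % self.name)

    class ZFSBackend(DeviceBackend):
        name = 'zfs'

        def prepare(self, device, osd_id=None):
            print('would create a zpool/dataset for an OSD on', device)

        def activate(self, device):
            print('would mount the dataset and start the OSD on', device)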

So +1

--WjW
> 
> 
>>
>> 2 cents from one of the LVM for Ceph users,
>> Warren Wang
>> Walmart ✻
>>
>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>
>>     Hello,
>>
>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>     documentation [1] and would like some feedback.
>>
>>     The important features for this new tool are:
>>
>>     * parting ways with udev (new approach will rely on LVM functionality
>>     for discovery)
>>     * compatibility/migration for existing LVM volumes deployed as directories
>>     * dmcache support
>>
>>     By documenting the API and workflows first we are making sure that
>>     those look fine before starting on actual development.
>>
>>     It would be great to get some feedback, specially if you are currently
>>     using LVM with ceph (or planning to!).
>>
>>     Please note that the documentation is not complete and is missing
>>     content on some parts.
>>
>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>     [1] http://docs.ceph.com/ceph-lvm/
>>     _______________________________________________
>>     ceph-users mailing list
>>     ceph-users@lists.ceph.com
>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes
       [not found]     ` <CAC-Np1yqCZO7CzEjTV+hrNve857BtM_rZ+LxAi6Vf9UJkPM04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-06-16 19:37       ` EXT: " Willem Jan Withagen
@ 2017-06-19 13:27       ` John Spray
       [not found]         ` <CALe9h7cV6U3A_OT9R8tv_yPoGG9zFaWF3qXV5cwYK0KM-NDu4g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: John Spray @ 2017-06-19 13:27 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw

On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
> <Warren.Wang@walmart.com> wrote:
>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>
>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
>
> Sage, you had mentioned the need for "composable" tools for this, and
> I think that if we go with `ceph-volume` we could allow plugins for
> each strategy. We are starting with `lvm` support so that would look
> like: `ceph-volume lvm`
>
> The `lvm` functionality could be implemented as a plugin itself, and
> when we start working with supporting regular disks, then `ceph-volume
> disk` can come along, etc...
>
> It would also open the door for anyone to be able to write a plugin to
> `ceph-volume` to implement their own logic, while at the same time
> re-using most of what we are implementing today: logging, reporting,
> systemd support, OSD metadata, etc...
>
> If we were to separate these into single-purpose tools, all those
> would need to be re-done.

Couple of thoughts:
 - let's keep this in the Ceph repository unless there's a strong
reason not to -- it'll enable the tool's branching to automatically
happen in line with Ceph's.
 - I agree with others that a single entrypoint (i.e. executable) will
be more manageable than having conspicuously separate tools, but we
shouldn't worry too much about making things "plugins" as such -- they
can just be distinct code inside one tool, sharing as much or as
little as they need.

What if we delivered this set of LVM functionality as "ceph-disk lvm
..." commands to minimise the impression that the tooling is changing,
even if internally it's all new/distinct code?

At the risk of being a bit picky about language, I don't like calling
this anything with "volume" in the name, because afaik we've never
ever called OSDs or the drives they occupy "volumes", so we're
introducing a whole new noun, and a widely used (to mean different
things) one at that.

John

>
>
>>
>> 2 cents from one of the LVM for Ceph users,
>> Warren Wang
>> Walmart ✻
>>
>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>
>>     Hello,
>>
>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>     documentation [1] and would like some feedback.
>>
>>     The important features for this new tool are:
>>
>>     * parting ways with udev (new approach will rely on LVM functionality
>>     for discovery)
>>     * compatibility/migration for existing LVM volumes deployed as directories
>>     * dmcache support
>>
>>     By documenting the API and workflows first we are making sure that
>>     those look fine before starting on actual development.
>>
>>     It would be great to get some feedback, specially if you are currently
>>     using LVM with ceph (or planning to!).
>>
>>     Please note that the documentation is not complete and is missing
>>     content on some parts.
>>
>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>     [1] http://docs.ceph.com/ceph-lvm/
>>     _______________________________________________
>>     ceph-users mailing list
>>     ceph-users@lists.ceph.com
>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes
       [not found]         ` <CALe9h7cV6U3A_OT9R8tv_yPoGG9zFaWF3qXV5cwYK0KM-NDu4g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-19 14:13           ` Alfredo Deza
       [not found]             ` <CAC-Np1yiRgkmhZCOij9qSBmqUo-YYtErWXk2gevYuvWKrYFyeg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-06-19 16:55             ` John Spray
  0 siblings, 2 replies; 23+ messages in thread
From: Alfredo Deza @ 2017-06-19 14:13 UTC (permalink / raw)
  To: John Spray; +Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw

On Mon, Jun 19, 2017 at 9:27 AM, John Spray <jspray@redhat.com> wrote:
> On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
>> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
>> <Warren.Wang@walmart.com> wrote:
>>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>>
>>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
>>
>> Sage, you had mentioned the need for "composable" tools for this, and
>> I think that if we go with `ceph-volume` we could allow plugins for
>> each strategy. We are starting with `lvm` support so that would look
>> like: `ceph-volume lvm`
>>
>> The `lvm` functionality could be implemented as a plugin itself, and
>> when we start working with supporting regular disks, then `ceph-volume
>> disk` can come along, etc...
>>
>> It would also open the door for anyone to be able to write a plugin to
>> `ceph-volume` to implement their own logic, while at the same time
>> re-using most of what we are implementing today: logging, reporting,
>> systemd support, OSD metadata, etc...
>>
>> If we were to separate these into single-purpose tools, all those
>> would need to be re-done.
>
> Couple of thoughts:
>  - let's keep this in the Ceph repository unless there's a strong
> reason not to -- it'll enable the tool's branching to automatically
> happen in line with Ceph's.

For initial development this is easier to have as a separate tool from
the Ceph source tree. There are some niceties about being in-source,
like not being required to deal with which features we support on
which version.

Although there is no code yet, I consider the project to be in an
"unstable" state: it will move incredibly fast (it has to!), and that
puts it at odds with Ceph's release cadence. Specifically, these two
things are very important right now:

* faster release cycles
* easier and faster to test

I am not ruling out going into Ceph at some point though, ideally when
things slow down and become stable.

Is your argument only to have parity in Ceph's branching? That was
never a problem with out-of-tree tools like ceph-deploy for example.

>  - I agree with others that a single entrypoint (i.e. executable) will
> be more manageable than having conspicuously separate tools, but we
> shouldn't worry too much about making things "plugins" as such -- they
> can just be distinct code inside one tool, sharing as much or as
> little as they need.
>
> What if we delivered this set of LVM functionality as "ceph-disk lvm
> ..." commands to minimise the impression that the tooling is changing,
> even if internally it's all new/distinct code?

That sounded appealing initially, but because we are introducing a
very different API, it would look odd to interact with other
subcommands without a normalized interaction. For example, for
'prepare' this would be:

ceph-disk prepare [...]

And for LVM it would possibly be:

ceph-disk lvm prepare [...]

The level at which these similar actions are presented implies that
one may be the preferred (or even default) one, while the other one
isn't.

At some point we are going to add regular disk workflows (replacing
ceph-disk functionality), and then it would become even more confusing
to keep it there (or do you think we could split at that point?)

>
> At the risk of being a bit picky about language, I don't like calling
> this anything with "volume" in the name, because afaik we've never
> ever called OSDs or the drives they occupy "volumes", so we're
> introducing a whole new noun, and a widely used (to mean different
> things) one at that.
>

We have never called them 'volumes' because there was never anything
to support other than regular disks; the approach has always been
disks and partitions.

A "volume" can be a physical volume (e.g. a disk) or a logical one
(lvm, dmcache). It is an all-encompassing name to allow different
device-like to work with.


> John
>
>>
>>
>>>
>>> 2 cents from one of the LVM for Ceph users,
>>> Warren Wang
>>> Walmart ✻
>>>
>>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>>
>>>     Hello,
>>>
>>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>>     documentation [1] and would like some feedback.
>>>
>>>     The important features for this new tool are:
>>>
>>>     * parting ways with udev (new approach will rely on LVM functionality
>>>     for discovery)
>>>     * compatibility/migration for existing LVM volumes deployed as directories
>>>     * dmcache support
>>>
>>>     By documenting the API and workflows first we are making sure that
>>>     those look fine before starting on actual development.
>>>
>>>     It would be great to get some feedback, specially if you are currently
>>>     using LVM with ceph (or planning to!).
>>>
>>>     Please note that the documentation is not complete and is missing
>>>     content on some parts.
>>>
>>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>>     [1] http://docs.ceph.com/ceph-lvm/
>>>     _______________________________________________
>>>     ceph-users mailing list
>>>     ceph-users@lists.ceph.com
>>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes
       [not found]             ` <CAC-Np1yiRgkmhZCOij9qSBmqUo-YYtErWXk2gevYuvWKrYFyeg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-19 15:37               ` Willem Jan Withagen
  2017-06-19 17:57                 ` EXT: [ceph-users] " Alfredo Deza
  0 siblings, 1 reply; 23+ messages in thread
From: Willem Jan Withagen @ 2017-06-19 15:37 UTC (permalink / raw)
  To: Alfredo Deza, John Spray; +Cc: ceph-devel, ceph-users-idqoXFIVOFJgJs9I8MT0rw

On 19-6-2017 16:13, Alfredo Deza wrote:
> On Mon, Jun 19, 2017 at 9:27 AM, John Spray <jspray@redhat.com> wrote:
>> On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
>>> <Warren.Wang@walmart.com> wrote:
>>>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>>>
>>>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
>>>
>>> Sage, you had mentioned the need for "composable" tools for this, and
>>> I think that if we go with `ceph-volume` we could allow plugins for
>>> each strategy. We are starting with `lvm` support so that would look
>>> like: `ceph-volume lvm`
>>>
>>> The `lvm` functionality could be implemented as a plugin itself, and
>>> when we start working with supporting regular disks, then `ceph-volume
>>> disk` can come along, etc...
>>>
>>> It would also open the door for anyone to be able to write a plugin to
>>> `ceph-volume` to implement their own logic, while at the same time
>>> re-using most of what we are implementing today: logging, reporting,
>>> systemd support, OSD metadata, etc...
>>>
>>> If we were to separate these into single-purpose tools, all those
>>> would need to be re-done.
>>
>> Couple of thoughts:
>>  - let's keep this in the Ceph repository unless there's a strong
>> reason not to -- it'll enable the tool's branching to automatically
>> happen in line with Ceph's.
> 
> For initial development this is easier to have as a separate tool from
> the Ceph source tree. There are some niceties about being in-source,
> like
> not being required to deal with what features we are supporting on what version.

Just my observation, need not be true at all, but ...

As long as you do not have it interact with the other tools, that is
true. But as soon as you start depending on ceph-{disk-new,volume} in
other parts of the mainstream Ceph code, you have created a tie-in
with the versioning and will require it to be maintained in the same
way.


> Although there is no code yet, I consider the project in an "unstable"
> state, it will move incredibly fast (it has to!) and that puts it at
> odds with the cadence
> of Ceph. Specifically, these two things are very important right now:
> 
> * faster release cycles
> * easier and faster to test
> 
> I am not ruling out going into Ceph at some point though, ideally when
> things slow down and become stable.
> 
> Is your argument only to have parity in Ceph's branching? That was
> never a problem with out-of-tree tools like ceph-deploy for example.

Some of the external targets move so fast (ceph-ansible) that I have
given up on trying to follow what is going on. For this tool I'd like
to do the ZFS/FreeBSD stuff as a plugin module, in the expectation
that it will supersede the current ceph-disk; otherwise there are two
places to maintain this type of code.

>>  - I agree with others that a single entrypoint (i.e. executable) will
>> be more manageable than having conspicuously separate tools, but we
>> shouldn't worry too much about making things "plugins" as such -- they
>> can just be distinct code inside one tool, sharing as much or as
>> little as they need.
>>
>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>> ..." commands to minimise the impression that the tooling is changing,
>> even if internally it's all new/distinct code?
> 
> That sounded appealing initially, but because we are introducing a
> very different API, it would look odd to interact
> with other subcommands without a normalized interaction. For example,
> for 'prepare' this would be:
> 
> ceph-disk prepare [...]
> 
> And for LVM it would possible be
> 
> ceph-disk lvm prepare [...]
> 
> The level at which these similar actions are presented imply that one
> may be a preferred (or even default) one, while the other one
> isn't.

Is this about API "cosmetics"? Because there are a lot of examples,
suggestions and other material out there that still use the old
syntax.

And why not do a hybrid? It would require a bit more command-line
parsing, but that is not a major dealbreaker.

So the line would look like
    ceph-disk [lvm,zfs,disk,partition] prepare [...]
with the first parameter optional, reverting to the currently
supported systems when it is omitted.

You can always start warning users that their API usage is old style,
and that it is going to go away in a future release.
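
A rough sketch of that hybrid parsing, assuming Python's argparse (the
backend list, the legacy default and the warning text are illustrative
only, not the actual ceph-disk interface):

    # Sketch: optional backend as the first argument, falling back to the
    # current behaviour with a deprecation warning when it is omitted.
    import argparse
    import sys
    import warnings

    BACKENDS = ('lvm', 'zfs', 'disk', 'partition')

    def parse_args(argv):
        argv = list(argv)
        if argv and argv[0] in BACKENDS:
            backend, argv = argv[0], argv[1:]
        else:
            backend = 'disk'  # revert to today's behaviour
            warnings.warn('implicit backend is deprecated; use '
                          '"ceph-disk <backend> prepare ..." instead',
                          DeprecationWarning)
        parser = argparse.ArgumentParser(prog='ceph-disk %s' % backend)
        sub = parser.add_subparsers(dest='action', required=True)
        prepare = sub.add_parser('prepare')
        prepare.add_argument('device')
        args = parser.parse_args(argv)
        args.backend = backend
        return args

    if __name__ == '__main__':
        print(parse_args(sys.argv[1:]))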

> At one point we are going to add regular disk worfklows (replacing
> ceph-disk functionality) and then it would become even more
> confusing to keep it there (or do you think at that point we could split?)

The more separate you go, the more awkward it is going to be when
things eventually have to merge back together.

>> At the risk of being a bit picky about language, I don't like calling
>> this anything with "volume" in the name, because afaik we've never
>> ever called OSDs or the drives they occupy "volumes", so we're
>> introducing a whole new noun, and a widely used (to mean different
>> things) one at that.
>>
> 
> We have never called them 'volumes' because there was never anything
> to support something other than regular disks, the approach
> has always been disks and partitions.
> 
> A "volume" can be a physical volume (e.g. a disk) or a logical one
> (lvm, dmcache). It is an all-encompassing name to allow different
> device-like to work with.

ZFS talks about volumes, vdevs, partitions, and perhaps even more.
Being picky: ceph-disk now also works on pre-built trees to build
filestore.

I would just try to glue it into ceph-disk in the most flexible way
possible.
--WjW

> 
> 
>> John
>>
>>>
>>>
>>>>
>>>> 2 cents from one of the LVM for Ceph users,
>>>> Warren Wang
>>>> Walmart ✻
>>>>
>>>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>>>
>>>>     Hello,
>>>>
>>>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>>>     documentation [1] and would like some feedback.
>>>>
>>>>     The important features for this new tool are:
>>>>
>>>>     * parting ways with udev (new approach will rely on LVM functionality
>>>>     for discovery)
>>>>     * compatibility/migration for existing LVM volumes deployed as directories
>>>>     * dmcache support
>>>>
>>>>     By documenting the API and workflows first we are making sure that
>>>>     those look fine before starting on actual development.
>>>>
>>>>     It would be great to get some feedback, specially if you are currently
>>>>     using LVM with ceph (or planning to!).
>>>>
>>>>     Please note that the documentation is not complete and is missing
>>>>     content on some parts.
>>>>
>>>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>>>     [1] http://docs.ceph.com/ceph-lvm/
>>>>     _______________________________________________
>>>>     ceph-users mailing list
>>>>     ceph-users@lists.ceph.com
>>>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 14:13           ` Alfredo Deza
       [not found]             ` <CAC-Np1yiRgkmhZCOij9qSBmqUo-YYtErWXk2gevYuvWKrYFyeg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-19 16:55             ` John Spray
  2017-06-19 17:53               ` Alfredo Deza
  1 sibling, 1 reply; 23+ messages in thread
From: John Spray @ 2017-06-19 16:55 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: Warren Wang - ISD, ceph-users, ceph-devel

On Mon, Jun 19, 2017 at 3:13 PM, Alfredo Deza <adeza@redhat.com> wrote:
> On Mon, Jun 19, 2017 at 9:27 AM, John Spray <jspray@redhat.com> wrote:
>> On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
>>> <Warren.Wang@walmart.com> wrote:
>>>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>>>
>>>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
>>>
>>> Sage, you had mentioned the need for "composable" tools for this, and
>>> I think that if we go with `ceph-volume` we could allow plugins for
>>> each strategy. We are starting with `lvm` support so that would look
>>> like: `ceph-volume lvm`
>>>
>>> The `lvm` functionality could be implemented as a plugin itself, and
>>> when we start working with supporting regular disks, then `ceph-volume
>>> disk` can come along, etc...
>>>
>>> It would also open the door for anyone to be able to write a plugin to
>>> `ceph-volume` to implement their own logic, while at the same time
>>> re-using most of what we are implementing today: logging, reporting,
>>> systemd support, OSD metadata, etc...
>>>
>>> If we were to separate these into single-purpose tools, all those
>>> would need to be re-done.
>>
>> Couple of thoughts:
>>  - let's keep this in the Ceph repository unless there's a strong
>> reason not to -- it'll enable the tool's branching to automatically
>> happen in line with Ceph's.
>
> For initial development this is easier to have as a separate tool from
> the Ceph source tree. There are some niceties about being in-source,
> like
> not being required to deal with what features we are supporting on what version.
>
> Although there is no code yet, I consider the project in an "unstable"
> state, it will move incredibly fast (it has to!) and that puts it at
> odds with the cadence
> of Ceph. Specifically, these two things are very important right now:
>
> * faster release cycles
> * easier and faster to test

I think having one part of Ceph on a different release cycle to the
rest of Ceph is an even more dramatic thing than having it in a
separate git repository.

It seems like there is some dissatisfaction with how the Ceph project
as a whole is doing things that is driving you to try and do work
outside of the repo where the rest of the project lives -- if the
release cycles or test infrastructure within Ceph are not adequate for
the tool that formats drives for OSDs, what can we do to fix them?

> I am not ruling out going into Ceph at some point though, ideally when
> things slow down and become stable.

I think that the decision about where this code lives needs to be made
before it is released -- moving it later is rather awkward.  If you'd
rather not have the code in Ceph master until you're happy with it,
then a branch would be the natural way to do that.

> Is your argument only to have parity in Ceph's branching? That was
> never a problem with out-of-tree tools like ceph-deploy for example.

I guess my argument isn't so much an argument as it is an assertion
that if you want to go your own way then you need to have a really
strong, clear reason.

Put a bit bluntly: if CephFS, RBD, RGW, the mon and the OSD can all
successfully co-habit in one git repository, what makes the CLI that
formats drives so special that it needs its own?

>>  - I agree with others that a single entrypoint (i.e. executable) will
>> be more manageable than having conspicuously separate tools, but we
>> shouldn't worry too much about making things "plugins" as such -- they
>> can just be distinct code inside one tool, sharing as much or as
>> little as they need.
>>
>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>> ..." commands to minimise the impression that the tooling is changing,
>> even if internally it's all new/distinct code?
>
> That sounded appealing initially, but because we are introducing a
> very different API, it would look odd to interact
> with other subcommands without a normalized interaction. For example,
> for 'prepare' this would be:
>
> ceph-disk prepare [...]
>
> And for LVM it would possible be
>
> ceph-disk lvm prepare [...]
>
> The level at which these similar actions are presented imply that one
> may be a preferred (or even default) one, while the other one
> isn't.
>
> At one point we are going to add regular disk worfklows (replacing
> ceph-disk functionality) and then it would become even more
> confusing to keep it there (or do you think at that point we could split?)
>
>>
>> At the risk of being a bit picky about language, I don't like calling
>> this anything with "volume" in the name, because afaik we've never
>> ever called OSDs or the drives they occupy "volumes", so we're
>> introducing a whole new noun, and a widely used (to mean different
>> things) one at that.
>>
>
> We have never called them 'volumes' because there was never anything
> to support something other than regular disks, the approach
> has always been disks and partitions.
>
> A "volume" can be a physical volume (e.g. a disk) or a logical one
> (lvm, dmcache). It is an all-encompassing name to allow different
> device-like to work with.

The trouble with "volume" is that it means so many things in so many
different storage systems -- I haven't often seen it used to mean
"block device" or "drive".  It's more often used to describe a logical
entity.  I also think "disk" is fine -- most people get the idea that
a disk is a hard drive but it could also be any block device.

John

>
>
>> John
>>
>>>
>>>
>>>>
>>>> 2 cents from one of the LVM for Ceph users,
>>>> Warren Wang
>>>> Walmart ✻
>>>>
>>>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>>>
>>>>     Hello,
>>>>
>>>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>>>     documentation [1] and would like some feedback.
>>>>
>>>>     The important features for this new tool are:
>>>>
>>>>     * parting ways with udev (new approach will rely on LVM functionality
>>>>     for discovery)
>>>>     * compatibility/migration for existing LVM volumes deployed as directories
>>>>     * dmcache support
>>>>
>>>>     By documenting the API and workflows first we are making sure that
>>>>     those look fine before starting on actual development.
>>>>
>>>>     It would be great to get some feedback, specially if you are currently
>>>>     using LVM with ceph (or planning to!).
>>>>
>>>>     Please note that the documentation is not complete and is missing
>>>>     content on some parts.
>>>>
>>>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>>>     [1] http://docs.ceph.com/ceph-lvm/
>>>>     _______________________________________________
>>>>     ceph-users mailing list
>>>>     ceph-users@lists.ceph.com
>>>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 16:55             ` John Spray
@ 2017-06-19 17:53               ` Alfredo Deza
  2017-06-19 18:41                 ` Andrew Schoen
  2017-06-19 20:24                 ` Fwd: " John Spray
  0 siblings, 2 replies; 23+ messages in thread
From: Alfredo Deza @ 2017-06-19 17:53 UTC (permalink / raw)
  To: John Spray; +Cc: Warren Wang - ISD, ceph-users, ceph-devel

On Mon, Jun 19, 2017 at 12:55 PM, John Spray <jspray@redhat.com> wrote:
> On Mon, Jun 19, 2017 at 3:13 PM, Alfredo Deza <adeza@redhat.com> wrote:
>> On Mon, Jun 19, 2017 at 9:27 AM, John Spray <jspray@redhat.com> wrote:
>>> On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>>> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
>>>> <Warren.Wang@walmart.com> wrote:
>>>>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>>>>
>>>>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
>>>>
>>>> Sage, you had mentioned the need for "composable" tools for this, and
>>>> I think that if we go with `ceph-volume` we could allow plugins for
>>>> each strategy. We are starting with `lvm` support so that would look
>>>> like: `ceph-volume lvm`
>>>>
>>>> The `lvm` functionality could be implemented as a plugin itself, and
>>>> when we start working with supporting regular disks, then `ceph-volume
>>>> disk` can come along, etc...
>>>>
>>>> It would also open the door for anyone to be able to write a plugin to
>>>> `ceph-volume` to implement their own logic, while at the same time
>>>> re-using most of what we are implementing today: logging, reporting,
>>>> systemd support, OSD metadata, etc...
>>>>
>>>> If we were to separate these into single-purpose tools, all those
>>>> would need to be re-done.
>>>
>>> Couple of thoughts:
>>>  - let's keep this in the Ceph repository unless there's a strong
>>> reason not to -- it'll enable the tool's branching to automatically
>>> happen in line with Ceph's.
>>
>> For initial development this is easier to have as a separate tool from
>> the Ceph source tree. There are some niceties about being in-source,
>> like
>> not being required to deal with what features we are supporting on what version.
>>
>> Although there is no code yet, I consider the project in an "unstable"
>> state, it will move incredibly fast (it has to!) and that puts it at
>> odds with the cadence
>> of Ceph. Specifically, these two things are very important right now:
>>
>> * faster release cycles
>> * easier and faster to test
>
> I think having one part of Ceph on a different release cycle to the
> rest of Ceph is an even more dramatic thing than having it in a
> separate git repository.
>
> It seems like there is some dissatisfaction with how the Ceph project
> as whole is doing things that is driving you to try and do work
> outside of the repo where the rest of the project lives -- if the
> release cycles or test infrastructure within Ceph are not adequate for
> the tool that formats drives for OSDs, what can we do to fix them?

It isn't Ceph the project :)

Not every tool for Ceph has to come from ceph.git, in which case the
argument could be flipped around: why aren't ceph-installer,
ceph-ansible, ceph-deploy, radosgw-agent, etc. all coming from within
ceph.git?

They don't necessarily need to be tied in. In the case of
ceph-installer: there is nothing Ceph-specific it needs from ceph.git
to run, so why force it in?

>
>> I am not ruling out going into Ceph at some point though, ideally when
>> things slow down and become stable.
>
> I think that the decision about where this code lives needs to be made
> before it is released -- moving it later is rather awkward.  If you'd
> rather not have the code in Ceph master until you're happy with it,
> then a branch would be the natural way to do that.
>

The decision was made a few weeks ago, and I really don't think we
should be in ceph.git, but I am OK with continuing to discuss the
reasoning.

>> Is your argument only to have parity in Ceph's branching? That was
>> never a problem with out-of-tree tools like ceph-deploy for example.
>
> I guess my argument isn't so much an argument as it is an assertion
> that if you want to go your own way then you need to have a really
> strong clear reason.

Many! Like I mentioned: easier testing, a faster release cycle, the
ability to publish in any package index, no dependency on anything in
ceph.git to operate, etc.

>
> Put a bit bluntly: if CephFS, RBD, RGW, the mon and the OSD can all
> successfully co-habit in one git repository, what makes the CLI that
> formats drives so special that it needs its own?

Sure. Again, there is nothing some of our tooling needs from
ceph.git, so I don't see the need to have them in-tree. I am sure RGW
and other components do need to consume Ceph code in some way? I
don't even think ceph-disk should be in-tree, for the same reason. I
believe that in the very beginning it was just so easy to have
everything be built from ceph.git.

Even in some cases, like pybind, it has been requested numerous times
to get them onto separate package indexes like PyPI, but that has
always been *tremendously* difficult: http://tracker.ceph.com/issues/5900


>
>>>  - I agree with others that a single entrypoint (i.e. executable) will
>>> be more manageable than having conspicuously separate tools, but we
>>> shouldn't worry too much about making things "plugins" as such -- they
>>> can just be distinct code inside one tool, sharing as much or as
>>> little as they need.
>>>
>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>>> ..." commands to minimise the impression that the tooling is changing,
>>> even if internally it's all new/distinct code?
>>
>> That sounded appealing initially, but because we are introducing a
>> very different API, it would look odd to interact
>> with other subcommands without a normalized interaction. For example,
>> for 'prepare' this would be:
>>
>> ceph-disk prepare [...]
>>
>> And for LVM it would possible be
>>
>> ceph-disk lvm prepare [...]
>>
>> The level at which these similar actions are presented imply that one
>> may be a preferred (or even default) one, while the other one
>> isn't.
>>
>> At one point we are going to add regular disk worfklows (replacing
>> ceph-disk functionality) and then it would become even more
>> confusing to keep it there (or do you think at that point we could split?)
>>
>>>
>>> At the risk of being a bit picky about language, I don't like calling
>>> this anything with "volume" in the name, because afaik we've never
>>> ever called OSDs or the drives they occupy "volumes", so we're
>>> introducing a whole new noun, and a widely used (to mean different
>>> things) one at that.
>>>
>>
>> We have never called them 'volumes' because there was never anything
>> to support something other than regular disks, the approach
>> has always been disks and partitions.
>>
>> A "volume" can be a physical volume (e.g. a disk) or a logical one
>> (lvm, dmcache). It is an all-encompassing name to allow different
>> device-like to work with.
>
> The trouble with "volume" is that it means so many things in so many
> different storage systems -- I haven't often seen it used to mean
> "block device" or "drive".  It's more often used to describe a logical
> entity.  I also think "disk" is fine -- most people get the idea that
> a disk is a hard drive but it could also be any block device.

If your thinking is that a disk can be any block device, then yes, we
are at opposite ends here on naming. We are picking a "widely used"
term precisely because it is not specific; "disk" sounds fairly
specific, and we don't want that.


>
> John
>
>>
>>
>>> John
>>>
>>>>
>>>>
>>>>>
>>>>> 2 cents from one of the LVM for Ceph users,
>>>>> Warren Wang
>>>>> Walmart ✻
>>>>>
>>>>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>>>>
>>>>>     Hello,
>>>>>
>>>>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>>>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>>>>     documentation [1] and would like some feedback.
>>>>>
>>>>>     The important features for this new tool are:
>>>>>
>>>>>     * parting ways with udev (new approach will rely on LVM functionality
>>>>>     for discovery)
>>>>>     * compatibility/migration for existing LVM volumes deployed as directories
>>>>>     * dmcache support
>>>>>
>>>>>     By documenting the API and workflows first we are making sure that
>>>>>     those look fine before starting on actual development.
>>>>>
>>>>>     It would be great to get some feedback, specially if you are currently
>>>>>     using LVM with ceph (or planning to!).
>>>>>
>>>>>     Please note that the documentation is not complete and is missing
>>>>>     content on some parts.
>>>>>
>>>>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>>>>     [1] http://docs.ceph.com/ceph-lvm/
>>>>>     _______________________________________________
>>>>>     ceph-users mailing list
>>>>>     ceph-users@lists.ceph.com
>>>>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 15:37               ` Willem Jan Withagen
@ 2017-06-19 17:57                 ` Alfredo Deza
  2017-06-19 18:57                   ` Willem Jan Withagen
  0 siblings, 1 reply; 23+ messages in thread
From: Alfredo Deza @ 2017-06-19 17:57 UTC (permalink / raw)
  To: Willem Jan Withagen; +Cc: John Spray, Warren Wang - ISD, ceph-users, ceph-devel

On Mon, Jun 19, 2017 at 11:37 AM, Willem Jan Withagen <wjw@digiware.nl> wrote:
> On 19-6-2017 16:13, Alfredo Deza wrote:
>> On Mon, Jun 19, 2017 at 9:27 AM, John Spray <jspray@redhat.com> wrote:
>>> On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>>> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
>>>> <Warren.Wang@walmart.com> wrote:
>>>>> I would prefer that this is something more generic, to possibly support other backends one day, like ceph-volume. Creating one tool per backend seems silly.
>>>>>
>>>>> Also, ceph-lvm seems to imply that ceph itself has something to do with lvm, which it really doesn’t. This is simply to deal with the underlying disk. If there’s resistance to something more generic like ceph-volume, then it should at least be called something like ceph-disk-lvm.
>>>>
>>>> Sage, you had mentioned the need for "composable" tools for this, and
>>>> I think that if we go with `ceph-volume` we could allow plugins for
>>>> each strategy. We are starting with `lvm` support so that would look
>>>> like: `ceph-volume lvm`
>>>>
>>>> The `lvm` functionality could be implemented as a plugin itself, and
>>>> when we start working with supporting regular disks, then `ceph-volume
>>>> disk` can come along, etc...
>>>>
>>>> It would also open the door for anyone to be able to write a plugin to
>>>> `ceph-volume` to implement their own logic, while at the same time
>>>> re-using most of what we are implementing today: logging, reporting,
>>>> systemd support, OSD metadata, etc...
>>>>
>>>> If we were to separate these into single-purpose tools, all those
>>>> would need to be re-done.
>>>
>>> Couple of thoughts:
>>>  - let's keep this in the Ceph repository unless there's a strong
>>> reason not to -- it'll enable the tool's branching to automatically
>>> happen in line with Ceph's.
>>
>> For initial development this is easier to have as a separate tool from
>> the Ceph source tree. There are some niceties about being in-source,
>> like
>> not being required to deal with what features we are supporting on what version.
>
> Just my observation, need not be true at all, but ...
>
> As long as you do not have it interact with the other tools, that is
> true. But as soon as you start depending on ceph-{disk-new,volume} in
> other parts of the mainstream ceph-code you have created a ty-in with
> the versioning and will require it to be maintained in the same way.
>
>
>> Although there is no code yet, I consider the project in an "unstable"
>> state, it will move incredibly fast (it has to!) and that puts it at
>> odds with the cadence
>> of Ceph. Specifically, these two things are very important right now:
>>
>> * faster release cycles
>> * easier and faster to test
>>
>> I am not ruling out going into Ceph at some point though, ideally when
>> things slow down and become stable.
>>
>> Is your argument only to have parity in Ceph's branching? That was
>> never a problem with out-of-tree tools like ceph-deploy for example.
>
> Some of the external targets move so fast (ceph-asible) that I have
> given up on trying to see what is going on. For this tool I'd like it to
> do the ZFS/FreeBSD stuff as a plugin-module.
> In the expectation that it will supersede the current ceph-disk,
> otherwise there are 2 place to maintain this type of code.

Yes, the idea is that it will be pluggable from the start, and that it
will supersede the current ceph-disk (but not immediately).

>
>>>  - I agree with others that a single entrypoint (i.e. executable) will
>>> be more manageable than having conspicuously separate tools, but we
>>> shouldn't worry too much about making things "plugins" as such -- they
>>> can just be distinct code inside one tool, sharing as much or as
>>> little as they need.
>>>
>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>>> ..." commands to minimise the impression that the tooling is changing,
>>> even if internally it's all new/distinct code?
>>
>> That sounded appealing initially, but because we are introducing a
>> very different API, it would look odd to interact
>> with other subcommands without a normalized interaction. For example,
>> for 'prepare' this would be:
>>
>> ceph-disk prepare [...]
>>
>> And for LVM it would possible be
>>
>> ceph-disk lvm prepare [...]
>>
>> The level at which these similar actions are presented imply that one
>> may be a preferred (or even default) one, while the other one
>> isn't.
>
> Is this about API "cosmetics"? Because there is a lot of examples
> suggestions and other stuff out there that is using the old syntax.
>
> And why not do a hybrid? it will require a bit more commandline parsing,
> but that is not a major dealbreaker.
>
> so the line would look like
>     ceph-disk [lvm,zfs,disk,partition] prepare [...]
> and the first parameter is optional reverting to the current supported
> systems.
>
> You can always start warning users that their API usage is old style,
> and that it is going to go away in a next release.
>
>> At one point we are going to add regular disk worfklows (replacing
>> ceph-disk functionality) and then it would become even more
>> confusing to keep it there (or do you think at that point we could split?)
>
> The more separate you go, the more akward it is going to be when things
> start to melt together.
>
>>> At the risk of being a bit picky about language, I don't like calling
>>> this anything with "volume" in the name, because afaik we've never
>>> ever called OSDs or the drives they occupy "volumes", so we're
>>> introducing a whole new noun, and a widely used (to mean different
>>> things) one at that.
>>>
>>
>> We have never called them 'volumes' because there was never anything
>> to support something other than regular disks, the approach
>> has always been disks and partitions.
>>
>> A "volume" can be a physical volume (e.g. a disk) or a logical one
>> (lvm, dmcache). It is an all-encompassing name to allow different
>> device-like to work with.
>
> ZFS talks about volumes, vdev, partitions, .... and perhaps even more.
> Being picky: ceph-disk now also works on pre-built trees to build filestore.
>

Naming will continue to be the hardest part to get consensus on :) Of
course we could've just gone the other way and picked something that
has a nice ring to it, and nothing that even hints at what it does :)


> I would just try to glue it into ceph-disk in the most flexible way

We can't "glue it into ceph-disk" because we are proposing a
completely new way of doing things that
go against how ceph-disk works.


> possible.
> --WjW
>
>>
>>
>>> John
>>>
>>>>
>>>>
>>>>>
>>>>> 2 cents from one of the LVM for Ceph users,
>>>>> Warren Wang
>>>>> Walmart ✻
>>>>>
>>>>> On 6/16/17, 10:25 AM, "ceph-users on behalf of Alfredo Deza" <ceph-users-bounces@lists.ceph.com on behalf of adeza@redhat.com> wrote:
>>>>>
>>>>>     Hello,
>>>>>
>>>>>     At the last CDM [0] we talked about `ceph-lvm` and the ability to
>>>>>     deploy OSDs from logical volumes. We have now an initial draft for the
>>>>>     documentation [1] and would like some feedback.
>>>>>
>>>>>     The important features for this new tool are:
>>>>>
>>>>>     * parting ways with udev (new approach will rely on LVM functionality
>>>>>     for discovery)
>>>>>     * compatibility/migration for existing LVM volumes deployed as directories
>>>>>     * dmcache support
>>>>>
>>>>>     By documenting the API and workflows first we are making sure that
>>>>>     those look fine before starting on actual development.
>>>>>
>>>>>     It would be great to get some feedback, specially if you are currently
>>>>>     using LVM with ceph (or planning to!).
>>>>>
>>>>>     Please note that the documentation is not complete and is missing
>>>>>     content on some parts.
>>>>>
>>>>>     [0] http://tracker.ceph.com/projects/ceph/wiki/CDM_06-JUN-2017
>>>>>     [1] http://docs.ceph.com/ceph-lvm/
>>>>>     _______________________________________________
>>>>>     ceph-users mailing list
>>>>>     ceph-users@lists.ceph.com
>>>>>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 17:53               ` Alfredo Deza
@ 2017-06-19 18:41                 ` Andrew Schoen
  2017-06-19 20:24                 ` Fwd: " John Spray
  1 sibling, 0 replies; 23+ messages in thread
From: Andrew Schoen @ 2017-06-19 18:41 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: John Spray, Warren Wang - ISD, ceph-users, ceph-devel

>>
>> I think having one part of Ceph on a different release cycle to the
>> rest of Ceph is an even more dramatic thing than having it in a
>> separate git repository.
>>
>> It seems like there is some dissatisfaction with how the Ceph project
>> as whole is doing things that is driving you to try and do work
>> outside of the repo where the rest of the project lives -- if the
>> release cycles or test infrastructure within Ceph are not adequate for
>> the tool that formats drives for OSDs, what can we do to fix them?
>
> It isn't Ceph the project :)

I think there needs to be a distinction between things that *are* ceph
(CephFS, RBD, RGW, MON, OSD) and things that might leverage ceph or
help with its installation / usage (ceph-ansible, ceph-deploy,
ceph-installer, ceph-volume). I don't think the latter group needs to
be in ceph.git.


>>
>> I guess my argument isn't so much an argument as it is an assertion
>> that if you want to go your own way then you need to have a really
>> strong clear reason.
>
> Many! Like I mentioned: easier testing, faster release cycle, can
> publish in any package index, doesn't need anything in ceph.git to
> operate, etc..

I agree with all these points. I would add that having ceph-volume in
a separate git repo greatly simplifies the CI interaction with the
project. When I submit a PR to ceph-volume.git I'd want all our unit
tests run and any new docs automatically published. If this lived in
ceph.git it would be very clumsy (maybe not possible) to have Jenkins
react and start jobs that only pertain to the code being changed. If I
submit new ceph-volume code, why would I need `make check` run or ceph
packages built?

Having ceph-disk tied to ceph.git (and its release cycle) has caused
problems with ceph-docker in the past. We've had a race condition (in
ceph-disk) that exposes itself in our CI and has been present for quite
some time even though the patch was merged to master upstream. I think
the fix missed the 2.3 downstream release as well because it wasn't
backported quickly enough. Keeping tools like ceph-disk or ceph-volume
outside of ceph.git would allow us to merge those fixes back into
ceph-docker more efficiently.

Maybe I don't understand the strong reasons for keeping ceph-volume in
ceph.git? If it's only for parity in branches, I never thought of
ceph-volume having a branch per supported version of ceph anyway. I'd
expect it to have numbered releases that support a documented set of
ceph releases; ceph-ansible works similarly here, and I believe
ceph-deploy did as well.

- Andrew

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 17:57                 ` EXT: [ceph-users] " Alfredo Deza
@ 2017-06-19 18:57                   ` Willem Jan Withagen
  0 siblings, 0 replies; 23+ messages in thread
From: Willem Jan Withagen @ 2017-06-19 18:57 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: John Spray, Warren Wang - ISD, ceph-users, ceph-devel



On 19-6-2017 at 19:57, Alfredo Deza wrote:
> On Mon, Jun 19, 2017 at 11:37 AM, Willem Jan Withagen <wjw@digiware.nl> wrote:
>> On 19-6-2017 16:13, Alfredo Deza wrote:
>>> On Mon, Jun 19, 2017 at 9:27 AM, John Spray <jspray@redhat.com> wrote:
>>>> On Fri, Jun 16, 2017 at 7:23 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>>>> On Fri, Jun 16, 2017 at 2:11 PM, Warren Wang - ISD
>>>>> <Warren.Wang@walmart.com> wrote:
>>>>>
>> I would just try to glue it into ceph-disk in the most flexible way
> We can't "glue it into ceph-disk" because we are proposing a
> completely new way of doing things that
> go against how ceph-disk works.

Hmm,

Not really a valid argument if you want the two to become equal.

I have limited python knowledge, but I can envision an outer wrapper
that just calls the old version of ceph-disk as an external executable.
User impact is thus reduced to a bare minimum.

Got to admit that it is not very elegant, but it would work.
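
Something along these lines, perhaps. A rough, untested sketch only --
the 'ceph_volume.lvm' module name is made up for illustration, it is not
an existing package:

    #!/usr/bin/env python
    # rough sketch of the wrapper idea -- dispatch new subcommands to new
    # code, hand everything else to the legacy ceph-disk untouched
    import os
    import sys

    def main(argv=None):
        argv = sys.argv[1:] if argv is None else argv
        if argv and argv[0] == 'lvm':
            # new-style code handles its own sub-arguments
            from ceph_volume import lvm   # hypothetical new module
            return lvm.main(argv[1:])
        # anything else falls through to the old tool as an external executable
        os.execvp('ceph-disk', ['ceph-disk'] + argv)

    if __name__ == '__main__':
        sys.exit(main())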

But I'll see what you guys come up with.
The best proof is always the code.

--WjW


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Fwd: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 17:53               ` Alfredo Deza
  2017-06-19 18:41                 ` Andrew Schoen
@ 2017-06-19 20:24                 ` John Spray
  2017-06-19 21:14                   ` Alfredo Deza
  2017-06-20  7:12                   ` Fwd: " Nathan Cutler
  1 sibling, 2 replies; 23+ messages in thread
From: John Spray @ 2017-06-19 20:24 UTC (permalink / raw)
  To: Alfredo Deza, Ceph Development

On Mon, Jun 19, 2017 at 6:53 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>> * faster release cycles
>>> * easier and faster to test
>>
>> I think having one part of Ceph on a different release cycle to the
>> rest of Ceph is an even more dramatic thing than having it in a
>> separate git repository.
>>
>> It seems like there is some dissatisfaction with how the Ceph project
>> as whole is doing things that is driving you to try and do work
>> outside of the repo where the rest of the project lives -- if the
>> release cycles or test infrastructure within Ceph are not adequate for
>> the tool that formats drives for OSDs, what can we do to fix them?
>
> It isn't Ceph the project :)
>
> Not every tool about Ceph has to come from ceph.git, in which case the
> argument could be flipped around: why isn't ceph-installer,
> ceph-ansible, ceph-deploy, radosgw-agent, etc... all coming from
> within ceph.git ?

ceph-installer, ceph-deploy and ceph-ansible are special cases because
they are installers that operate before a particular version of Ceph
has been selected for installation, and might operate on two
differently versioned clusters at the same time.

radosgw-agent, presumably (I haven't worked on it) is separate because
it sits between two clusters but is logically part of neither, and
those clusters could potentially be different-versioned too.

ceph-disk, on the other hand, rides alongside ceph-osd, writes a
format that ceph-osd needs to understand, the two go together
everywhere.  You use whatever version of ceph-disk corresponds to the
ceph-osd package you have.  You run whatever ceph-osd corresponds to
the version of ceph-disk you just used.  The two things are not
separate, any more than ceph-objectstore-tool would be.

It would be more intuitive if we had called ceph-disk
"ceph-osd-format" or similar.  The utility that prepares drives for
use by the OSD naturally belongs in the same package (or at the very
least the same release!) as the OSD code that reads that on-disk
format.

There is a very clear distinction in my mind between things that
install Ceph (i.e. they operate before the ceph packages are on the
system), and things that prepare the system (a particular Ceph version
is already installed, we're just getting ready to run it).
ceph-objectstore-tool would be another example of something that
operates on the drives, but is intimately coupled to the OSDs and
would not make sense as a separately released thing.

> They don't necessarily need to be tied in. In the case of
> ceph-installer: there is nothing ceph-specific it needs from ceph.git
> to run, why force it in?

Because Ceph is already a huge, complex codebase, and we already have
lots of things to keep track of.  Sometimes breaking things up makes
life easier, sometimes commonality makes life easier -- the trick is
knowing when to do which.

The binaries, the libraries, the APIs, these things benefit from being
broken down into manageable bitesize pieces.  The version control, the
releases, the build management, these things do not (with the
exception of optimizing jenkins by doing fewer builds in some cases).

I don't ever want to have to ask or answer the question "What version
of ceph-disk do I need for ceph x.y.z?", or "Can I run ceph-osd x.y.z
on a drive formatted with ceph-disk a.b.c?".

Being able to give a short, simple answer to "what version of Ceph is
this?" has HUGE value, and that goes out the window when you start
splitting bits off on their own release schedules.

>>> I am not ruling out going into Ceph at some point though, ideally when
>>> things slow down and become stable.
>>
>> I think that the decision about where this code lives needs to be made
>> before it is released -- moving it later is rather awkward.  If you'd
>> rather not have the code in Ceph master until you're happy with it,
>> then a branch would be the natural way to do that.
>>
>
> The decision was made a few weeks ago, and I really don't think we
> should be in ceph.git, but I am OK to keep
> discussing on the reasoning.
>
>
>>> Is your argument only to have parity in Ceph's branching? That was
>>> never a problem with out-of-tree tools like ceph-deploy for example.
>>
>> I guess my argument isn't so much an argument as it is an assertion
>> that if you want to go your own way then you need to have a really
>> strong clear reason.
>
> Many! Like I mentioned: easier testing, faster release cycle, can
> publish in any package index, doesn't need anything in ceph.git to
> operate, etc..

Testing: being separate is only easier if you're only doing python
unit testing.  If you're testing that ceph-disk/ceph-volume really
does its job, then you absolutely do want to be in the ceph tree, so
that you can fire up an OSD that checks that ceph-disk really did its
job.

Faster release cycle: we release pretty often.  We release often
enough to deal with critical OSD and mon bugs.  The tool that formats
OSDs doesn't need to be released more often than the OSD itself.

Package indices: putting any of the Ceph code in pypi is of limited
value, even if we do periodically run into people with a passion for
it.  If someone did a "pip install librados", the very next thing they
would have to do would be to go find some packages of the C librados
bindings, and hope like hell that those packages matched whatever they
just downloaded from pypi, and they probably wouldn't, because what
are the chances that pip is fetching python bindings that match the
Ceph version I have on my system?  I don't want to have to deal with
users who get themselves into that situation.

>> Put a bit bluntly: if CephFS, RBD, RGW, the mon and the OSD can all
>> successfully co-habit in one git repository, what makes the CLI that
>> formats drives so special that it needs its own?
>
> Sure. Again, there is nothing some of our tooling needs from ceph.git
> so I don't see why the need to have then in-tree. I am sure RGW and
> other
> components do need to consume Ceph code in some way? I don't even
> think ceph-disk should be in tree for the same reason. I believe that
> in the very
> beginning it was just so easy to have everything be built from ceph.git

We are, for better or worse, currently in a "one big repo" model (with
the exception of installers and inter-cluster rgw bits).

One could legitimately argue that more modularity is better, and
separate out RBD and RGW into separate projects, because hey, they're
standalone, right?  Or, one can go the other way and argue that more
modularity creates versioning headaches that just don't need to exist.

Both are valid worldviews, but the WORST outcome is to have almost
everything in one repo, and then splinter off individual command line
tools based on ad hoc decisions when someone is doing a particular
feature.

I know how backwards that must sound, when you're looking at the
possibility of having a nice self contained git repo, that contains a
pypi-eligible python module, which has unit tests that run fast in
jenkins on every commit.  I get the appeal!  But for the sake of the
overall simplicity of Ceph, please think again, or if you really want
to convert us to a multi-repo model, then make that case for the
project as a whole rather than doing it individually on a bit-by-bit
basis.

John

> Even in some cases like pybind, it has been requested numerous times
> to get them on separate package indexes like PyPI, but that has always
> been
> *tremendously* difficult: http://tracker.ceph.com/issues/5900
>>>>  - I agree with others that a single entrypoint (i.e. executable) will
>>>> be more manageable than having conspicuously separate tools, but we
>>>> shouldn't worry too much about making things "plugins" as such -- they
>>>> can just be distinct code inside one tool, sharing as much or as
>>>> little as they need.
>>>>
>>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>>>> ..." commands to minimise the impression that the tooling is changing,
>>>> even if internally it's all new/distinct code?
>>>
>>> That sounded appealing initially, but because we are introducing a
>>> very different API, it would look odd to interact
>>> with other subcommands without a normalized interaction. For example,
>>> for 'prepare' this would be:
>>>
>>> ceph-disk prepare [...]
>>>
>>> And for LVM it would possible be
>>>
>>> ceph-disk lvm prepare [...]
>>>
>>> The level at which these similar actions are presented imply that one
>>> may be a preferred (or even default) one, while the other one
>>> isn't.
>>>
>>> At one point we are going to add regular disk worfklows (replacing
>>> ceph-disk functionality) and then it would become even more
>>> confusing to keep it there (or do you think at that point we could split?)
>>>
>>>>
>>>> At the risk of being a bit picky about language, I don't like calling
>>>> this anything with "volume" in the name, because afaik we've never
>>>> ever called OSDs or the drives they occupy "volumes", so we're
>>>> introducing a whole new noun, and a widely used (to mean different
>>>> things) one at that.
>>>>
>>>
>>> We have never called them 'volumes' because there was never anything
>>> to support something other than regular disks, the approach
>>> has always been disks and partitions.
>>>
>>> A "volume" can be a physical volume (e.g. a disk) or a logical one
>>> (lvm, dmcache). It is an all-encompassing name to allow different
>>> device-like to work with.
>>
>> The trouble with "volume" is that it means so many things in so many
>> different storage systems -- I haven't often seen it used to mean
>> "block device" or "drive".  It's more often used to describe a logical
>> entity.  I also think "disk" is fine -- most people get the idea that
>> a disk is a hard drive but it could also be any block device.
>
> If your thinking is that a disk can be any block device then yes, we
> are at opposite ends here of our naming. We are picking a
> "widely used" term because it is not specific. "disk" sounds fairly
> specific, and we don't want that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 20:24                 ` Fwd: " John Spray
@ 2017-06-19 21:14                   ` Alfredo Deza
  2017-06-19 21:58                     ` Sage Weil
  2017-06-20  7:12                   ` Fwd: " Nathan Cutler
  1 sibling, 1 reply; 23+ messages in thread
From: Alfredo Deza @ 2017-06-19 21:14 UTC (permalink / raw)
  To: John Spray; +Cc: Ceph Development

On Mon, Jun 19, 2017 at 4:24 PM, John Spray <jspray@redhat.com> wrote:
> On Mon, Jun 19, 2017 at 6:53 PM, Alfredo Deza <adeza@redhat.com> wrote:
>>>> * faster release cycles
>>>> * easier and faster to test
>>>
>>> I think having one part of Ceph on a different release cycle to the
>>> rest of Ceph is an even more dramatic thing than having it in a
>>> separate git repository.
>>>
>>> It seems like there is some dissatisfaction with how the Ceph project
>>> as whole is doing things that is driving you to try and do work
>>> outside of the repo where the rest of the project lives -- if the
>>> release cycles or test infrastructure within Ceph are not adequate for
>>> the tool that formats drives for OSDs, what can we do to fix them?
>>
>> It isn't Ceph the project :)
>>
>> Not every tool about Ceph has to come from ceph.git, in which case the
>> argument could be flipped around: why isn't ceph-installer,
>> ceph-ansible, ceph-deploy, radosgw-agent, etc... all coming from
>> within ceph.git ?
>
> ceph-installer, ceph-deploy and ceph-ansible are special cases because
> they are installers, that operate before a particular version of Ceph
> has been selected for installation, and might operate on two
> differently versioned clusters at the same time.

This is a perfect use case for ceph-volume: the OSD doesn't (and in
most cases this is true) care what is beneath it, as long as it is
mounted and has what it needs to function. The rest is *almost like
installation*.

>
> radosgw-agent, presumably (I haven't worked on it) is separate because
> it sits between two clusters but is logically part of neither, and
> those clusters could potentially be different-versioned too.
>
> ceph-disk, on the other hand, rides alongside ceph-osd, writes a
> format that ceph-osd needs to understand, the two go together
> everywhere.  You use whatever version of ceph-disk corresponds to the
> ceph-osd package you have.  You run whatever ceph-osd corresponds to
> the version of ceph-disk you just used.  The two things are not
> separate, any more than ceph-objectstore-tool would be.

The OSD needs a mounted volume that has pieces that the OSD itself
puts in there. It is a bit convoluted because there are other steps,
but the tool itself isn't crucial for the OSD to function; it is
borderline an orchestrator that gets the volume where the OSD runs
ready.

>
> It would be more intuitive if we had called ceph-disk
> "ceph-osd-format" or similar.  The utility that prepares drives for
> use by the OSD naturally belongs in the same package (or at the very
> least the same release!) as the OSD code that reads that on-disk
> format.
>
> There is a very clear distinction in my mind between things that
> install Ceph (i.e. they operate before the ceph packages are on the
> system), and things that prepare the system (a particular Ceph version
> is already installed, we're just getting ready to run it).
> ceph-objectstore-tool would be another example of something that
> operates on the drives, but is intimately coupled to the OSDs and
> would not make sense as a separately released thing.

And ceph-disk isn't really coupled (maybe a tiny corner of it is). Or
maybe you can exemplify how those are tied? I've gone through every
single step to get an OSD, and although in some cases it is a bit more
complex, it isn't more than a few steps (6 in total from our own docs):

http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#adding-an-osd-manual
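
From memory, the whole manual flow is roughly the following (a sketch of
the docs above, not something to copy verbatim -- commands and caps should
be double-checked against the linked page):

    import subprocess

    def sh(*cmd):
        return subprocess.check_output(cmd).decode().strip()

    def add_osd_manually(device, host, weight='1.0'):
        osd_id = sh('ceph', 'osd', 'create')                 # 1. allocate an id
        path = '/var/lib/ceph/osd/ceph-%s' % osd_id
        sh('mkdir', '-p', path)                              # 2. create the data dir
        sh('mkfs.xfs', device)                               # 3. mkfs and mount
        sh('mount', device, path)
        sh('ceph-osd', '-i', osd_id, '--mkfs', '--mkkey')    # 4. initialize the OSD
        sh('ceph', 'auth', 'add', 'osd.%s' % osd_id,         # 5. register its key
           'osd', 'allow *', 'mon', 'allow profile osd',
           '-i', '%s/keyring' % path)
        sh('ceph', 'osd', 'crush', 'add', 'osd.%s' % osd_id, # 6. place it in CRUSH
           weight, 'host=%s' % host)
        sh('systemctl', 'start', 'ceph-osd@%s' % osd_id)     # then start it

None of that needs anything from ceph.git beyond the binaries that are
already installed.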

ceph-ansible *does* prepare a system for running Ceph, and so does
ceph-docker. ceph-disk has had some pitfalls that ceph-ansible has to
work around, and it has to implement other things as well to be able to
deploy OSDs.


>
>> They don't necessarily need to be tied in. In the case of
>> ceph-installer: there is nothing ceph-specific it needs from ceph.git
>> to run, why force it in?
>
> Because Ceph is already a huge, complex codebase, and we already have
> lots of things to keep track of.  Sometimes breaking things up makes
> life easier, sometimes commonality makes live easier -- the trick is
> knowing when to do which.
>
> The binaries, the libraries, the APIs, these things benefit from being
> broken down into manageable bitesize pieces.  The version control, the
> releases, the build management, these things do not (with the
> exception of optimizing jenkins by doing fewer builds in some cases).
>
> I don't ever want to have to ask or answer the question "What version
> of ceph-disk to I need for ceph x.y.z?", or "Can I run ceph-osd x.y.z
> on a drive formatted with ceph-disk a.b.c?".
>

That is the same question users of ceph-ansible need to answer. I
think this is fine as long as it is well defined. Now, the ceph-ansible
implementation is far more complex because it needs to support old
features and new ones for every single aspect. I haven't seen many API
changes in ceph-disk that are completely backwards incompatible with
older releases.

And even if that is the case, we have been able to implement tooling
that is perfectly capable of managing that use case.

I would be more concerned if the "Adding an OSD (manual)" section
would keep changing on every Ceph release (it has almost stayed the
same for quite a few releases).

> Being able to give a short, simple answer to "what version of Ceph is
> this?" has HUGE value, and that goes out the window when you start
> splitting bits off on their own release schedules.
>

We are planning on being fully compatible, unless there is a major
change in how Ceph is exposing the bits for creating an OSD. Just like
we've done with ceph-deploy in the past.


>>>> I am not ruling out going into Ceph at some point though, ideally when
>>>> things slow down and become stable.
>>>
>>> I think that the decision about where this code lives needs to be made
>>> before it is released -- moving it later is rather awkward.  If you'd
>>> rather not have the code in Ceph master until you're happy with it,
>>> then a branch would be the natural way to do that.
>>>
>>
>> The decision was made a few weeks ago, and I really don't think we
>> should be in ceph.git, but I am OK to keep
>> discussing on the reasoning.
>>
>>
>>>> Is your argument only to have parity in Ceph's branching? That was
>>>> never a problem with out-of-tree tools like ceph-deploy for example.
>>>
>>> I guess my argument isn't so much an argument as it is an assertion
>>> that if you want to go your own way then you need to have a really
>>> strong clear reason.
>>
>> Many! Like I mentioned: easier testing, faster release cycle, can
>> publish in any package index, doesn't need anything in ceph.git to
>> operate, etc..
>
> Testing: being separate is only easier if you're only doing python
> unit testing.  If you're testing that ceph-disk/ceph-volume really
> does its job, then you absolutely do want to be in the ceph tree, so
> that you can fire up an OSD that checks that ceph-disk really did it's
> job.
>
> Faster release cycle: we release pretty often.

Uh, it depends on what "fast" means for you. Four months of waiting on
a ceph-disk issue that was already fixed and merged, just so
ceph-ansible would no longer have that bug, is not really fast.

>  We release often
> enough to deal with critical OSD and mon bugs.  The tool that formats
> OSDs doesn't need to be released more often than the OSD itself.
>

It does need to be released often when the tool is new!

> Package indices: putting any of the Ceph code in pypi is of limited
> value, even if we do periodically run into people with a passion for
> it.  If someone did a "pip install librados", the very next thing they
> would have to do would be to go find some packages of the C librados
> bindings, and hope like hell that those packages matched whatever they
> just downloaded from pypi, and they probably wouldn't, because what
> are the chances that pip is fetching python bindings that match the
> Ceph version I have on my system?  I don't want to have to deal with
> users who get themselves into that situation.

So in this case you are talking about bindings to internal Ceph APIs;
we aren't doing that, and it is not a use case we are contemplating.

>
>>> Put a bit bluntly: if CephFS, RBD, RGW, the mon and the OSD can all
>>> successfully co-habit in one git repository, what makes the CLI that
>>> formats drives so special that it needs its own?
>>
>> Sure. Again, there is nothing some of our tooling needs from ceph.git
>> so I don't see why the need to have then in-tree. I am sure RGW and
>> other
>> components do need to consume Ceph code in some way? I don't even
>> think ceph-disk should be in tree for the same reason. I believe that
>> in the very
>> beginning it was just so easy to have everything be built from ceph.git
>
> We are, for better or worse, currently in a "one big repo" model (with
> the exception of installers and inter-cluster rgw bits).
>
> One could legitimately argue that more modularity is better, and
> separate out RBD and RGW into separate projects, because hey, they're
> standalone, right?  Or, one can go the other way and argue that more
> modularity creates versioning headaches that just don't need to exist.
>
> Both are valid worldviews, but the WORST outcome is to have almost
> everything in one repo, and then splinter off individual command line
> tools based on ad hoc decisions when someone is doing a particular
> feature.

RBD and RGW are unfair examples to compare against a (hopefully) small
CLI tool that wants to "prepare" a device for an OSD to start, and that
doesn't consume any Python bindings or Ceph APIs (aside from the ceph CLI).

>
> I know how backwards that must sound, when you're looking at the
> possibility of having a nice self contained git repo, that contains a
> pypi-eligible python module, which has unit tests that run fast in
> jenkins on every commit.  I get the appeal!  But for the sake of the
> overall simplicity of Ceph, please think again, or if you really want
> to convert us to a multi-repo model, then make that case for the
> project as a whole rather than doing it individually on a bit-by-bit
> basis.

We can't make the whole world of Ceph repos abide by a multi-repo
model today. I would need to counter-argue with you for a few more months :)

The examples you give for ceph-disk, and how ceph-disk is today, are
why we want to change things.

It is not only about faster unit tests, or a "nice self contained git
repo" just because we want to release to PyPI; we are facing a
situation where we need faster development and shorter release cycles,
which we can't get while being in an already big repository.


>
> John
>
>> Even in some cases like pybind, it has been requested numerous times
>> to get them on separate package indexes like PyPI, but that has always
>> been
>> *tremendously* difficult: http://tracker.ceph.com/issues/5900
>>>>>  - I agree with others that a single entrypoint (i.e. executable) will
>>>>> be more manageable than having conspicuously separate tools, but we
>>>>> shouldn't worry too much about making things "plugins" as such -- they
>>>>> can just be distinct code inside one tool, sharing as much or as
>>>>> little as they need.
>>>>>
>>>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>>>>> ..." commands to minimise the impression that the tooling is changing,
>>>>> even if internally it's all new/distinct code?
>>>>
>>>> That sounded appealing initially, but because we are introducing a
>>>> very different API, it would look odd to interact
>>>> with other subcommands without a normalized interaction. For example,
>>>> for 'prepare' this would be:
>>>>
>>>> ceph-disk prepare [...]
>>>>
>>>> And for LVM it would possible be
>>>>
>>>> ceph-disk lvm prepare [...]
>>>>
>>>> The level at which these similar actions are presented imply that one
>>>> may be a preferred (or even default) one, while the other one
>>>> isn't.
>>>>
>>>> At one point we are going to add regular disk worfklows (replacing
>>>> ceph-disk functionality) and then it would become even more
>>>> confusing to keep it there (or do you think at that point we could split?)
>>>>
>>>>>
>>>>> At the risk of being a bit picky about language, I don't like calling
>>>>> this anything with "volume" in the name, because afaik we've never
>>>>> ever called OSDs or the drives they occupy "volumes", so we're
>>>>> introducing a whole new noun, and a widely used (to mean different
>>>>> things) one at that.
>>>>>
>>>>
>>>> We have never called them 'volumes' because there was never anything
>>>> to support something other than regular disks, the approach
>>>> has always been disks and partitions.
>>>>
>>>> A "volume" can be a physical volume (e.g. a disk) or a logical one
>>>> (lvm, dmcache). It is an all-encompassing name to allow different
>>>> device-like to work with.
>>>
>>> The trouble with "volume" is that it means so many things in so many
>>> different storage systems -- I haven't often seen it used to mean
>>> "block device" or "drive".  It's more often used to describe a logical
>>> entity.  I also think "disk" is fine -- most people get the idea that
>>> a disk is a hard drive but it could also be any block device.
>>
>> If your thinking is that a disk can be any block device then yes, we
>> are at opposite ends here of our naming. We are picking a
>> "widely used" term because it is not specific. "disk" sounds fairly
>> specific, and we don't want that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 21:14                   ` Alfredo Deza
@ 2017-06-19 21:58                     ` Sage Weil
  2017-06-20 14:09                       ` Alfredo Deza
  2017-06-21 12:33                       ` Alfredo Deza
  0 siblings, 2 replies; 23+ messages in thread
From: Sage Weil @ 2017-06-19 21:58 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: John Spray, Ceph Development

On Mon, 19 Jun 2017, Alfredo Deza wrote:
> On Mon, Jun 19, 2017 at 4:24 PM, John Spray <jspray@redhat.com> wrote:
> > On Mon, Jun 19, 2017 at 6:53 PM, Alfredo Deza <adeza@redhat.com> wrote:
> >>>> * faster release cycles
> >>>> * easier and faster to test
> >>>
> >>> I think having one part of Ceph on a different release cycle to the
> >>> rest of Ceph is an even more dramatic thing than having it in a
> >>> separate git repository.
> >>>
> >>> It seems like there is some dissatisfaction with how the Ceph project
> >>> as whole is doing things that is driving you to try and do work
> >>> outside of the repo where the rest of the project lives -- if the
> >>> release cycles or test infrastructure within Ceph are not adequate for
> >>> the tool that formats drives for OSDs, what can we do to fix them?
> >>
> >> It isn't Ceph the project :)
> >>
> >> Not every tool about Ceph has to come from ceph.git, in which case the
> >> argument could be flipped around: why isn't ceph-installer,
> >> ceph-ansible, ceph-deploy, radosgw-agent, etc... all coming from
> >> within ceph.git ?
> >
> > ceph-installer, ceph-deploy and ceph-ansible are special cases because
> > they are installers, that operate before a particular version of Ceph
> > has been selected for installation, and might operate on two
> > differently versioned clusters at the same time.
> 
> This is a perfect use case for ceph-volume, the OSD doesn't (and in
> most cases this is true) care what is beneath it, as long
> as it is mounted and has what it needs to function. The rest is
> *almost like installation*.

This isn't really true.  ceph-volume (or ceph-disk lvm, or whatever we
call it) is going to have specific knowledge about how to provision the
OSD.  When we change the bootstrap-osd caps and change the semantics of
'osd new' (take, for example, the change we just made from 'osd create' to
'osd new'), then ceph-mon, the cephx caps, and ceph-disk all have to
change in unison.  More concretely, with bluestore we have all kinds of
choices of how we provision the volumes (what sizes, what options for
rocksdb, whatever); those opinions will be enshrined in ceph-volume, and
they will change from version to version... likely in unison with
bluestore itself (as the code changes, the best practice and
recommendations change with it).
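
To make that concrete, here is a toy sketch (there is no ceph-volume code
yet, so every name below is made up for illustration) of the kind of
version-dependent branch the provisioning tool ends up carrying:

    # toy sketch only -- illustrating the coupling, not real ceph-volume code
    import subprocess
    import uuid

    def allocate_osd_id(osd_fsid, mon_supports_osd_new):
        """Ask the monitors for an OSD id for the given fsid."""
        if mon_supports_osd_new:
            # newer monitors: 'osd new', with the new bootstrap-osd cap semantics
            cmd = ['ceph', 'osd', 'new', osd_fsid]
        else:
            # older monitors only understand 'osd create'
            cmd = ['ceph', 'osd', 'create', osd_fsid]
        return subprocess.check_output(cmd).decode().strip()

    if __name__ == '__main__':
        print(allocate_osd_id(str(uuid.uuid4()), mon_supports_osd_new=True))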

In contrast, I can't think of a reason why ceph-volume would change
independently of ceph-osd.  There is no bootstrap issue like with
installation, and no reason why you would want to run different
versions of the two together.



> > radosgw-agent, presumably (I haven't worked on it) is separate because
> > it sits between two clusters but is logically part of neither, and
> > those clusters could potentially be different-versioned too.
> >
> > ceph-disk, on the other hand, rides alongside ceph-osd, writes a
> > format that ceph-osd needs to understand, the two go together
> > everywhere.  You use whatever version of ceph-disk corresponds to the
> > ceph-osd package you have.  You run whatever ceph-osd corresponds to
> > the version of ceph-disk you just used.  The two things are not
> > separate, any more than ceph-objectstore-tool would be.
> 
> The OSD needs a mounted volume that has pieces that the OSD itself
> puts in there. It is a bit convoluted because
> there are other steps, but the tool itself isn't crucial for the OSD
> to function, it is borderline an orchestrator to get the volume
> where the OSD runs ready.
> 
> >
> > It would be more intuitive if we had called ceph-disk
> > "ceph-osd-format" or similar.  The utility that prepares drives for
> > use by the OSD naturally belongs in the same package (or at the very
> > least the same release!) as the OSD code that reads that on-disk
> > format.
> >
> > There is a very clear distinction in my mind between things that
> > install Ceph (i.e. they operate before the ceph packages are on the
> > system), and things that prepare the system (a particular Ceph version
> > is already installed, we're just getting ready to run it).
> > ceph-objectstore-tool would be another example of something that
> > operates on the drives, but is intimately coupled to the OSDs and
> > would not make sense as a separately released thing.
> 
> And ceph-disk isn't really coupled (maybe a tiny corner of it is). Or 
> maybe you can exemplify how those are tied? I've gone through every 
> single step to get an OSD, and although in some cases it is a bit more 
> complex, it isn't more than a few steps (6 in total from our own docs):
> 
> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#adding-an-osd-manual
> 
> ceph-ansible *does* prepare a system for running Ceph, so does 
> ceph-docker. ceph-disk has had some pitfalls that ceph-ansible has to 
> workaround, and has to implement other things as well to be able to 
> deploy OSDs.

Again, I think the 'osd create' -> 'osd new' change is a perfect example of
coupling.  And I anticipate others with bluestore.  For example, when we 
start supporting SPDK for NVMe (kernel bypass) the interface for setting 
that up will likely evolve and will need to match the behavior in 
ceph-volume.

[...]

Perhaps we can look at this from the other angle, though?  Why *should* 
this particular tool be separate?


> >>>> Is your argument only to have parity in Ceph's branching? That was
> >>>> never a problem with out-of-tree tools like ceph-deploy for example.
> >>>
> >>> I guess my argument isn't so much an argument as it is an assertion
> >>> that if you want to go your own way then you need to have a really
> >>> strong clear reason.
> >>
> >> Many! Like I mentioned: easier testing, faster release cycle, can
> >> publish in any package index, doesn't need anything in ceph.git to
> >> operate, etc..
> >
> > Testing: being separate is only easier if you're only doing python
> > unit testing.  If you're testing that ceph-disk/ceph-volume really
> > does its job, then you absolutely do want to be in the ceph tree, so
> > that you can fire up an OSD that checks that ceph-disk really did it's
> > job.
> >
> > Faster release cycle: we release pretty often.
> 
> Uh, it depends on what "fast" means for you. 4 months waiting on a 
> ceph-disk issue that was fixed/merged to have ceph-ansible not have that 
> bug is not really fast.

Can you tell us more about this incident?  We regularly backport
changes to the stable branches, and have a pretty regular cadence for
stable releases.

> >  We release often
> > enough to deal with critical OSD and mon bugs.  The tool that formats
> > OSDs doesn't need to be released more often than the OSD itself.
> 
> It does need to be released often when the tool is new!

For development, we are doing builds on a continuous basis, with new 
'master' or branch packages every few hours in most cases.  And all of our 
deployment tools can deploy those test branches...


> > I know how backwards that must sound, when you're looking at the
> > possibility of having a nice self contained git repo, that contains a
> > pypi-eligible python module, which has unit tests that run fast in
> > jenkins on every commit.  I get the appeal!  But for the sake of the
> > overall simplicity of Ceph, please think again, or if you really want
> > to convert us to a multi-repo model, then make that case for the
> > project as a whole rather than doing it individually on a bit-by-bit
> > basis.
> 
> We can't make the world of Ceph repos to abide today by a multi-repo 
> model. I would need to counter argue you for a few more months :)
> 
> The examples you give for ceph-disk, and how ceph-disk is today, is why 
> we want to change things.
> 
> It is not only faster unit tests, or a "nice self contained git repo" 
> just because we want to release to PyPI, we are facing a situation where 
> we need faster development and increased release cycles that we can't 
> get being in an already big repository.

If I'm reading this right, the core reason is "faster development and 
increased release cycles".  Can you explain what that means at a 
practical level?  We build packages all day every day, and don't generally 
need packages at all for development testing.  And any release that 
uses ceph-volume is weeks away, and will be followed up by a regular 
cadence of point releases.  Where is the limitation?

Thanks!
sage


> 
> 
> >
> > John
> >
> >> Even in some cases like pybind, it has been requested numerous times
> >> to get them on separate package indexes like PyPI, but that has always
> >> been
> >> *tremendously* difficult: http://tracker.ceph.com/issues/5900
> >>>>>  - I agree with others that a single entrypoint (i.e. executable) will
> >>>>> be more manageable than having conspicuously separate tools, but we
> >>>>> shouldn't worry too much about making things "plugins" as such -- they
> >>>>> can just be distinct code inside one tool, sharing as much or as
> >>>>> little as they need.
> >>>>>
> >>>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
> >>>>> ..." commands to minimise the impression that the tooling is changing,
> >>>>> even if internally it's all new/distinct code?
> >>>>
> >>>> That sounded appealing initially, but because we are introducing a
> >>>> very different API, it would look odd to interact
> >>>> with other subcommands without a normalized interaction. For example,
> >>>> for 'prepare' this would be:
> >>>>
> >>>> ceph-disk prepare [...]
> >>>>
> >>>> And for LVM it would possible be
> >>>>
> >>>> ceph-disk lvm prepare [...]
> >>>>
> >>>> The level at which these similar actions are presented imply that one
> >>>> may be a preferred (or even default) one, while the other one
> >>>> isn't.
> >>>>
> >>>> At one point we are going to add regular disk worfklows (replacing
> >>>> ceph-disk functionality) and then it would become even more
> >>>> confusing to keep it there (or do you think at that point we could split?)
> >>>>
> >>>>>
> >>>>> At the risk of being a bit picky about language, I don't like calling
> >>>>> this anything with "volume" in the name, because afaik we've never
> >>>>> ever called OSDs or the drives they occupy "volumes", so we're
> >>>>> introducing a whole new noun, and a widely used (to mean different
> >>>>> things) one at that.
> >>>>>
> >>>>
> >>>> We have never called them 'volumes' because there was never anything
> >>>> to support something other than regular disks, the approach
> >>>> has always been disks and partitions.
> >>>>
> >>>> A "volume" can be a physical volume (e.g. a disk) or a logical one
> >>>> (lvm, dmcache). It is an all-encompassing name to allow different
> >>>> device-like to work with.
> >>>
> >>> The trouble with "volume" is that it means so many things in so many
> >>> different storage systems -- I haven't often seen it used to mean
> >>> "block device" or "drive".  It's more often used to describe a logical
> >>> entity.  I also think "disk" is fine -- most people get the idea that
> >>> a disk is a hard drive but it could also be any block device.
> >>
> >> If your thinking is that a disk can be any block device then yes, we
> >> are at opposite ends here of our naming. We are picking a
> >> "widely used" term because it is not specific. "disk" sounds fairly
> >> specific, and we don't want that.
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 20:24                 ` Fwd: " John Spray
  2017-06-19 21:14                   ` Alfredo Deza
@ 2017-06-20  7:12                   ` Nathan Cutler
  2017-06-20 14:43                     ` Andrew Schoen
  1 sibling, 1 reply; 23+ messages in thread
From: Nathan Cutler @ 2017-06-20  7:12 UTC (permalink / raw)
  To: Ceph Development

On 06/19/2017 10:24 PM, John Spray wrote:
> Testing: being separate is only easier if you're only doing python
> unit testing.

Hear hear! I do a lot of functional/integration/regression testing and 
can confirm that it became a whole lot easier when ceph-qa-suite was 
moved in-tree.

Nathan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 21:58                     ` Sage Weil
@ 2017-06-20 14:09                       ` Alfredo Deza
  2017-06-21 12:33                       ` Alfredo Deza
  1 sibling, 0 replies; 23+ messages in thread
From: Alfredo Deza @ 2017-06-20 14:09 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, Ceph Development

On Mon, Jun 19, 2017 at 5:58 PM, Sage Weil <sweil@redhat.com> wrote:
> On Mon, 19 Jun 2017, Alfredo Deza wrote:
>> On Mon, Jun 19, 2017 at 4:24 PM, John Spray <jspray@redhat.com> wrote:
>> > On Mon, Jun 19, 2017 at 6:53 PM, Alfredo Deza <adeza@redhat.com> wrote:
>> >>>> * faster release cycles
>> >>>> * easier and faster to test
>> >>>
>> >>> I think having one part of Ceph on a different release cycle to the
>> >>> rest of Ceph is an even more dramatic thing than having it in a
>> >>> separate git repository.
>> >>>
>> >>> It seems like there is some dissatisfaction with how the Ceph project
>> >>> as whole is doing things that is driving you to try and do work
>> >>> outside of the repo where the rest of the project lives -- if the
>> >>> release cycles or test infrastructure within Ceph are not adequate for
>> >>> the tool that formats drives for OSDs, what can we do to fix them?
>> >>
>> >> It isn't Ceph the project :)
>> >>
>> >> Not every tool about Ceph has to come from ceph.git, in which case the
>> >> argument could be flipped around: why isn't ceph-installer,
>> >> ceph-ansible, ceph-deploy, radosgw-agent, etc... all coming from
>> >> within ceph.git ?
>> >
>> > ceph-installer, ceph-deploy and ceph-ansible are special cases because
>> > they are installers, that operate before a particular version of Ceph
>> > has been selected for installation, and might operate on two
>> > differently versioned clusters at the same time.
>>
>> This is a perfect use case for ceph-volume, the OSD doesn't (and in
>> most cases this is true) care what is beneath it, as long
>> as it is mounted and has what it needs to function. The rest is
>> *almost like installation*.
>
> This isn't really true.  ceph-volume (or ceph-disk lvm, or whatever we
> call it) is going to have specific knowledge about how to provision the
> OSD.  When we change the bootstrap-osd caps and change hte semantics of
> 'osd new' (take, for example, teh change we just made from 'osd create' to
> 'osd new'), then ceph-mon, the cephx caps, and ceph-disk all have to
> change in unison.  More concretely, with bluestore we have all kinds of
> choices of how we provision the volumes (what sizes, what options for
> rocksdb, whatever), those opinions will be enshrine in ceph-volume, and
> they will change from version to version... likely in unison with
> bluestore itself (as the code changes the best practice and
> recommendations change with it).

I don't see this as any different from provisioning. As Ceph changes,
so does provisioning in installers; provisioning an OSD may change as
Ceph adds more options, just like we've seen with installers.

You mention the bootstrap caps, and we've had logic to deal with those
types of changes before out of tree.

>
> In contrast, I can't think of a reason why ceph-volume without change
> independently of ceph-osd.  There is no bootstrap issue like with
> installation.  And no reason why you would want to run different between
> different versions.
>
>
>
>> > radosgw-agent, presumably (I haven't worked on it) is separate because
>> > it sits between two clusters but is logically part of neither, and
>> > those clusters could potentially be different-versioned too.
>> >
>> > ceph-disk, on the other hand, rides alongside ceph-osd, writes a
>> > format that ceph-osd needs to understand, the two go together
>> > everywhere.  You use whatever version of ceph-disk corresponds to the
>> > ceph-osd package you have.  You run whatever ceph-osd corresponds to
>> > the version of ceph-disk you just used.  The two things are not
>> > separate, any more than ceph-objectstore-tool would be.
>>
>> The OSD needs a mounted volume that has pieces that the OSD itself
>> puts in there. It is a bit convoluted because
>> there are other steps, but the tool itself isn't crucial for the OSD
>> to function, it is borderline an orchestrator to get the volume
>> where the OSD runs ready.
>>
>> >
>> > It would be more intuitive if we had called ceph-disk
>> > "ceph-osd-format" or similar.  The utility that prepares drives for
>> > use by the OSD naturally belongs in the same package (or at the very
>> > least the same release!) as the OSD code that reads that on-disk
>> > format.
>> >
>> > There is a very clear distinction in my mind between things that
>> > install Ceph (i.e. they operate before the ceph packages are on the
>> > system), and things that prepare the system (a particular Ceph version
>> > is already installed, we're just getting ready to run it).
>> > ceph-objectstore-tool would be another example of something that
>> > operates on the drives, but is intimately coupled to the OSDs and
>> > would not make sense as a separately released thing.
>>
>> And ceph-disk isn't really coupled (maybe a tiny corner of it is). Or
>> maybe you can exemplify how those are tied? I've gone through every
>> single step to get an OSD, and although in some cases it is a bit more
>> complex, it isn't more than a few steps (6 in total from our own docs):
>>
>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#adding-an-osd-manual
>>
>> ceph-ansible *does* prepare a system for running Ceph, so does
>> ceph-docker. ceph-disk has had some pitfalls that ceph-ansible has to
>> workaround, and has to implement other things as well to be able to
>> deploy OSDs.
>
> Again, I think the 'osd create' -> 'osd new' is a perfect example of
> coupling.  And I anticipate others with bluestore.  For example, when we
> start supporting SPDK for NVMe (kernel bypass) the interface for setting
> that up will likely evolve and will need to match the behavior in
> ceph-volume.
>
> [...]
>
> Perhaps we can look at this from the other angle, though?  Why *should*
> this particular tool be separate?
>

The pushback on getting pulled into ceph.git was unexpected, and I
think that is partly because there are no clear guidelines on what
should (or shouldn't) go in. To me, a tool doesn't need to be in tree
if:

* it doesn't consume bindings (e.g. pybind)
* other parts of the project do not depend on it directly (for
example: ceph-osd calling ceph-volume)

This is similar to why I argued against including JS and CSS files for
the dashboard.

>
>> >>>> Is your argument only to have parity in Ceph's branching? That was
>> >>>> never a problem with out-of-tree tools like ceph-deploy for example.
>> >>>
>> >>> I guess my argument isn't so much an argument as it is an assertion
>> >>> that if you want to go your own way then you need to have a really
>> >>> strong clear reason.
>> >>
>> >> Many! Like I mentioned: easier testing, faster release cycle, can
>> >> publish in any package index, doesn't need anything in ceph.git to
>> >> operate, etc..
>> >
>> > Testing: being separate is only easier if you're only doing python
>> > unit testing.  If you're testing that ceph-disk/ceph-volume really
>> > does its job, then you absolutely do want to be in the ceph tree, so
>> > that you can fire up an OSD that checks that ceph-disk really did it's
>> > job.
>> >
>> > Faster release cycle: we release pretty often.
>>
>> Uh, it depends on what "fast" means for you. 4 months waiting on a
>> ceph-disk issue that was fixed/merged to have ceph-ansible not have that
>> bug is not really fast.
>
> Can you be tell us more about this incident?  We are regular backport
> changes to the stable branches, and have a pretty regular cadence for
> stable release.

The ceph-disk issue was reported in November 2016:
https://bugzilla.redhat.com/show_bug.cgi?id=1391920
The fix was merged in February: https://github.com/ceph/ceph/pull/13573
The backport ticket was created in February: http://tracker.ceph.com/issues/18972
And it was closed/merged in May.

That is: after having a fix in February, it was finally backported in May.

With a decoupled project this could've been released in February. This
is exactly the same behavior we had in ceph-deploy,
when it had so many issues being fixed that it wasn't odd to see up to
two releases per week.

>
>> >  We release often
>> > enough to deal with critical OSD and mon bugs.  The tool that formats
>> > OSDs doesn't need to be released more often than the OSD itself.
>>
>> It does need to be released often when the tool is new!
>
> For development, we are doing builds on a continuous basis, with new
> 'master' or branch packages every few hours in most cases.  And all of our
> deployment tools can deploy those test branches...
>
>
>> > I know how backwards that must sound, when you're looking at the
>> > possibility of having a nice self contained git repo, that contains a
>> > pypi-eligible python module, which has unit tests that run fast in
>> > jenkins on every commit.  I get the appeal!  But for the sake of the
>> > overall simplicity of Ceph, please think again, or if you really want
>> > to convert us to a multi-repo model, then make that case for the
>> > project as a whole rather than doing it individually on a bit-by-bit
>> > basis.
>>
>> We can't make the world of Ceph repos to abide today by a multi-repo
>> model. I would need to counter argue you for a few more months :)
>>
>> The examples you give for ceph-disk, and how ceph-disk is today, is why
>> we want to change things.
>>
>> It is not only faster unit tests, or a "nice self contained git repo"
>> just because we want to release to PyPI, we are facing a situation where
>> we need faster development and increased release cycles that we can't
>> get being in an already big repository.
>
> If I'm reading this right, the core reason is "faster development and
> increased release cycles".  Can you explain what that means at a
> practical level?  We build packages all day every day, and don't generally
> need packages at all for development testing.  And any release that
> uses ceph-volume is weeks away, and will be followed up by a regular
> cadence of point releases.  Where is the limitation?

Anyone that is using LVM today with Ceph will be able to use
`ceph-volume lvm` to provision an OSD using filestore.

If the tool went into Ceph, it would mean it would not be possible to
use until users upgrade to that version of Ceph. There is no reason to
have to wait in this case; ceph-volume does not depend on functionality
in Ceph that is not yet released. If the tool were ready today, any LVM
user could migrate right away.

The practical reasons for development (and again, given that this tool
doesn't depend on any internal APIs or bindings) are:

* submitting a PR doesn't need to wait for `make check` (about 1 hour
vs. just a couple of minutes) - there is no way to decouple, say,
ceph-disk tests from `make check` so that a PR that only touches
ceph-disk code runs only the ceph-disk tests. Our tooling, branch
triggers, GitHub integration, and Jenkins all look at the repository;
they can't really determine what piece of code changed in order to run
a subset of the tests.
* functional testing can be leveraged using ceph-ansible tests, which
can run anywhere, usually in under 30 minutes
* building a binary (rpm/deb) takes a few minutes, not the 1 hour spent
waiting for every other Ceph binary

None of these are meant to criticize Ceph release cycles, or
development workflows (I would know better since I implemented some of
them!)

Going back to the reasoning, I can see how John thinks that in some
cases it is best to have everything in one repository, with the
exception of deployment tools, but that is an assumption. There are
things in the tree that just don't need to be there (ceph-detect-init
is a good example), but without a clear guideline (is it a deployment
tool? a library? does it consume an internal API/binding?) or
expectation (wait for other ceph tests, can't run only the tool's tests
on PRs), we can't really argue it either way, and we are now mainly
discussing a preference:

- As an LVM user today, I would prefer to not have to wait (without a
good reason) to upgrade to make use of LVM for an OSD.
- As a developer of the tool, I prefer faster tests, faster build
times, frequent release times

Whenever there is a ceph-deploy release, this is very transparent to
the end user: it gets included in every Ceph repository. So
installation/availability is the same as if it came
from Ceph itself.

That is why I believe that if we insist on having it in-tree, let's
wait until it stabilizes: when we don't need to release often, and when
the excuse of "I use LVM today and I don't want to upgrade just to be
able to deploy an OSD with LVM" is no longer valid because the tool has
been around long enough.

No user is going to have to go through the exercise of figuring out
"which $version of ceph-volume do I install to work with $version of
ceph". The package is going to live in the same place in the end.

The "overall simplicity of Ceph" will not change because this tool
lives in a separate git repository.

>
> Thanks!
> sage
>
>
>>
>>
>> >
>> > John
>> >
>> >> Even in some cases like pybind, it has been requested numerous times
>> >> to get them on separate package indexes like PyPI, but that has always
>> >> been
>> >> *tremendously* difficult: http://tracker.ceph.com/issues/5900
>> >>>>>  - I agree with others that a single entrypoint (i.e. executable) will
>> >>>>> be more manageable than having conspicuously separate tools, but we
>> >>>>> shouldn't worry too much about making things "plugins" as such -- they
>> >>>>> can just be distinct code inside one tool, sharing as much or as
>> >>>>> little as they need.
>> >>>>>
>> >>>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>> >>>>> ..." commands to minimise the impression that the tooling is changing,
>> >>>>> even if internally it's all new/distinct code?
>> >>>>
>> >>>> That sounded appealing initially, but because we are introducing a
>> >>>> very different API, it would look odd to interact
>> >>>> with other subcommands without a normalized interaction. For example,
>> >>>> for 'prepare' this would be:
>> >>>>
>> >>>> ceph-disk prepare [...]
>> >>>>
>> >>>> And for LVM it would possibly be
>> >>>>
>> >>>> ceph-disk lvm prepare [...]
>> >>>>
>> >>>> The level at which these similar actions are presented implies that one
>> >>>> may be a preferred (or even default) one, while the other one
>> >>>> isn't.
>> >>>>
>> >>>> At one point we are going to add regular disk workflows (replacing
>> >>>> ceph-disk functionality) and then it would become even more
>> >>>> confusing to keep it there (or do you think at that point we could split?)
>> >>>>
>> >>>>>
>> >>>>> At the risk of being a bit picky about language, I don't like calling
>> >>>>> this anything with "volume" in the name, because afaik we've never
>> >>>>> ever called OSDs or the drives they occupy "volumes", so we're
>> >>>>> introducing a whole new noun, and a widely used (to mean different
>> >>>>> things) one at that.
>> >>>>>
>> >>>>
>> >>>> We have never called them 'volumes' because there was never anything
>> >>>> to support something other than regular disks; the approach
>> >>>> has always been disks and partitions.
>> >>>>
>> >>>> A "volume" can be a physical volume (e.g. a disk) or a logical one
>> >>>> (lvm, dmcache). It is an all-encompassing name to allow working
>> >>>> with different device-like things.
>> >>>
>> >>> The trouble with "volume" is that it means so many things in so many
>> >>> different storage systems -- I haven't often seen it used to mean
>> >>> "block device" or "drive".  It's more often used to describe a logical
>> >>> entity.  I also think "disk" is fine -- most people get the idea that
>> >>> a disk is a hard drive but it could also be any block device.
>> >>
>> >> If your thinking is that a disk can be any block device then yes, we
>> >> are at opposite ends of the naming question here. We are picking a
>> >> "widely used" term precisely because it is not specific. "disk" sounds
>> >> fairly specific, and we don't want that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-20  7:12                   ` Fwd: " Nathan Cutler
@ 2017-06-20 14:43                     ` Andrew Schoen
  2017-06-20 20:40                       ` Nathan Cutler
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Schoen @ 2017-06-20 14:43 UTC (permalink / raw)
  To: Nathan Cutler; +Cc: Ceph Development

On Tue, Jun 20, 2017 at 2:12 AM, Nathan Cutler <ncutler@suse.cz> wrote:
> On 06/19/2017 10:24 PM, John Spray wrote:
>>
>> Testing: being separate is only easier if you're only doing python
>> unit testing.
>
>
> Hear hear! I do a lot of functional/integration/regression testing and can
> confirm that it became a whole lot easier when ceph-qa-suite was moved
> in-tree.

Comparing ceph-qa-suite to a contained python module isn't quite a
fair comparison though. For upstream testing of ceph-volume (if it
existed outside ceph.git) you'd simply install it with pip. A jenkins
job that built every branch of ceph-volume.git and pushed them to
shaman would be easy enough as well if that's what's preferred.
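
For example (the repository URL and package name here are hypothetical,
since the standalone repo doesn't exist yet), installing a development
build could be as simple as:

    # straight from a hypothetical standalone git repository
    pip install git+https://github.com/ceph/ceph-volume.git@master

    # or, if it were published to PyPI
    pip install ceph-volume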

Honestly, I think all manner of testing (including functional) becomes
easier with this out of ceph.git. Especially when it comes to testing
something inside of a container.

Best,
Andrew

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-20 14:43                     ` Andrew Schoen
@ 2017-06-20 20:40                       ` Nathan Cutler
  2017-06-20 21:38                         ` Alfredo Deza
  0 siblings, 1 reply; 23+ messages in thread
From: Nathan Cutler @ 2017-06-20 20:40 UTC (permalink / raw)
  To: Andrew Schoen; +Cc: Ceph Development

> For upstream testing of ceph-volume (if it
> existed outside ceph.git) you'd simply install it with pip.

Like, e.g., setuptools? Always use the latest version, and you're fine?

Seriously, this would be OK if the ceph-volume maintainers ensure that 
every new release is thoroughly regression-tested against all stable 
versions of Ceph that are in maintenance at the time and agree not to 
roll out any non-backward-compatible features (?).

I recall that with ceph-deploy we did get into a situation where the 
latest version did not support hammer, so hammer users could not simply 
grab the latest version of ceph-deploy. (I guess that has been fixed in 
the meantime?) Users had to "know" which version played nice with hammer 
and grab that one specifically. Also, whatever bugs were in that 
version, they were stuck with because nothing is backported.

I suppose the Ceph project could do the same - just maintain a single
codestream ("master") and cut releases from time to time. It would be a
lot simpler! And if a distro wanted to backport fixes to a previous
release, there would be nothing to stop them, right? But for some reason
the Ceph project goes to the trouble of maintaining multiple codestreams.
And doesn't the Linux kernel project maintain stable branches, too?

Maintaining stable branches is a lot of work, so there must be good 
reasons for it. One reason I can think of is that it becomes possible to 
implement non-backward-compatible features in mainline/master, yet folks 
who value stability (i.e. distros and the vast majority of users) can 
continue to use the older codestreams and benefit from bugfixes that get 
backported.

Nathan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-20 20:40                       ` Nathan Cutler
@ 2017-06-20 21:38                         ` Alfredo Deza
  2017-06-21 11:25                           ` Nathan Cutler
  2017-06-21 11:47                           ` ceph-deploy hammer support Nathan Cutler
  0 siblings, 2 replies; 23+ messages in thread
From: Alfredo Deza @ 2017-06-20 21:38 UTC (permalink / raw)
  To: Nathan Cutler; +Cc: Andrew Schoen, Ceph Development

On Tue, Jun 20, 2017 at 4:40 PM, Nathan Cutler <ncutler@suse.cz> wrote:
>> For upstream testing of ceph-volume (if it
>> existed outside ceph.git) you'd simply install it with pip.
>
>
> Like, e.g., setuptools? Always use the latest version, and you're fine?
>
> Seriously, this would be OK if the ceph-volume maintainers ensure that every
> new release is thoroughly regression-tested against all stable versions of
> Ceph that are in maintenance at the time and agree not to roll out any
> non-backward-compatible features (?).
>
> I recall that with ceph-deploy we did get into a situation where the latest
> version did not support hammer, so hammer users could not simply grab the
> latest version of ceph-deploy. (I guess that has been fixed in the
> meantime?) Users had to "know" which version played nice with hammer and
> grab that one specifically. Also, whatever bugs were in that version, they
> were stuck with because nothing is backported.

ceph-deploy has always supported every Ceph version up until it was
EOL'd. Not sure what specific issue you are referring to.

I wasn't implying that there were no bugs in ceph-deploy :)


>
> I suppose the Ceph project could do the same - just maintain a single
> codestream ("master") and cut releases from time to time. It would be a lot
> simpler! And if a distro wanted to backport fixes to a previous release,
> nothing to stop them, right? But for some reason the Ceph project goes
> through the trouble to maintain multiple codestreams. And doesn't the Linux
> kernel project maintains stable branches, too?
>

Bummer, again, I wasn't implying such a thing (sarcasm?)

> Maintaining stable branches is a lot of work, so there must be good reasons
> for it. One reason I can think of is that it becomes possible to implement
> non-backward-compatible features in mainline/master, yet folks who value
> stability (i.e. distros and the vast majority of users) can continue to use
> the older codestreams and benefit from bugfixes that get backported.
>

I agree with you: stable branches are a lot of work, and the Ceph
backporting team does a lot of excellent work here. None of my comments
should be read otherwise.

My position is that a small, Python-only tool with no direct bindings to
Ceph other than the CLI should probably have the freedom to stay out of
tree, so that it can keep up with older versions and release often when
bugs come up (as you've mentioned happened with ceph-deploy).



> Nathan
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-20 21:38                         ` Alfredo Deza
@ 2017-06-21 11:25                           ` Nathan Cutler
  2017-06-21 11:47                           ` ceph-deploy hammer support Nathan Cutler
  1 sibling, 0 replies; 23+ messages in thread
From: Nathan Cutler @ 2017-06-21 11:25 UTC (permalink / raw)
  To: Ceph Development

> My position is that a small, Python-only tool with no direct bindings to
> Ceph other than the CLI should probably have the freedom to stay out of
> tree, so that it can keep up with older versions and release often when
> bugs come up (as you've mentioned happened with ceph-deploy).

Release often in a single codestream? In other words, you are not 
planning to maintain separate codestreams for each stable branch of Ceph?

By the way, is there any other use for ceph-volume, other than as a tool 
for deploying OSDs? If not, I don't see the "no direct bindings to 
Ceph". It is bound to Ceph directly in that Ceph is the only reason for 
its existence.

Nathan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* ceph-deploy hammer support
  2017-06-20 21:38                         ` Alfredo Deza
  2017-06-21 11:25                           ` Nathan Cutler
@ 2017-06-21 11:47                           ` Nathan Cutler
  1 sibling, 0 replies; 23+ messages in thread
From: Nathan Cutler @ 2017-06-21 11:47 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: ceph-devel

> ceph-deploy always supported up until the EOL'd versions. Not sure
> what specific issue you are referring to.

http://tracker.ceph.com/issues/17128

The bug was closed without any indication that the latest version of 
ceph-deploy actually does support hammer. Although hopefully folks are 
not actively out there installing new hammer clusters, hammer itself is 
not yet EOL.

I cite this only as an example of how a tool with a single codestream 
can bring headaches for older (not-yet-EOL) versions of Ceph when new 
features are merged into that codestream and released. In other words, 
please don't construe this as an escalation of that old ceph-deploy bug :-)

Nathan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: EXT: [ceph-users] ceph-lvm - a tool to deploy OSDs from LVM volumes
  2017-06-19 21:58                     ` Sage Weil
  2017-06-20 14:09                       ` Alfredo Deza
@ 2017-06-21 12:33                       ` Alfredo Deza
  1 sibling, 0 replies; 23+ messages in thread
From: Alfredo Deza @ 2017-06-21 12:33 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, Ceph Development

On Mon, Jun 19, 2017 at 5:58 PM, Sage Weil <sweil@redhat.com> wrote:
> On Mon, 19 Jun 2017, Alfredo Deza wrote:
>> On Mon, Jun 19, 2017 at 4:24 PM, John Spray <jspray@redhat.com> wrote:
>> > On Mon, Jun 19, 2017 at 6:53 PM, Alfredo Deza <adeza@redhat.com> wrote:
>> >>>> * faster release cycles
>> >>>> * easier and faster to test
>> >>>
>> >>> I think having one part of Ceph on a different release cycle to the
>> >>> rest of Ceph is an even more dramatic thing than having it in a
>> >>> separate git repository.
>> >>>
>> >>> It seems like there is some dissatisfaction with how the Ceph project
>> >>> as whole is doing things that is driving you to try and do work
>> >>> outside of the repo where the rest of the project lives -- if the
>> >>> release cycles or test infrastructure within Ceph are not adequate for
>> >>> the tool that formats drives for OSDs, what can we do to fix them?
>> >>
>> >> It isn't Ceph the project :)
>> >>
>> >> Not every tool about Ceph has to come from ceph.git, in which case the
>> >> argument could be flipped around: why isn't ceph-installer,
>> >> ceph-ansible, ceph-deploy, radosgw-agent, etc... all coming from
>> >> within ceph.git ?
>> >
>> > ceph-installer, ceph-deploy and ceph-ansible are special cases because
>> > they are installers, that operate before a particular version of Ceph
>> > has been selected for installation, and might operate on two
>> > differently versioned clusters at the same time.
>>
>> This is a perfect use case for ceph-volume, the OSD doesn't (and in
>> most cases this is true) care what is beneath it, as long
>> as it is mounted and has what it needs to function. The rest is
>> *almost like installation*.
>
> This isn't really true.  ceph-volume (or ceph-disk lvm, or whatever we
> call it) is going to have specific knowledge about how to provision the
> OSD.  When we change the bootstrap-osd caps and change the semantics of
> 'osd new' (take, for example, the change we just made from 'osd create' to
> 'osd new'), then ceph-mon, the cephx caps, and ceph-disk all have to
> change in unison.  More concretely, with bluestore we have all kinds of
> choices of how we provision the volumes (what sizes, what options for
> rocksdb, whatever); those opinions will be enshrined in ceph-volume, and
> they will change from version to version... likely in unison with
> bluestore itself (as the code changes, the best practice and
> recommendations change with it).
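
(For concreteness, the monitor call involved in that bootstrap step
looks roughly like the sketch below -- the exact JSON keys, keyring
path, and uuid handling are assumptions and vary by release:)

    # register a new OSD id with the monitors via the bootstrap-osd identity,
    # passing the cephx secret for the new daemon as JSON on stdin
    echo '{"cephx_secret": "AQD...example-key...=="}' | \
        ceph --name client.bootstrap-osd \
             --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring \
             -i - osd new <osd-uuid>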
>
> In contrast, I can't think of a reason why ceph-volume would change
> independently of ceph-osd.  There is no bootstrap issue like with
> installation.  And no reason why you would want to mix different
> versions of the two.
>
>
>
>> > radosgw-agent, presumably (I haven't worked on it) is separate because
>> > it sits between two clusters but is logically part of neither, and
>> > those clusters could potentially be different-versioned too.
>> >
>> > ceph-disk, on the other hand, rides alongside ceph-osd, writes a
>> > format that ceph-osd needs to understand, the two go together
>> > everywhere.  You use whatever version of ceph-disk corresponds to the
>> > ceph-osd package you have.  You run whatever ceph-osd corresponds to
>> > the version of ceph-disk you just used.  The two things are not
>> > separate, any more than ceph-objectstore-tool would be.
>>
>> The OSD needs a mounted volume that has pieces that the OSD itself
>> puts in there. It is a bit convoluted because there are other steps,
>> but the tool itself isn't crucial for the OSD to function; it is
>> borderline an orchestrator that gets the volume the OSD runs on ready.
>>
>> >
>> > It would be more intuitive if we had called ceph-disk
>> > "ceph-osd-format" or similar.  The utility that prepares drives for
>> > use by the OSD naturally belongs in the same package (or at the very
>> > least the same release!) as the OSD code that reads that on-disk
>> > format.
>> >
>> > There is a very clear distinction in my mind between things that
>> > install Ceph (i.e. they operate before the ceph packages are on the
>> > system), and things that prepare the system (a particular Ceph version
>> > is already installed, we're just getting ready to run it).
>> > ceph-objectstore-tool would be another example of something that
>> > operates on the drives, but is intimately coupled to the OSDs and
>> > would not make sense as a separately released thing.
>>
>> And ceph-disk isn't really coupled (maybe a tiny corner of it is). Or
>> can you give an example of how they are tied? I've gone through every
>> single step to get an OSD, and although in some cases it is a bit more
>> complex, it isn't more than a few steps (6 in total from our own docs):
>>
>> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#adding-an-osd-manual
>>
>> ceph-ansible *does* prepare a system for running Ceph, and so does
>> ceph-docker. ceph-disk has had some pitfalls that ceph-ansible has to
>> work around, and it has to implement other things as well to be able to
>> deploy OSDs.
>
> Again, I think the 'osd create' -> 'osd new' change is a perfect example
> of coupling.  And I anticipate others with bluestore.  For example, when we
> start supporting SPDK for NVMe (kernel bypass) the interface for setting
> that up will likely evolve and will need to match the behavior in
> ceph-volume.
>
> [...]
>
> Perhaps we can look at this from the other angle, though?  Why *should*
> this particular tool be separate?
>
>
>> >>>> Is your argument only to have parity in Ceph's branching? That was
>> >>>> never a problem with out-of-tree tools like ceph-deploy for example.
>> >>>
>> >>> I guess my argument isn't so much an argument as it is an assertion
>> >>> that if you want to go your own way then you need to have a really
>> >>> strong clear reason.
>> >>
>> >> Many! Like I mentioned: easier testing, faster release cycle, can
>> >> publish in any package index, doesn't need anything in ceph.git to
>> >> operate, etc..
>> >
>> > Testing: being separate is only easier if you're only doing python
>> > unit testing.  If you're testing that ceph-disk/ceph-volume really
>> > does its job, then you absolutely do want to be in the ceph tree, so
>> > that you can fire up an OSD that checks that ceph-disk really did its
>> > job.
>> >
>> > Faster release cycle: we release pretty often.
>>
>> Uh, it depends on what "fast" means for you. Waiting 4 months for a
>> ceph-disk fix to be merged so that ceph-ansible no longer hits that
>> bug is not really fast.
>
> Can you tell us more about this incident?  We regularly backport
> changes to the stable branches, and have a pretty regular cadence for
> stable releases.

I discussed this with Ken at some point, and going through those notes
I found a couple of things worth highlighting.

* keeping in sync with feature bits is going to be easier:
although, as I've mentioned, this is not that hard to do, it will end
up being one less thing to worry about

* delays in backporting, or in releases, are not something that should
be worked around:
in this case, it is clear I was trying to circumvent that to make other
consumers' lives easier, as with ceph-ansible. Ken emphasized that
issues like those need to bubble up so that leads are aware. A separate
tool doesn't really improve that process.

Since one of the initial agreements for this tool was to be outside of
Ceph to promote faster development to get to a release, it was
surprising to see pushback on something that I didn't think was up for
discussion.

So I am still conflicted on ease of (initial) development. This will
make things a bit slower for me, but the trade-off is that at some point
it would make sense for the tool to live alongside the rest of the Ceph
components. I will start working on porting the initial work over, and
try to get some ad-hoc test harness for pull requests, which would also
fix the issue of having to go through make check (rough sketch below).
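
Something along these lines is what I have in mind for that harness -- a
minimal sketch only, and the in-tree path, branch name, and use of tox
are all assumptions at this point:

    # in a PR job: if everything in the diff is under the tool's directory,
    # run only its own test suite instead of the full `make check`
    changed=$(git diff --name-only origin/master...HEAD)
    if ! echo "$changed" | grep -qv '^src/ceph-volume/'; then
        (cd src/ceph-volume && tox)
    else
        make check
    fi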

Having said that, I want to reiterate (seems it may have been lost in
the many back and forth) that it has not been my intention to
criticize the Ceph release cycle, or
any of the workflows for backporting features/fixes.

Thanks all for the feedback
>
>> >  We release often
>> > enough to deal with critical OSD and mon bugs.  The tool that formats
>> > OSDs doesn't need to be released more often than the OSD itself.
>>
>> It does need to be released often when the tool is new!
>
> For development, we are doing builds on a continuous basis, with new
> 'master' or branch packages every few hours in most cases.  And all of our
> deployment tools can deploy those test branches...
>
>
>> > I know how backwards that must sound, when you're looking at the
>> > possibility of having a nice self contained git repo, that contains a
>> > pypi-eligible python module, which has unit tests that run fast in
>> > jenkins on every commit.  I get the appeal!  But for the sake of the
>> > overall simplicity of Ceph, please think again, or if you really want
>> > to convert us to a multi-repo model, then make that case for the
>> > project as a whole rather than doing it individually on a bit-by-bit
>> > basis.
>>
>> We can't make the whole world of Ceph repos abide by a multi-repo
>> model today. I would need to keep counter-arguing for a few more months :)
>>
>> The examples you give for ceph-disk, and how ceph-disk is today, are
>> why we want to change things.
>>
>> It is not only about faster unit tests, or a "nice self contained git
>> repo" just because we want to release to PyPI; we are facing a situation
>> where we need faster development and increased release cycles that we
>> can't get while being in an already big repository.
>
> If I'm reading this right, the core reason is "faster development and
> increased release cycles".  Can you explain what that means at a
> practical level?  We build packages all day every day, and don't generally
> need packages at all for development testing.  And any release that
> uses ceph-volume is weeks away, and will be followed up by a regular
> cadence of point releases.  Where is the limitation?
>
> Thanks!
> sage
>
>
>>
>>
>> >
>> > John
>> >
>> >> Even in some cases like pybind, it has been requested numerous times
>> >> to get them on separate package indexes like PyPI, but that has always
>> >> been
>> >> *tremendously* difficult: http://tracker.ceph.com/issues/5900
>> >>>>>  - I agree with others that a single entrypoint (i.e. executable) will
>> >>>>> be more manageable than having conspicuously separate tools, but we
>> >>>>> shouldn't worry too much about making things "plugins" as such -- they
>> >>>>> can just be distinct code inside one tool, sharing as much or as
>> >>>>> little as they need.
>> >>>>>
>> >>>>> What if we delivered this set of LVM functionality as "ceph-disk lvm
>> >>>>> ..." commands to minimise the impression that the tooling is changing,
>> >>>>> even if internally it's all new/distinct code?
>> >>>>
>> >>>> That sounded appealing initially, but because we are introducing a
>> >>>> very different API, it would look odd to interact
>> >>>> with other subcommands without a normalized interaction. For example,
>> >>>> for 'prepare' this would be:
>> >>>>
>> >>>> ceph-disk prepare [...]
>> >>>>
>> >>>> And for LVM it would possibly be
>> >>>>
>> >>>> ceph-disk lvm prepare [...]
>> >>>>
>> >>>> The level at which these similar actions are presented implies that one
>> >>>> may be a preferred (or even default) one, while the other one
>> >>>> isn't.
>> >>>>
>> >>>> At one point we are going to add regular disk workflows (replacing
>> >>>> ceph-disk functionality) and then it would become even more
>> >>>> confusing to keep it there (or do you think at that point we could split?)
>> >>>>
>> >>>>>
>> >>>>> At the risk of being a bit picky about language, I don't like calling
>> >>>>> this anything with "volume" in the name, because afaik we've never
>> >>>>> ever called OSDs or the drives they occupy "volumes", so we're
>> >>>>> introducing a whole new noun, and a widely used (to mean different
>> >>>>> things) one at that.
>> >>>>>
>> >>>>
>> >>>> We have never called them 'volumes' because there was never anything
>> >>>> to support something other than regular disks; the approach
>> >>>> has always been disks and partitions.
>> >>>>
>> >>>> A "volume" can be a physical volume (e.g. a disk) or a logical one
>> >>>> (lvm, dmcache). It is an all-encompassing name to allow working
>> >>>> with different device-like things.
>> >>>
>> >>> The trouble with "volume" is that it means so many things in so many
>> >>> different storage systems -- I haven't often seen it used to mean
>> >>> "block device" or "drive".  It's more often used to describe a logical
>> >>> entity.  I also think "disk" is fine -- most people get the idea that
>> >>> a disk is a hard drive but it could also be any block device.
>> >>
>> >> If your thinking is that a disk can be any block device then yes, we
>> >> are at opposite ends of the naming question here. We are picking a
>> >> "widely used" term precisely because it is not specific. "disk" sounds
>> >> fairly specific, and we don't want that.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-06-21 12:33 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-16 18:11 EXT: ceph-lvm - a tool to deploy OSDs from LVM volumes Warren Wang - ISD
     [not found] ` <A5E80C1D-74F4-455B-8257-5B9E2FF6AB39-dFwxUrggiyBBDgjK7y7TUQ@public.gmane.org>
2017-06-16 18:23   ` Alfredo Deza
2017-06-16 18:42     ` EXT: [ceph-users] " Sage Weil
     [not found]     ` <CAC-Np1yqCZO7CzEjTV+hrNve857BtM_rZ+LxAi6Vf9UJkPM04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-16 19:37       ` EXT: " Willem Jan Withagen
2017-06-19 13:27       ` John Spray
     [not found]         ` <CALe9h7cV6U3A_OT9R8tv_yPoGG9zFaWF3qXV5cwYK0KM-NDu4g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-19 14:13           ` Alfredo Deza
     [not found]             ` <CAC-Np1yiRgkmhZCOij9qSBmqUo-YYtErWXk2gevYuvWKrYFyeg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-19 15:37               ` Willem Jan Withagen
2017-06-19 17:57                 ` EXT: [ceph-users] " Alfredo Deza
2017-06-19 18:57                   ` Willem Jan Withagen
2017-06-19 16:55             ` John Spray
2017-06-19 17:53               ` Alfredo Deza
2017-06-19 18:41                 ` Andrew Schoen
2017-06-19 20:24                 ` Fwd: " John Spray
2017-06-19 21:14                   ` Alfredo Deza
2017-06-19 21:58                     ` Sage Weil
2017-06-20 14:09                       ` Alfredo Deza
2017-06-21 12:33                       ` Alfredo Deza
2017-06-20  7:12                   ` Fwd: " Nathan Cutler
2017-06-20 14:43                     ` Andrew Schoen
2017-06-20 20:40                       ` Nathan Cutler
2017-06-20 21:38                         ` Alfredo Deza
2017-06-21 11:25                           ` Nathan Cutler
2017-06-21 11:47                           ` ceph-deploy hammer support Nathan Cutler
