* systemd status
@ 2015-07-28 19:13 Sage Weil
  2015-07-28 22:16 ` Travis Rhoden
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Sage Weil @ 2015-07-28 19:13 UTC (permalink / raw)
  To: osynge, ceph-devel

Hey,

I've finally had some time to play with the systemd integration branch on 
fedora 22.  It's in wip-systemd and my current list of issues includes:

- after mon creation ceph-create-keys isn't run automagically
  - Personally I kind of hate how it was always run on mon startup and not 
just during cluster creation so I wouldn't mind *so* much if this became 
an explicit step, maybe triggered by ceph-deploy, after mon create.

- udev's attempt to trigger ceph-disk isn't working for me.  The osd 
service gets started, but the mount isn't present and it fails to start.  
I'm a systemd noob and haven't sorted out how to get udev to log something 
meaningful to debug it.  Perhaps we should merge in the udev + 
systemd revamp patches here too...

- ceph-detect-init is only recently unbroken in master for fedora 22.

- ceph-deploy doesn't know that fedora should be systemd yet.

- ceph-deploy has a wip-systemd branch with a few things so far:
  - on mon create, we unconditionally systemctl enable ceph.target.  
I think osd create and mds create and rgw create should do the same thing, 
since the ceph.target is a catch-all bucket for any ceph service, and I 
don't think we want to enable it on install?
  - rgw create and mds create don't work yet
  - osd create doesn't enable ceph.target

- I'm guessing my ceph.spec changes to install the systemd unit files 
aren't quite right... please review!  The gitbuilder turnaround is so slow 
it's hard to iterate and I don't really know what I'm doing here.

Owen, I'd like to get this just a tad more functional and then merge 
ASAP, then shake out any issues in the weeks leading up to infernalis.  
What say ye?

sage



* Re: systemd status
  2015-07-28 19:13 systemd status Sage Weil
@ 2015-07-28 22:16 ` Travis Rhoden
  2015-07-29 11:25   ` Alex Elsayed
  2015-07-31  8:11 ` Owen Synge
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Travis Rhoden @ 2015-07-28 22:16 UTC (permalink / raw)
  To: Sage Weil; +Cc: Owen Synge, ceph-devel

On Tue, Jul 28, 2015 at 12:13 PM, Sage Weil <sweil@redhat.com> wrote:
> Hey,
>
> I've finally had some time to play with the systemd integration branch on
> fedora 22.  It's in wip-systemd and my current list of issues includes:
>
> - after mon creation ceph-create-keys isn't run automagically
>   - Personally I kind of hate how it was always run on mon startup and not
> just during cluster creation so I wouldn't mind *so* much if this became
> an explicit step, maybe triggered by ceph-deploy, after mon create.

I would be happy to see this become an explicit step as well.  We
could make it conditional such that ceph-deploy only runs it if we are
dealing with systemd, but I think re-running ceph-create-keys is
always safe.  It just aborts if
/etc/ceph/{cluster}.client.admin.keyring is already present.

>
> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
> service gets started but the mount isn't present and it fails to start.
> I'm a systemd noob and haven't sorted out how to get udev to log something
> meaningful to debug it.  Perhaps we should merge in the udev +
> systemd revamp patches here too...
>
> - ceph-detect-init is only recently unbroken in master for fedora 22.
>
> - ceph-deploy doesn't know that fedora should be systemd yet.
>
> - ceph-deploy has a wip-systemd branch with a few things so far:
>   - on mon create, we unconditionally systemctl enable ceph.target.
> i think osd create and mds create and rgw create should do the same thing,
> since the ceph.target is a catch-all bucket for any ceph service, and i
> don't think we want to enable it on install?
>   - rgw create and mds create don't work yet
>   - osd create doesn't enable ceph.target

yeah, the ceph-deploy changes needed to properly support systemd are
pretty big. As part of that effort, ceph-deploy could use some
refactoring to more gracefully handle multiple init systems.  Owen has
proposed some of the changes necessary already, and there is some
current discussion about what constitutes minor vs major refactoring.
See https://github.com/ceph/ceph-deploy/pull/317 for some WIP toward
abstracting the init system in ceph-deploy.

>
> - I'm guessing my ceph.spec changes to install teh systemd unit files
> aren't quite right... please review!  The gitbuilder turnaround is so slow
> it's hard to iterate and I don't really know what I'm doing here.
>
> Owen, I'd like to get this just a tad bit more functional and then merge
> ASAP, then up any issues in the weeks leading up to infernalis.  What say
> ye?
>
> sage
>

* Re: systemd status
  2015-07-28 22:16 ` Travis Rhoden
@ 2015-07-29 11:25   ` Alex Elsayed
  2015-07-29 12:55     ` Sage Weil
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Elsayed @ 2015-07-29 11:25 UTC (permalink / raw)
  To: ceph-devel

Travis Rhoden wrote:

> On Tue, Jul 28, 2015 at 12:13 PM, Sage Weil <sweil@redhat.com> wrote:
>> Hey,
>>
>> I've finally had some time to play with the systemd integration branch on
>> fedora 22.  It's in wip-systemd and my current list of issues includes:
>>
>> - after mon creation ceph-create-keys isn't run automagically
>>   - Personally I kind of hate how it was always run on mon startup and
>>   not
>> just during cluster creation so I wouldn't mind *so* much if this became
>> an explicit step, maybe triggered by ceph-deploy, after mon create.
> 
> I would be happy to see this become an explicit step as well.  We
> could make it conditional such that ceph-deploy only runs it if we are
> dealing with systemd, but I think re-running ceph-create-keys is
> always safe.  It just aborts if
> /etc/ceph/{cluster}.client.admin.keyring is already present.

Another option is to give ceph-mon@.service a Wants= and After= on 
ceph-create-keys@.service, which in turn has 
ConditionPathExists=!/path/to/key/from/templated/%I

With that, it would only run ceph-create-keys if the keys do not exist 
already - otherwise, it'd be skipped-as-successful.
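
For concreteness, a minimal sketch of the mechanism being described here -- 
the unit names, install path, and flags are illustrative guesses, not the 
actual wip-systemd files:

  # ceph-create-keys@.service (sketch; %i is the mon id)
  [Unit]
  Description=Create ceph client.admin key after mon.%i starts
  # Skip-as-successful when the admin key is already in place:
  ConditionPathExists=!/etc/ceph/ceph.client.admin.keyring

  [Service]
  Type=oneshot
  ExecStart=/usr/sbin/ceph-create-keys --cluster ceph --id %i

  # ceph-mon@.service would then pull this in via the Wants=/After= pair
  # described above, so starting the mon remains the only explicit step.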

>>
>> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
>> service gets started but the mount isn't present and it fails to start.
>> I'm a systemd noob and haven't sorted out how to get udev to log
>> something
>> meaningful to debug it.  Perhaps we should merge in the udev +
>> systemd revamp patches here too...

My personal opinion is that ceph-disk does too many things at once, 
and thus fits very poorly into the systemd architecture...

I mean, it tries to partition, format, mount, introspect the filesystem 
inside, and move the mount, depending on what the initial state was.

Now, part of the issue is that the final mountpoint depends on data inside 
the filesystem - OSD id, etc. To me, that seems... mildly absurd at least.

If the _mountpoint_ was only dependent on the partuuid, and the ceph OSD 
self-identified from the contents of the path it's passed, that would 
simplify things immensely IMO when it comes to systemd integration because 
the mount logic wouldn't need any hokey double-mounting, and could likely 
use the systemd mount machinery much more easily - thus avoiding race issues 
like the above.

>>
>> - ceph-detect-init is only recently unbroken in master for fedora 22.
>>
>> - ceph-deploy doesn't know that fedora should be systemd yet.
>>
>> - ceph-deploy has a wip-systemd branch with a few things so far:
>>   - on mon create, we unconditionally systemctl enable ceph.target.
>> i think osd create and mds create and rgw create should do the same
>> thing, since the ceph.target is a catch-all bucket for any ceph service,
>> and i don't think we want to enable it on install?
>>   - rgw create and mds create don't work yet
>>   - osd create doesn't enable ceph.target
> 
> yeah, the ceph-deploy changes needed to properly support systemd are
> pretty big. As part of that effort, ceph-deploy could use some
> refactoring to more gracefully handle multiple init systems.  Owen has
> proposed some of the changes necessary already, and there is some
> current discussion about what constitutes minor vs major refactoring.
> See https://github.com/ceph/ceph-deploy/pull/317 for some WIP toward
> abstracting the init system in ceph-deploy.
> 
>>
>> - I'm guessing my ceph.spec changes to install teh systemd unit files
>> aren't quite right... please review!  The gitbuilder turnaround is so
>> slow it's hard to iterate and I don't really know what I'm doing here.
>>
>> Owen, I'd like to get this just a tad bit more functional and then merge
>> ASAP, then up any issues in the weeks leading up to infernalis.  What say
>> ye?
>>
>> sage
>>

* Re: systemd status
  2015-07-29 11:25   ` Alex Elsayed
@ 2015-07-29 12:55     ` Sage Weil
  2015-07-29 13:09       ` Wyllys Ingersoll
  2015-07-29 14:08       ` Alex Elsayed
  0 siblings, 2 replies; 23+ messages in thread
From: Sage Weil @ 2015-07-29 12:55 UTC (permalink / raw)
  To: Alex Elsayed; +Cc: ceph-devel

On Wed, 29 Jul 2015, Alex Elsayed wrote:
> Travis Rhoden wrote:
> 
> > On Tue, Jul 28, 2015 at 12:13 PM, Sage Weil <sweil@redhat.com> wrote:
> >> Hey,
> >>
> >> I've finally had some time to play with the systemd integration branch on
> >> fedora 22.  It's in wip-systemd and my current list of issues includes:
> >>
> >> - after mon creation ceph-create-keys isn't run automagically
> >>   - Personally I kind of hate how it was always run on mon startup and
> >>   not
> >> just during cluster creation so I wouldn't mind *so* much if this became
> >> an explicit step, maybe triggered by ceph-deploy, after mon create.
> > 
> > I would be happy to see this become an explicit step as well.  We
> > could make it conditional such that ceph-deploy only runs it if we are
> > dealing with systemd, but I think re-running ceph-create-keys is
> > always safe.  It just aborts if
> > /etc/ceph/{cluster}.client.admin.keyring is already present.
> 
> Another option is to have the ceph-mon@.service have a Wants= and After= on 
> ceph-create-keys@.service, which has a 
> ConditionPathExists=!/path/to/key/from/templated/%I
> 
> With that, it would only run ceph-create-keys if the keys do not exist 
> already - otherwise, it'd be skipped-as-successful.

This sounds promising!

> >> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
> >> service gets started but the mount isn't present and it fails to start.
> >> I'm a systemd noob and haven't sorted out how to get udev to log
> >> something
> >> meaningful to debug it.  Perhaps we should merge in the udev +
> >> systemd revamp patches here too...
> 
> Personally, my opinion is that ceph-disk is doing too many things at once, 
> and thus fits very poorly into the systemd architecture...
> 
> I mean, it tries to partition, format, mount, introspect the filesystem 
> inside, and move the mount, depending on what the initial state was.

There is a series from David Disseldorp[1] that fixes much of this, by 
doing most of these steps in short-lived systemd tasks (instead of a 
complicated slow ceph-disk invocation directly from udev, which breaks 
udev).

> Now, part of the issue is that the final mountpoint depends on data inside 
> the filesystem - OSD id, etc. To me, that seems... mildly absurd at least.
> 
> If the _mountpoint_ was only dependent on the partuuid, and the ceph OSD 
> self-identified from the contents of the path it's passed, that would 
> simplify things immensely IMO when it comes to systemd integration because 
> the mount logic wouldn't need any hokey double-mounting, and could likely 
> use the systemd mount machinery much more easily - thus avoiding race issues 
> like the above.

Hmm.  Well, we could name the mount point with the uuid and symlink the 
osd id to that.  We could also do something sneaky like embed the osd id 
in the least significant bits of the uuid, but that throws away a lot of 
entropy and doesn't capture the cluster name (which also needs to be known 
before mount).

If the mounting and binding to the final location is done in a systemd job 
identified by the uuid, it seems like systemd would effectively handle the 
mutual exclusion and avoid races?

sage


[1] https://github.com/ddiss/ceph/tree/wip_bnc926756_split_udev_systemd_master


* Re: systemd status
  2015-07-29 12:55     ` Sage Weil
@ 2015-07-29 13:09       ` Wyllys Ingersoll
  2015-07-29 14:08       ` Alex Elsayed
  1 sibling, 0 replies; 23+ messages in thread
From: Wyllys Ingersoll @ 2015-07-29 13:09 UTC (permalink / raw)
  To: Sage Weil; +Cc: Alex Elsayed, Ceph Development

On Wed, Jul 29, 2015 at 8:55 AM, Sage Weil <sage@newdream.net> wrote:
> On Wed, 29 Jul 2015, Alex Elsayed wrote:
>> Travis Rhoden wrote:
>>

[...]

>> >> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
>> >> service gets started but the mount isn't present and it fails to start.
>> >> I'm a systemd noob and haven't sorted out how to get udev to log
>> >> something
>> >> meaningful to debug it.  Perhaps we should merge in the udev +
>> >> systemd revamp patches here too...
>>
>> Personally, my opinion is that ceph-disk is doing too many things at once,
>> and thus fits very poorly into the systemd architecture...
>>
>> I mean, it tries to partition, format, mount, introspect the filesystem
>> inside, and move the mount, depending on what the initial state was.
>
> There is a series from David Disseldorp[1] that fixes much of this, by
> doing most of these steps in short-lived systemd tasks (instead of a
> complicated slow ceph-disk invocation directly from udev, which breaks
> udev).


Good stuff...

Is anyone working on something similar for upstart based systems?


>
>> Now, part of the issue is that the final mountpoint depends on data inside
>> the filesystem - OSD id, etc. To me, that seems... mildly absurd at least.
>>
>> If the _mountpoint_ was only dependent on the partuuid, and the ceph OSD
>> self-identified from the contents of the path it's passed, that would
>> simplify things immensely IMO when it comes to systemd integration because
>> the mount logic wouldn't need any hokey double-mounting, and could likely
>> use the systemd mount machinery much more easily - thus avoiding race issues
>> like the above.
>
> Hmm.  Well, we could name the mount point with the uuid and symlink the
> osd id to that.  We could also do something sneaky like embed the osd id
> in the least significant bits of the uuid, but that throws away a lot of
> entropy and doesn't capture the cluster name (which also needs to be known
> before mount).
>
> If the mounting and binding to the final location is done in a systemd job
> identified by the uuid, it seems like systemd would effectively handle the
> mutual exclusion and avoid races?
>
> sage


* Re: systemd status
  2015-07-29 12:55     ` Sage Weil
  2015-07-29 13:09       ` Wyllys Ingersoll
@ 2015-07-29 14:08       ` Alex Elsayed
  2015-07-29 14:19         ` Sage Weil
  2015-08-03  8:53         ` Owen Synge
  1 sibling, 2 replies; 23+ messages in thread
From: Alex Elsayed @ 2015-07-29 14:08 UTC (permalink / raw)
  To: ceph-devel

Sage Weil wrote:

> On Wed, 29 Jul 2015, Alex Elsayed wrote:
>> Travis Rhoden wrote:
>> 
>> > On Tue, Jul 28, 2015 at 12:13 PM, Sage Weil <sweil@redhat.com> wrote:
>> >> Hey,
>> >>
>> >> I've finally had some time to play with the systemd integration branch
>> >> on
>> >> fedora 22.  It's in wip-systemd and my current list of issues
>> >> includes:
>> >>
>> >> - after mon creation ceph-create-keys isn't run automagically
>> >>   - Personally I kind of hate how it was always run on mon startup and
>> >>   not
>> >> just during cluster creation so I wouldn't mind *so* much if this
>> >> became an explicit step, maybe triggered by ceph-deploy, after mon
>> >> create.
>> > 
>> > I would be happy to see this become an explicit step as well.  We
>> > could make it conditional such that ceph-deploy only runs it if we are
>> > dealing with systemd, but I think re-running ceph-create-keys is
>> > always safe.  It just aborts if
>> > /etc/ceph/{cluster}.client.admin.keyring is already present.
>> 
>> Another option is to have the ceph-mon@.service have a Wants= and After=
>> on ceph-create-keys@.service, which has a
>> ConditionPathExists=!/path/to/key/from/templated/%I
>> 
>> With that, it would only run ceph-create-keys if the keys do not exist
>> already - otherwise, it'd be skipped-as-successful.
> 
> This sounds promising!
> 
>> >> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
>> >> service gets started but the mount isn't present and it fails to
>> >> start. I'm a systemd noob and haven't sorted out how to get udev to
>> >> log something
>> >> meaningful to debug it.  Perhaps we should merge in the udev +
>> >> systemd revamp patches here too...
>> 
>> Personally, my opinion is that ceph-disk is doing too many things at
>> once, and thus fits very poorly into the systemd architecture...
>> 
>> I mean, it tries to partition, format, mount, introspect the filesystem
>> inside, and move the mount, depending on what the initial state was.
> 
> There is a series from David Disseldorp[1] that fixes much of this, by
> doing most of these steps in short-lived systemd tasks (instead of a
> complicated slow ceph-disk invocation directly from udev, which breaks
> udev).
> 
>> Now, part of the issue is that the final mountpoint depends on data
>> inside the filesystem - OSD id, etc. To me, that seems... mildly absurd
>> at least.
>> 
>> If the _mountpoint_ was only dependent on the partuuid, and the ceph OSD
>> self-identified from the contents of the path it's passed, that would
>> simplify things immensely IMO when it comes to systemd integration
>> because the mount logic wouldn't need any hokey double-mounting, and
>> could likely use the systemd mount machinery much more easily - thus
>> avoiding race issues like the above.
> 
> Hmm.  Well, we could name the mount point with the uuid and symlink the
> osd id to that.  We could also do something sneaky like embed the osd id
> in the least significant bits of the uuid, but that throws away a lot of
> entropy and doesn't capture the cluster name (which also needs to be known
> before mount).

Does it?

If the mount point is (say) /var/ceph/$UUID, and ceph-osd can take a --
datadir parameter from which it _reads_ the cluster and ID if they aren't 
passed on the command line, I think that'd resolve the issue rather tidily 
_without_ requiring that they be known prior to mount.

And if I understand correctly, that data is _already in there_ for ceph-disk 
to mount it in the "final location" - it's just shuffling around who reads 
it.

> If the mounting and binding to the final location is done in a systemd job
> identified by the uuid, it seems like systemd would effectively handle the
> mutual exclusion and avoid races?

What I object to is the idea of a "final location" that depends on the 
contents of the filesystem - it's bass-ackwards IMO.

> sage
> 
> 
> [1]
> 
[https://github.com/ddiss/ceph/tree/wip_bnc926756_split_udev_systemd_master

* Re: systemd status
  2015-07-29 14:08       ` Alex Elsayed
@ 2015-07-29 14:19         ` Sage Weil
  2015-07-29 14:30           ` Alex Elsayed
  2015-08-03  8:53         ` Owen Synge
  1 sibling, 1 reply; 23+ messages in thread
From: Sage Weil @ 2015-07-29 14:19 UTC (permalink / raw)
  To: Alex Elsayed; +Cc: ceph-devel

On Wed, 29 Jul 2015, Alex Elsayed wrote:
> Sage Weil wrote:
> 
> > On Wed, 29 Jul 2015, Alex Elsayed wrote:
> >> Travis Rhoden wrote:
> >> 
> >> > On Tue, Jul 28, 2015 at 12:13 PM, Sage Weil <sweil@redhat.com> wrote:
> >> >> Hey,
> >> >>
> >> >> I've finally had some time to play with the systemd integration branch
> >> >> on
> >> >> fedora 22.  It's in wip-systemd and my current list of issues
> >> >> includes:
> >> >>
> >> >> - after mon creation ceph-create-keys isn't run automagically
> >> >>   - Personally I kind of hate how it was always run on mon startup and
> >> >>   not
> >> >> just during cluster creation so I wouldn't mind *so* much if this
> >> >> became an explicit step, maybe triggered by ceph-deploy, after mon
> >> >> create.
> >> > 
> >> > I would be happy to see this become an explicit step as well.  We
> >> > could make it conditional such that ceph-deploy only runs it if we are
> >> > dealing with systemd, but I think re-running ceph-create-keys is
> >> > always safe.  It just aborts if
> >> > /etc/ceph/{cluster}.client.admin.keyring is already present.
> >> 
> >> Another option is to have the ceph-mon@.service have a Wants= and After=
> >> on ceph-create-keys@.service, which has a
> >> ConditionPathExists=!/path/to/key/from/templated/%I
> >> 
> >> With that, it would only run ceph-create-keys if the keys do not exist
> >> already - otherwise, it'd be skipped-as-successful.
> > 
> > This sounds promising!
> > 
> >> >> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
> >> >> service gets started but the mount isn't present and it fails to
> >> >> start. I'm a systemd noob and haven't sorted out how to get udev to
> >> >> log something
> >> >> meaningful to debug it.  Perhaps we should merge in the udev +
> >> >> systemd revamp patches here too...
> >> 
> >> Personally, my opinion is that ceph-disk is doing too many things at
> >> once, and thus fits very poorly into the systemd architecture...
> >> 
> >> I mean, it tries to partition, format, mount, introspect the filesystem
> >> inside, and move the mount, depending on what the initial state was.
> > 
> > There is a series from David Disseldorp[1] that fixes much of this, by
> > doing most of these steps in short-lived systemd tasks (instead of a
> > complicated slow ceph-disk invocation directly from udev, which breaks
> > udev).
> > 
> >> Now, part of the issue is that the final mountpoint depends on data
> >> inside the filesystem - OSD id, etc. To me, that seems... mildly absurd
> >> at least.
> >> 
> >> If the _mountpoint_ was only dependent on the partuuid, and the ceph OSD
> >> self-identified from the contents of the path it's passed, that would
> >> simplify things immensely IMO when it comes to systemd integration
> >> because the mount logic wouldn't need any hokey double-mounting, and
> >> could likely use the systemd mount machinery much more easily - thus
> >> avoiding race issues like the above.
> > 
> > Hmm.  Well, we could name the mount point with the uuid and symlink the
> > osd id to that.  We could also do something sneaky like embed the osd id
> > in the least significant bits of the uuid, but that throws away a lot of
> > entropy and doesn't capture the cluster name (which also needs to be known
> > before mount).
> 
> Does it?
> 
> If the mount point is (say) /var/ceph/$UUID, and ceph-osd can take a --
> datadir parameter from which it _reads_ the cluster and ID if they aren't 
> passed on the command line, I think that'd resolve the issue rather tidily 
> _without_ requring that be known prior to mount.
> 
> And if I understand correctly, that data is _already in there_ for ceph-disk 
> to mount it in the "final location" - it's just shuffling around who reads 
> it.

So, we could do this.  It would mean either futzing with the ceph-osd 
config variables so that they take a $uuid substitution (passed at 
startup) -or- having ceph-disk set up a symlink from the current 
/var/lib/ceph/osd/$cluster-$id location (instead of doing the bind mount 
it currently does).

But, it'll come at some cost to operators, who won't be able to 'df' or 
'mount' and see which OSD mounts are which... they'll have to poke around 
in each directory to see which mount belongs to which OSD.

> > If the mounting and binding to the final location is done in a systemd job
> > identified by the uuid, it seems like systemd would effectively handle the
> > mutual exclusion and avoid races?
> 
> What I object to is the idea of a "final location" that depends on the 
> contents of the filesystem - it's bass-ackwards IMO.

It's unusual, but I think it can be made to work reliably.

Are there any other opinions here?

sage


* Re: systemd status
  2015-07-29 14:19         ` Sage Weil
@ 2015-07-29 14:30           ` Alex Elsayed
  2015-07-29 14:53             ` Sage Weil
  0 siblings, 1 reply; 23+ messages in thread
From: Alex Elsayed @ 2015-07-29 14:30 UTC (permalink / raw)
  To: ceph-devel

Sage Weil wrote:

> On Wed, 29 Jul 2015, Alex Elsayed wrote:
>> Sage Weil wrote:
>> 
>> > On Wed, 29 Jul 2015, Alex Elsayed wrote:
<snip some>
>> 
>> Does it?
>> 
>> If the mount point is (say) /var/ceph/$UUID, and ceph-osd can take a --
>> datadir parameter from which it _reads_ the cluster and ID if they aren't
>> passed on the command line, I think that'd resolve the issue rather
>> tidily _without_ requring that be known prior to mount.
>> 
>> And if I understand correctly, that data is _already in there_ for
>> ceph-disk to mount it in the "final location" - it's just shuffling
>> around who reads it.
> 
> So, we could do this.  It would mean either futzing with the ceph-osd
> config variables so that they take a $uuid substitution (passed at
> startup) -or- have ceph-disk set up a symlink from the current
> /var/lib/ceph/osd/$cluster-$id location (instead of doing the bind mount
> it currently does).

My thinking is more that the "osd data = " key makes a lot less sense in the 
systemd world overall - passing the OSD the full path on the commandline via 
some --datadir would mean you could trivially use systemd's instance 
templating, and just do

ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i

and be done with it. Could even do RequiresMountsFor=/var/lib/ceph/osd/%i 
too, which would order it after (and make it depend on) any systemd.mount 
units for that path.
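
Pulled together into a full template, that might look roughly like the 
sketch below (--datadir is the flag proposed here, not an existing ceph-osd 
option; the current spelling would be --osd-data):

  # ceph-osd@.service -- illustrative sketch; %i is the data dir name
  [Unit]
  Description=Ceph object storage daemon (%i)
  # Adds Requires= and After= on the mount unit(s) covering this path:
  RequiresMountsFor=/var/lib/ceph/osd/%i

  [Service]
  ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i
  Restart=on-failure

  [Install]
  WantedBy=ceph.target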

If the path comes from ceph.conf, then the systemd unit can't do 
RequiresMountsFor, because it just plain doesn't have that information, and 
so forth. You wind up giving up various systemd capabilities because ceph's 
got its own custom-built wheel.

> But, it'll come at some cost to operators, who won't be able to 'df' or
> 'mount' and see which OSD mounts are which... they'll have to poke around
> in each directory to see what mount is which.

This is a fair point, though - however, if the symlinks are just for human 
inspection rather than being critical to the operation of the system, that 
takes them out of the hot path and reduces the opportunities for failure due 
to unusual usage / extra middle steps.

Maybe put the UUID mounts in a uuid/ subdir, with $cluster-$id symlinks 
pointing into it.

>> > If the mounting and binding to the final location is done in a systemd
>> > job identified by the uuid, it seems like systemd would effectively
>> > handle the mutual exclusion and avoid races?
>> 
>> What I object to is the idea of a "final location" that depends on the
>> contents of the filesystem - it's bass-ackwards IMO.
> 
> It's unusual, but I think it can be made to work reliably.
> 
> Are there any other opinions here?




* Re: systemd status
  2015-07-29 14:30           ` Alex Elsayed
@ 2015-07-29 14:53             ` Sage Weil
  2015-07-29 15:17               ` Alex Elsayed
  0 siblings, 1 reply; 23+ messages in thread
From: Sage Weil @ 2015-07-29 14:53 UTC (permalink / raw)
  To: Alex Elsayed; +Cc: ceph-devel

On Wed, 29 Jul 2015, Alex Elsayed wrote:
> Sage Weil wrote:
> 
> > On Wed, 29 Jul 2015, Alex Elsayed wrote:
> >> Sage Weil wrote:
> >> 
> >> > On Wed, 29 Jul 2015, Alex Elsayed wrote:
> <snip some>
> >> 
> >> Does it?
> >> 
> >> If the mount point is (say) /var/ceph/$UUID, and ceph-osd can take a --
> >> datadir parameter from which it _reads_ the cluster and ID if they aren't
> >> passed on the command line, I think that'd resolve the issue rather
> >> tidily _without_ requring that be known prior to mount.
> >> 
> >> And if I understand correctly, that data is _already in there_ for
> >> ceph-disk to mount it in the "final location" - it's just shuffling
> >> around who reads it.
> > 
> > So, we could do this.  It would mean either futzing with the ceph-osd
> > config variables so that they take a $uuid substitution (passed at
> > startup) -or- have ceph-disk set up a symlink from the current
> > /var/lib/ceph/osd/$cluster-$id location (instead of doing the bind mount
> > it currently does).
> 
> My thinking is more that the "osd data = " key makes a lot less sense in the 
> systemd world overall - passing the OSD the full path on the commandline via 
> some --datadir would mean you could trivially use systemd's instance 
> templating, and just do
> 
> ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i
> 
> and be done with it. Could even do RequiresMountsFor=/var/lib/ceph/osd/%i 
> too, which would order it after (and make it depend on) any systemd.mount 
> units for that path.

Note that there is a 1:1 equivalence between command line options and 
config options, so osd data = /foo and --osd-data foo are the same thing.  
Not that I think that matters here--although it's possible to manually 
specify paths in ceph.conf, users can't do that if they want the udev magic 
to work (that's already true today, without systemd).

In any case, though, if your %i above is supposed to be the uuid, that's 
much less friendly than what we have now, where users can do

 systemctl stop ceph-osd@12

to stop osd.12.

I'm not sure it's worth giving up the bind mount complexity unless it 
really becomes painful to support, given how much nicer the admin 
experience is...

sage

> If the path comes from ceph.conf, then the systemd unit can't do 
> RequiresMountsFor, because it just plain doesn't have that information, and 
> so forth. You wind up giving up various systemd capabilities because ceph's 
> got its own custom-built wheel.
> 
> > But, it'll come at some cost to operators, who won't be able to 'df' or
> > 'mount' and see which OSD mounts are which... they'll have to poke around
> > in each directory to see what mount is which.
> 
> This is a fair point, though - however, if the symlinks just are for human 
> inspection rather than critical to the operation of the system, it takes 
> them out of the hot path and reduces the opportunities for failure due to 
> unusual usage / extra middle steps.
> 
> Maybe put the UUID mounts in a uuid/ subdir, with $cluster-$id symlinks 
> pointing into it.
> 
> >> > If the mounting and binding to the final location is done in a systemd
> >> > job identified by the uuid, it seems like systemd would effectively
> >> > handle the mutual exclusion and avoid races?
> >> 
> >> What I object to is the idea of a "final location" that depends on the
> >> contents of the filesystem - it's bass-ackwards IMO.
> > 
> > It's unusual, but I think it can be made to work reliably.
> > 
> > Are there any other opinions here?
> 
> 

* Re: systemd status
  2015-07-29 14:53             ` Sage Weil
@ 2015-07-29 15:17               ` Alex Elsayed
  2015-07-29 16:50                 ` Vasiliy Angapov
  2015-07-30 12:45                 ` Sage Weil
  0 siblings, 2 replies; 23+ messages in thread
From: Alex Elsayed @ 2015-07-29 15:17 UTC (permalink / raw)
  To: ceph-devel

Sage Weil wrote:

> On Wed, 29 Jul 2015, Alex Elsayed wrote:
<snip for gmane>
>> My thinking is more that the "osd data = " key makes a lot less sense in
>> the systemd world overall - passing the OSD the full path on the
>> commandline via some --datadir would mean you could trivially use
>> systemd's instance templating, and just do
>> 
>> ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i
>> 
>> and be done with it. Could even do RequiresMountsFor=/var/lib/ceph/osd/%i
>> too, which would order it after (and make it depend on) any systemd.mount
>> units for that path.
> 
> Note that there is a 1:1 equivalence between command line options and
> config options, so osd data = /foo and --osd-data foo are the same thing.
> Not that I think that matters here--although it's possible to manually
> specify paths in ceph.conf users can't do that if they want the udev magic
> to work (that's already true today, without systemd).

Sure, though my thought was that the udev magic would work more sanely _via_ 
this. The missing part is loading the cluster and ID from the OSD data dir.

> In any case, though, if your %i above is supposed to be the uuid, that's
> much less friendly than what we have now, where users can do
> 
>  systemctl stop ceph-osd@12
> 
> to stop osd.12.
> 
> I'm not sure it's worth giving up the bind mount complexity unless it
> really becomes painful to support, given how much nicer the admin
> experience is...

Well, that does presuppose that they've either SSHed into the machine 
manually, or are using systemctl -H to reach it remotely. That's already 
not an especially nice user experience, since they need to manually consider 
the cluster's structure.

Something more like 'ceph tell osd.N die' or similar could work, and 
SuccessExitStatus= could be used to make it even nicer (so that even if it 
exits with a different status for "die" than for other successes, 
systemd can say "any of these exit codes are okay, don't autorestart").

However, neither of those handles unmounting, and it still doesn't handle 
starting. All of the above are still partial solutions; hopefully iteration 
can result in something better in all ways.

Also, note that if RequiresMountsFor= is used, unmounting the filesystem - 
by device or by mountpoint - will stop the unit due to proper dependency 
handling. (If RMF doesn't, BindsTo does - BindsTo will additionally do so if 
the device is unmounted or suddenly unplugged without systemd intervention)

systemctl stop dev-sdc.device # all OSDs running off of sdc stop
systemctl stop dev-sdd1.device # Just one partition this time

Nice and tidy.



* Re: systemd status
  2015-07-29 15:17               ` Alex Elsayed
@ 2015-07-29 16:50                 ` Vasiliy Angapov
  2015-07-31 13:29                   ` Owen Synge
  2015-07-30 12:45                 ` Sage Weil
  1 sibling, 1 reply; 23+ messages in thread
From: Vasiliy Angapov @ 2015-07-29 16:50 UTC (permalink / raw)
  To: ceph-devel

Hi colleagues,

I see some systemd-related activity here. Can you please also have a
look at how I manage Ceph with systemd -
https://github.com/angapov/ceph-systemd/ ?
It uses a systemd generator script, which is called every time the host
boots up or when we issue "systemctl daemon-reload". It automates all
the routine work of adding/removing systemd unit files. It also has
convenient ceph-osd and ceph-mon targets, which allow starting/stopping
all OSDs/MONs at once.
I have a production cluster working with this already, so it works fine for
me. It handles only the OSD and MON daemons for the moment, but RGW can be
added in seconds.

The idea of systemd generators gives Ceph much of the flexibility that
the original init script had.
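
For readers who haven't used them: generators are small executables that
systemd runs very early at boot and on every daemon-reload, passing them
output directories into which they can write units or symlinks.  A minimal
illustrative sketch (not Vasily's actual script -- see the repository above
for that):

  #!/bin/sh
  # Enumerate locally prepared OSD data dirs and enable a ceph-osd@
  # instance for each by symlinking into the generator's "normal"
  # output directory (the first argument systemd passes).
  normal_dir="$1"
  wants_dir="$normal_dir/ceph-osd.target.wants"
  for dir in /var/lib/ceph/osd/ceph-*; do
      [ -f "$dir/whoami" ] || continue
      id="$(cat "$dir/whoami")"
      mkdir -p "$wants_dir"
      ln -sf /usr/lib/systemd/system/ceph-osd@.service \
             "$wants_dir/ceph-osd@$id.service"
  done

As Owen notes further down the thread, the catch is that generators run
before local filesystems are mounted, so a scan like this can come up
empty at boot.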

Best regards, Vasily.

On Wed, Jul 29, 2015 at 11:17 PM, Alex Elsayed <eternaleye@gmail.com> wrote:
> Sage Weil wrote:
>
>> On Wed, 29 Jul 2015, Alex Elsayed wrote:
> <snip for gmane>
>>> My thinking is more that the "osd data = " key makes a lot less sense in
>>> the systemd world overall - passing the OSD the full path on the
>>> commandline via some --datadir would mean you could trivially use
>>> systemd's instance templating, and just do
>>>
>>> ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i
>>>
>>> and be done with it. Could even do RequiresMountsFor=/var/lib/ceph/osd/%i
>>> too, which would order it after (and make it depend on) any systemd.mount
>>> units for that path.
>>
>> Note that there is a 1:1 equivalence between command line options and
>> config options, so osd data = /foo and --osd-data foo are the same thing.
>> Not that I think that matters here--although it's possible to manually
>> specify paths in ceph.conf users can't do that if they want the udev magic
>> to work (that's already true today, without systemd).
>
> Sure, though my thought was that the udev magic would work more sanely _via_
> this. The missing part is loading the cluster and ID from the OSD data dir.
>
>> In any case, though, if your %i above is supposed to be the uuid, that's
>> much less friendly than what we have now, where users can do
>>
>>  systemctl stop ceph-osd@12
>>
>> to stop osd.12.
>>
>> I'm not sure it's worth giving up the bind mount complexity unless it
>> really becomes painful to support, given how much nicer the admin
>> experience is...
>
> Well, that does presuppose that they've either SSHed into the machine
> manually, or are using systemctl -H to do so via systemctl. That's already
> not an especially nice user experience, since they need to manually consider
> the cluster's structure.
>
> Something more like 'ceph tell osd.N die' or similar could work, and
> SuccessExitStatus= could be used to make it even nicer (that even if it
> gives a different exit status for "die" as opposed to other successes,
> systemd can say "any of these exit codes are okay, don't autorestart")
>
> However, neither of those handles unmounting, and it still doesn't handle
> starting. All of the above are still partial solutions; hopefully iteration
> can result in something better in all ways.
>
> Also, note that if RequiresMountsFor= is used, unmounting the filesystem -
> by device or by mountpoint - will stop the unit due to proper dependency
> handling. (If RMF doesn't, BindsTo does - BindsTo will additionally do so if
> the device is unmounted or suddenly unplugged without systemd intervention)
>
> systemctl stop dev-sdc.device # all OSDs running off of sdc stop
> systemctl stop dev-sdd1.device # Just one partition this time
>
> Nice and tidy.
>

* Re: systemd status
  2015-07-29 15:17               ` Alex Elsayed
  2015-07-29 16:50                 ` Vasiliy Angapov
@ 2015-07-30 12:45                 ` Sage Weil
  2015-07-30 19:40                   ` Robert LeBlanc
  1 sibling, 1 reply; 23+ messages in thread
From: Sage Weil @ 2015-07-30 12:45 UTC (permalink / raw)
  To: Alex Elsayed; +Cc: ceph-devel

On Wed, 29 Jul 2015, Alex Elsayed wrote:
> Sage Weil wrote:
> 
> > On Wed, 29 Jul 2015, Alex Elsayed wrote:
> <snip for gmane>
> >> My thinking is more that the "osd data = " key makes a lot less sense in
> >> the systemd world overall - passing the OSD the full path on the
> >> commandline via some --datadir would mean you could trivially use
> >> systemd's instance templating, and just do
> >> 
> >> ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i
> >> 
> >> and be done with it. Could even do RequiresMountsFor=/var/lib/ceph/osd/%i
> >> too, which would order it after (and make it depend on) any systemd.mount
> >> units for that path.
> > 
> > Note that there is a 1:1 equivalence between command line options and
> > config options, so osd data = /foo and --osd-data foo are the same thing.
> > Not that I think that matters here--although it's possible to manually
> > specify paths in ceph.conf users can't do that if they want the udev magic
> > to work (that's already true today, without systemd).
> 
> Sure, though my thought was that the udev magic would work more sanely _via_ 
> this. The missing part is loading the cluster and ID from the OSD data dir.
> 
> > In any case, though, if your %i above is supposed to be the uuid, that's
> > much less friendly than what we have now, where users can do
> > 
> >  systemctl stop ceph-osd@12
> > 
> > to stop osd.12.
> > 
> > I'm not sure it's worth giving up the bind mount complexity unless it
> > really becomes painful to support, given how much nicer the admin
> > experience is...
> 
> Well, that does presuppose that they've either SSHed into the machine 
> manually, or are using systemctl -H to do so via systemctl. That's already 
> not an especially nice user experience, since they need to manually consider 
> the cluster's structure.
> 
> Something more like 'ceph tell osd.N die' or similar could work, and 
> SuccessExitStatus= could be used to make it even nicer (that even if it 
> gives a different exit status for "die" as opposed to other successes, 
> systemd can say "any of these exit codes are okay, don't autorestart")
> 
> However, neither of those handles unmounting, and it still doesn't handle 
> starting. All of the above are still partial solutions; hopefully iteration 
> can result in something better in all ways.
> 
> Also, note that if RequiresMountsFor= is used, unmounting the filesystem - 
> by device or by mountpoint - will stop the unit due to proper dependency 
> handling. (If RMF doesn't, BindsTo does - BindsTo will additionally do so if 
> the device is unmounted or suddenly unplugged without systemd intervention)
> 
> systemctl stop dev-sdc.device # all OSDs running off of sdc stop
> systemctl stop dev-sdd1.device # Just one partition this time
> 
> Nice and tidy.

So, it seems like plan B would be something like:

- mounts on /var/lib/ceph/osd/data/$uuid.  For new backends that have 
multiple mounts (newstore likely will), we may also have something like 
/var/lib/ceph/osd/data-fast/$uuid as an SSD partition or something.

- systemd ceph-osd@$uuid task runs
    ceph-osd --cluster ceph --id 123 --osd-uuid $uuid

- simpler udev rules

- simpler ceph-disk behavior

- The 'one cluster per host' restriction would go away.  This is currently 
there because we only have a single systemd parameter for the @ services 
and we're using the osd id (which is not unique across clusters).  The 
uuid would be, so that's a win.

But,

- admin can't tell from 'systemctl | grep ceph' or from 'df' or 'mount' 
which OSD is which, but they could from 'ps ax | grep ceph-osd'.

- stopping an individual osd would be done by $uuid instead of osd id:

 systemctl stop ceph-osd@66f354f2-752e-409f-8194-be05f6b071d9

For an admin this is probably a cut&paste from ps ax output?

- we could perhaps add 'ceph-disk stop' and 'ceph-disk umount' commands 
to make this a bit simpler?
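
As a concrete illustration of the template this plan implies -- everything
here is a sketch, and how the osd id and cluster name get resolved from the
data dir is still hand-waved, just as in the command line above:

  # ceph-osd@.service -- plan B sketch; %i is the osd uuid
  [Unit]
  Description=Ceph OSD %i
  RequiresMountsFor=/var/lib/ceph/osd/data/%i

  [Service]
  # --osd-data points at the uuid-named mount; the daemon (or a wrapper)
  # would read the osd id and cluster name from files inside it.
  ExecStart=/usr/bin/ceph-osd -f --cluster ceph --osd-data /var/lib/ceph/osd/data/%i
  Restart=on-failure

  [Install]
  WantedBy=ceph.target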

What do people think?  I like simple, but I don't want to make life too 
hard on the admin.

sage


* Re: systemd status
  2015-07-30 12:45                 ` Sage Weil
@ 2015-07-30 19:40                   ` Robert LeBlanc
  0 siblings, 0 replies; 23+ messages in thread
From: Robert LeBlanc @ 2015-07-30 19:40 UTC (permalink / raw)
  To: Sage Weil; +Cc: Alex Elsayed, ceph-devel

I could do this if it means much better systemd
integration/monitoring/etc. Creating a directory of symlinks should be
trivial and I would use it for mapping/manual navigation.

I also like the idea of "ceph tell osd.x die". Could the OSD process
call systemctl to shut down the process and unmount itself?
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Jul 30, 2015 at 6:45 AM, Sage Weil  wrote:
> On Wed, 29 Jul 2015, Alex Elsayed wrote:
>> Sage Weil wrote:
>>
>> > On Wed, 29 Jul 2015, Alex Elsayed wrote:
>>
>> >> My thinking is more that the "osd data = " key makes a lot less sense in
>> >> the systemd world overall - passing the OSD the full path on the
>> >> commandline via some --datadir would mean you could trivially use
>> >> systemd's instance templating, and just do
>> >>
>> >> ExecStart=/usr/bin/ceph-osd -f --datadir=/var/lib/ceph/osd/%i
>> >>
>> >> and be done with it. Could even do RequiresMountsFor=/var/lib/ceph/osd/%i
>> >> too, which would order it after (and make it depend on) any systemd.mount
>> >> units for that path.
>> >
>> > Note that there is a 1:1 equivalence between command line options and
>> > config options, so osd data = /foo and --osd-data foo are the same thing.
>> > Not that I think that matters here--although it's possible to manually
>> > specify paths in ceph.conf users can't do that if they want the udev magic
>> > to work (that's already true today, without systemd).
>>
>> Sure, though my thought was that the udev magic would work more sanely _via_
>> this. The missing part is loading the cluster and ID from the OSD data dir.
>>
>> > In any case, though, if your %i above is supposed to be the uuid, that's
>> > much less friendly than what we have now, where users can do
>> >
>> >  systemctl stop ceph-osd@12
>> >
>> > to stop osd.12.
>> >
>> > I'm not sure it's worth giving up the bind mount complexity unless it
>> > really becomes painful to support, given how much nicer the admin
>> > experience is...
>>
>> Well, that does presuppose that they've either SSHed into the machine
>> manually, or are using systemctl -H to do so via systemctl. That's already
>> not an especially nice user experience, since they need to manually consider
>> the cluster's structure.
>>
>> Something more like 'ceph tell osd.N die' or similar could work, and
>> SuccessExitStatus= could be used to make it even nicer (that even if it
>> gives a different exit status for "die" as opposed to other successes,
>> systemd can say "any of these exit codes are okay, don't autorestart")
>>
>> However, neither of those handles unmounting, and it still doesn't handle
>> starting. All of the above are still partial solutions; hopefully iteration
>> can result in something better in all ways.
>>
>> Also, note that if RequiresMountsFor= is used, unmounting the filesystem -
>> by device or by mountpoint - will stop the unit due to proper dependency
>> handling. (If RMF doesn't, BindsTo does - BindsTo will additionally do so if
>> the device is unmounted or suddenly unplugged without systemd intervention)
>>
>> systemctl stop dev-sdc.device # all OSDs running off of sdc stop
>> systemctl stop dev-sdd1.device # Just one partition this time
>>
>> Nice and tidy.
>
> So, it seems like plan B would be something like:
>
> - mounts on /var/lib/ceph/osd/data/$uuid.  For new backends that have
> multiple mounts (newstore likely will), we may also have something like
> /var/lib/ceph/osd/data-fast/$uuid as an SSD partition or something.
>
> - systemd ceph-osd@$uuid task runs
>     ceph-osd --cluster ceph --id 123 --osd-uuid $uuid
>
> - simpler udev rules
>
> - simpler ceph-disk behavior
>
> - The 'one cluster per host' restriction would go away.  This is currently
> there because we only have a single systemd parameter for the @ services
> and we're using the osd id (which is not unique across clusters).  The
> uuid would be, so that's a win.
>
> But,
>
> - admin can't tell from 'systemctl | grep ceph' or from 'df' or 'mount'
> which OSD is which, but they could from 'ps ax | grep ceph-osd'.
>
> - stopping an individual osd would be done by $uuid instead of osd id:
>
>  systemctl stop ceph-osd@66f354f2-752e-409f-8194-be05f6b071d9
>
> For an admin this is probably a cut&paste from ps ax output?
>
> - we could perhaps make a 'ceph-disk stop' and 'ceph-disk umount' commands
> to make this a bit simpler?
>
> What do people think?  I like simple, but I don't want to make life too
> hard on the admin.
>
> sage

* Re: systemd status
  2015-07-28 19:13 systemd status Sage Weil
  2015-07-28 22:16 ` Travis Rhoden
@ 2015-07-31  8:11 ` Owen Synge
  2015-07-31 13:23 ` Owen Synge
  2015-08-03 19:01 ` Fedora 22 systemd and ceph-deploy Owen Synge
  3 siblings, 0 replies; 23+ messages in thread
From: Owen Synge @ 2015-07-31  8:11 UTC (permalink / raw)
  To: Sage Weil, ceph-devel

> Owen, I'd like to get this just a tad bit more functional and then merge 
> ASAP, then up any issues in the weeks leading up to infernalis.  What say 
> ye?

I will look into this today and deploy a cluster on fedora that is close
to equivalent to what we have on SUSE.

I'll give you a report at the end of the day.

Best regards

Owen




* Re: systemd status
  2015-07-28 19:13 systemd status Sage Weil
  2015-07-28 22:16 ` Travis Rhoden
  2015-07-31  8:11 ` Owen Synge
@ 2015-07-31 13:23 ` Owen Synge
  2015-08-03 19:01 ` Fedora 22 systemd and ceph-deploy Owen Synge
  3 siblings, 0 replies; 23+ messages in thread
From: Owen Synge @ 2015-07-31 13:23 UTC (permalink / raw)
  To: Sage Weil, ceph-devel

On 07/28/2015 09:13 PM, Sage Weil wrote:
> Hey,
> 
> I've finally had some time to play with the systemd integration branch on 
> fedora 22.  It's in wip-systemd and my current list of issues includes:
> 
> - after mon creation ceph-create-keys isn't run automagically
>   - Personally I kind of hate how it was always run on mon startup and not 
> just during cluster creation so I wouldn't mind *so* much if this became 
> an explicit step, maybe triggered by ceph-deploy, after mon create.

I agree.

> - udev's attempt to trigger ceph-disk isn't working for me.  the osd 
> service gets started but the mount isn't present and it fails to start.

I have not had this issue with SUSE; I will investigate soon.

> I'm a systemd noob and haven't sorted out how to get udev to log something 
> meaningful to debug it.  Perhaps we should merge in the udev + 
> systemd revamp patches here too...
> 
> - ceph-detect-init is only recently unbroken in master for fedora 22.

Noted. I have not yet tested the systemd patches on anything but
opensuse and SLE12.

> - ceph-deploy doesn't know that fedora should be systemd yet.

Either ceph or ceph-deploy has to change this first. I propose ceph.

> - ceph-deploy has a wip-systemd branch with a few things so far:
>   - on mon create, we unconditionally systemctl enable ceph.target.  
> i think osd create and mds create and rgw create should do the same thing, 
> since the ceph.target is a catch-all bucket for any ceph service, and i 
> don't think we want to enable it on install?
>   - rgw create and mds create don't work yet
>   - osd create doesn't enable ceph.target

rgw does work on SUSE, but I noticed a bug in the patches you were
working on. I am looking at the other issues you raise.

https://github.com/SUSE/ceph/commit/08c805c18dd994d7b2c4c521a209bb50310b1bbd


Note that mds is something I have never tested; I have never even used the mds.

> - I'm guessing my ceph.spec changes to install teh systemd unit files 
> aren't quite right... please review!  The gitbuilder turnaround is so slow 
> it's hard to iterate and I don't really know what I'm doing here.

Yes, I found this too on SUSE; I think I am close to fixing these issues for
SUSE. Once they are fixed I will once again look at fedora.


Best regards

Owen


* Re: systemd status
  2015-07-29 16:50                 ` Vasiliy Angapov
@ 2015-07-31 13:29                   ` Owen Synge
  0 siblings, 0 replies; 23+ messages in thread
From: Owen Synge @ 2015-07-31 13:29 UTC (permalink / raw)
  To: Vasiliy Angapov, ceph-devel



On 07/29/2015 06:50 PM, Vasiliy Angapov wrote:
> Hi colleagues,
> 
> I see some systemd-related actions here. Can you please also have a
> look at how I managed to rule Ceph with systemd -
> https://github.com/angapov/ceph-systemd/ ?
> It uses systemd generator script, which is called every time host
> boots up or when we issue "systemctl daemon-reload". It automates all
> the routine job of adding/removing systemd unit files. It also has a
> convenient ceph-osd and ceph-mon targets, which allows to start/stop
> OSDs/MONs all at once.
> I got production cluster working with it already, so this is fine for
> me. It handles only OSD and MON daemons for the moment but RGW can be
> added in a seconds.
> 
> The idea of systemd generators adds much more flexibility to Ceph like
> the original init script has.

I tried to use generators some time ago, but since all generators
were run before root was mounted or disks were detected, this caused
issues on rebooting.

How did you work around this? If you didn't, what OS and version of
systemd are you using?

Best regards

Owen


* Re: systemd status
  2015-07-29 14:08       ` Alex Elsayed
  2015-07-29 14:19         ` Sage Weil
@ 2015-08-03  8:53         ` Owen Synge
  1 sibling, 0 replies; 23+ messages in thread
From: Owen Synge @ 2015-08-03  8:53 UTC (permalink / raw)
  To: Alex Elsayed, ceph-devel



On 07/29/2015 04:08 PM, Alex Elsayed wrote:
> Sage Weil wrote:
> 
>> On Wed, 29 Jul 2015, Alex Elsayed wrote:
>>> Travis Rhoden wrote:
>>>
>>>> On Tue, Jul 28, 2015 at 12:13 PM, Sage Weil <sweil@redhat.com> wrote:
>>>>> Hey,
>>>>>
>>>>> I've finally had some time to play with the systemd integration branch
>>>>> on
>>>>> fedora 22.  It's in wip-systemd and my current list of issues
>>>>> includes:
>>>>>
>>>>> - after mon creation ceph-create-keys isn't run automagically
>>>>>   - Personally I kind of hate how it was always run on mon startup and
>>>>>   not
>>>>> just during cluster creation so I wouldn't mind *so* much if this
>>>>> became an explicit step, maybe triggered by ceph-deploy, after mon
>>>>> create.
>>>>
>>>> I would be happy to see this become an explicit step as well.  We
>>>> could make it conditional such that ceph-deploy only runs it if we are
>>>> dealing with systemd, but I think re-running ceph-create-keys is
>>>> always safe.  It just aborts if
>>>> /etc/ceph/{cluster}.client.admin.keyring is already present.
>>>
>>> Another option is to have the ceph-mon@.service have a Wants= and After=
>>> on ceph-create-keys@.service, which has a
>>> ConditionPathExists=!/path/to/key/from/templated/%I
>>>
>>> With that, it would only run ceph-create-keys if the keys do not exist
>>> already - otherwise, it'd be skipped-as-successful.
>>
>> This sounds promising!
>>
>>>>> - udev's attempt to trigger ceph-disk isn't working for me.  the osd
>>>>> service gets started but the mount isn't present and it fails to
>>>>> start. I'm a systemd noob and haven't sorted out how to get udev to
>>>>> log something
>>>>> meaningful to debug it.  Perhaps we should merge in the udev +
>>>>> systemd revamp patches here too...
>>>
>>> Personally, my opinion is that ceph-disk is doing too many things at
>>> once, and thus fits very poorly into the systemd architecture...
>>>
>>> I mean, it tries to partition, format, mount, introspect the filesystem
>>> inside, and move the mount, depending on what the initial state was.
>>
>> There is a series from David Disseldorp[1] that fixes much of this, by
>> doing most of these steps in short-lived systemd tasks (instead of a
>> complicated slow ceph-disk invocation directly from udev, which breaks
>> udev).
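The general mechanism in that approach is for udev to do nothing itself
beyond tagging the device and handing it to a short-lived unit. A rough
sketch (the partition-type GUID is a placeholder, and this is not David's
actual series):

    # /usr/lib/udev/rules.d/95-ceph-osd.rules (sketch)
    ACTION=="add", SUBSYSTEM=="block", ENV{ID_PART_ENTRY_TYPE}=="<ceph-osd-data-ptype-guid>", TAG+="systemd", ENV{SYSTEMD_WANTS}+="ceph-disk-activate@%k.service"

    # ceph-disk-activate@.service (sketch)
    [Unit]
    Description=Activate Ceph OSD on /dev/%i

    [Service]
    Type=oneshot
    ExecStart=/usr/sbin/ceph-disk activate /dev/%i

That keeps the udev event handler fast and moves the slow mount/activate
work into a unit that systemd can log and supervise.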
>>
>>> Now, part of the issue is that the final mountpoint depends on data
>>> inside the filesystem - OSD id, etc. To me, that seems... mildly absurd
>>> at least.
>>>
>>> If the _mountpoint_ was only dependent on the partuuid, and the ceph OSD
>>> self-identified from the contents of the path it's passed, that would
>>> simplify things immensely IMO when it comes to systemd integration
>>> because the mount logic wouldn't need any hokey double-mounting, and
>>> could likely use the systemd mount machinery much more easily - thus
>>> avoiding race issues like the above.
>>
>> Hmm.  Well, we could name the mount point with the uuid and symlink the
>> osd id to that.  We could also do something sneaky like embed the osd id
>> in the least significant bits of the uuid, but that throws away a lot of
>> entropy and doesn't capture the cluster name (which also needs to be known
>> before mount).
> 
> Does it?
> 
> If the mount point is (say) /var/ceph/$UUID, and ceph-osd can take a
> --datadir parameter from which it _reads_ the cluster and ID if they
> aren't passed on the command line, I think that'd resolve the issue
> rather tidily _without_ requiring those be known prior to mount.
> 
> And if I understand correctly, that data is _already in there_ for ceph-disk 
> to mount it in the "final location" - it's just shuffling around who reads 
> it.
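In unit-file terms that proposal comes out something like this sketch
(the --datadir option here is the behaviour being proposed, not an
existing ceph-osd flag):

    # ceph-osd@.service, instanced by the data-dir uuid (sketch)
    [Unit]
    Description=Ceph OSD using data directory /var/ceph/%i
    RequiresMountsFor=/var/ceph/%i

    [Service]
    # hypothetical flag: the daemon reads cluster name and osd id
    # from the data directory rather than the command line
    ExecStart=/usr/bin/ceph-osd -f --datadir /var/ceph/%i

    [Install]
    WantedBy=ceph.target

so the admin-visible instance name is only the uuid, and everything else
is read from the mounted filesystem.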
> 
>> If the mounting and binding to the final location is done in a systemd job
>> identified by the uuid, it seems like systemd would effectively handle the
>> mutual exclusion and avoid races?
> 
> What I object to is the idea of a "final location" that depends on the 
> contents of the filesystem - it's bass-ackwards IMO.

As I understand it this discussion is about:

	systemctl start ceph-osd@12

Vs:

	systemctl start ceph-osd@354a1e62-6f35-4b74-b633-3a8ac302cd77

I think you have a very sound argument that "12" is ambiguous without a
cluster name, since two different clusters can each have an OSD "12".
Personally I do not think the complexity of using ceph-disk is too
important, as we can improve this later.

I also worry that, at the same time, you are not considering just how
ugly it is to have to type UUIDs without cut and paste.

Can we square the circle and use systemd plus some helper scripts to
overcome the requirement that UUIDs _have_ to be used?

To me the perfect end result would be that system admins can use either
the UUID or the ID to describe the service they wish to start and stop,
so we can unambiguously start and stop OSDs of different clusters and not
_have_ to type much when there is no ambiguity.
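For example, a small wrapper could accept either form and translate the
short one (a sketch only, assuming a per-id directory such as
/var/lib/ceph/osd/<cluster>-<id> still exists, for instance via the
symlink Sage mentions, and contains the fsid file):

    #!/bin/sh
    # ceph-osd-ctl (sketch): start an OSD by id or by uuid
    # usage: ceph-osd-ctl <id-or-uuid> [cluster]
    arg="$1"
    cluster="${2:-ceph}"
    case "$arg" in
        *-*-*-*-*) uuid="$arg" ;;   # already looks like a uuid
        *)         uuid=$(cat "/var/lib/ceph/osd/$cluster-$arg/fsid") ;;
    esac
    exec systemctl start "ceph-osd@$uuid.service"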

Best regards

Owen





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Fedora 22 systemd and ceph-deploy
  2015-07-28 19:13 systemd status Sage Weil
                   ` (2 preceding siblings ...)
  2015-07-31 13:23 ` Owen Synge
@ 2015-08-03 19:01 ` Owen Synge
  2015-08-03 19:07   ` Sage Weil
  3 siblings, 1 reply; 23+ messages in thread
From: Owen Synge @ 2015-08-03 19:01 UTC (permalink / raw)
  To: Sage Weil, ceph-devel

Dear all,

My plan is to make a fedora22-systemd branch. I will leave fedora 20 as
sysvinit.

OK, I have just done my first proper install of the systemd Ceph branch
on Fedora 22.

I can confirm most of the issues.

I am giving up for the day, but so far applying the SUSE/openSUSE code to
the Fedora host type in ceph-deploy has helped a lot.

    cp /usr/lib/python2.7/site-packages/ceph_deploy/hosts/suse/mon/* \
      /usr/lib/python2.7/site-packages/ceph_deploy/hosts/fedora/mon/

(running the SUSE-patched release of ceph-deploy)

It can now set up the mon daemons correctly by itself.

I will look into the udev rules tomorrow morning, and remove some more
Fedora hard-coding to "sysvinit".

Best regards

Owen


On 07/28/2015 09:13 PM, Sage Weil wrote:
> Hey,
> 
> I've finally had some time to play with the systemd integration branch on 
> fedora 22.  It's in wip-systemd and my current list of issues includes:
> 
> - after mon creation ceph-create-keys isn't run automagically
>   - Personally I kind of hate how it was always run on mon startup and not 
> just during cluster creation so I wouldn't mind *so* much if this became 
> an explicit step, maybe triggered by ceph-deploy, after mon create.
> 
> - udev's attempt to trigger ceph-disk isn't working for me.  the osd 
> service gets started but the mount isn't present and it fails to start.  
> I'm a systemd noob and haven't sorted out how to get udev to log something 
> meaningful to debug it.  Perhaps we should merge in the udev + 
> systemd revamp patches here too...
> 
> - ceph-detect-init is only recently unbroken in master for fedora 22.
> 
> - ceph-deploy doesn't know that fedora should be systemd yet.
> 
> - ceph-deploy has a wip-systemd branch with a few things so far:
>   - on mon create, we unconditionally systemctl enable ceph.target.  
> i think osd create and mds create and rgw create should do the same thing, 
> since the ceph.target is a catch-all bucket for any ceph service, and i 
> don't think we want to enable it on install?
>   - rgw create and mds create don't work yet
>   - osd create doesn't enable ceph.target
> 
> - I'm guessing my ceph.spec changes to install teh systemd unit files 
> aren't quite right... please review!  The gitbuilder turnaround is so slow 
> it's hard to iterate and I don't really know what I'm doing here.
> 
> Owen, I'd like to get this just a tad bit more functional and then merge 
> ASAP, then up any issues in the weeks leading up to infernalis.  What say 
> ye?
> 
> sage
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
HRB 21284 (AG Nürnberg)
Maxfeldstraße 5, 90409 Nürnberg, Germany

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fedora 22 systemd and ceph-deploy
  2015-08-03 19:01 ` Fedora 22 systemd and ceph-deploy Owen Synge
@ 2015-08-03 19:07   ` Sage Weil
  2015-08-04 10:13     ` Owen Synge
  0 siblings, 1 reply; 23+ messages in thread
From: Sage Weil @ 2015-08-03 19:07 UTC (permalink / raw)
  To: Owen Synge; +Cc: ceph-devel


On Mon, 3 Aug 2015, Owen Synge wrote:
> Dear all,
> 
> My plan is to make a fedora22-systemd branch. I will leave fedora 20 as
> sysvinit.
> 
> Ok just done my first proper install of systemd ceph branch on fedora22.
> 
> I can confirm most of the issues.
> 
> I am giving up for the day, but so far applying SUSE/opensuse code to
> Fedora ceph-deploy code in ceph-deploy has helped a lot.
> 
>     cp /usr/lib/python2.7/site-packages/ceph_deploy/hosts/suse/mon/* \
>       /usr/lib/python2.7/site-packages/ceph_deploy/hosts/fedora/mon/
> 
> (Running on suse version patched release)
> 
> It can now set up the mon daemons correctly its self.
> 
> I will look into the udev rules, tomorrow morning, and remove some more
> fedora hard coding to "sysvinit".

There is a wip-systemd branch of ceph-deploy that has enough changes for 
me to successfully deploy mon, osd, mds, and rgw.  The main thing it 
doesn't do is figure out which version of Ceph you're installing, in 
order to decide whether to use systemd (post-hammer) or sysvinit (hammer 
and earlier).  That's going to be annoying, I'm afraid...

I suspect what we really want to do is abstract out the systemd behavior 
into something that the distros opt in to so that we aren't duplicating 
code across the suse, centos, rhel, and fedora host types...

sage


> 
> Best regards
> 
> Owen
> 
> 
> On 07/28/2015 09:13 PM, Sage Weil wrote:
> > Hey,
> > 
> > I've finally had some time to play with the systemd integration branch on 
> > fedora 22.  It's in wip-systemd and my current list of issues includes:
> > 
> > - after mon creation ceph-create-keys isn't run automagically
> >   - Personally I kind of hate how it was always run on mon startup and not 
> > just during cluster creation so I wouldn't mind *so* much if this became 
> > an explicit step, maybe triggered by ceph-deploy, after mon create.
> > 
> > - udev's attempt to trigger ceph-disk isn't working for me.  the osd 
> > service gets started but the mount isn't present and it fails to start.  
> > I'm a systemd noob and haven't sorted out how to get udev to log something 
> > meaningful to debug it.  Perhaps we should merge in the udev + 
> > systemd revamp patches here too...
> > 
> > - ceph-detect-init is only recently unbroken in master for fedora 22.
> > 
> > - ceph-deploy doesn't know that fedora should be systemd yet.
> > 
> > - ceph-deploy has a wip-systemd branch with a few things so far:
> >   - on mon create, we unconditionally systemctl enable ceph.target.  
> > i think osd create and mds create and rgw create should do the same thing, 
> > since the ceph.target is a catch-all bucket for any ceph service, and i 
> > don't think we want to enable it on install?
> >   - rgw create and mds create don't work yet
> >   - osd create doesn't enable ceph.target
> > 
> > - I'm guessing my ceph.spec changes to install teh systemd unit files 
> > aren't quite right... please review!  The gitbuilder turnaround is so slow 
> > it's hard to iterate and I don't really know what I'm doing here.
> > 
> > Owen, I'd like to get this just a tad bit more functional and then merge 
> > ASAP, then up any issues in the weeks leading up to infernalis.  What say 
> > ye?
> > 
> > sage
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> -- 
> SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB
> 21284 (AG
> Nürnberg)
> 
> Maxfeldstraße 5
> 
> 90409 Nürnberg
> 
> Germany
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fedora 22 systemd and ceph-deploy
  2015-08-03 19:07   ` Sage Weil
@ 2015-08-04 10:13     ` Owen Synge
  2015-08-04 12:32       ` Fedora 22 systemd and rgw Owen Synge
  0 siblings, 1 reply; 23+ messages in thread
From: Owen Synge @ 2015-08-04 10:13 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 08/03/2015 09:07 PM, Sage Weil wrote:
> On Mon, 3 Aug 2015, Owen Synge wrote:
>> Dear all,
>>
>> My plan is to make a fedora22-systemd branch. I will leave fedora 20 as
>> sysvinit.
>>
>> Ok just done my first proper install of systemd ceph branch on fedora22.
>>
>> I can confirm most of the issues.
>>
>> I am giving up for the day, but so far applying SUSE/opensuse code to
>> Fedora ceph-deploy code in ceph-deploy has helped a lot.
>>
>>     cp /usr/lib/python2.7/site-packages/ceph_deploy/hosts/suse/mon/* \
>>       /usr/lib/python2.7/site-packages/ceph_deploy/hosts/fedora/mon/
>>
>> (Running on suse version patched release)
>>
>> It can now set up the mon daemons correctly its self.
>>
>> I will look into the udev rules, tomorrow morning, and remove some more
>> fedora hard coding to "sysvinit".

I made a mistake: just doing the copy is enough to deploy mons and OSDs.
I had made a typo that led me to believe OSD deployment was not working.

I will check the rgw.

> There is a wip-systemd branch ceph-deploy that has enough ceph-deploy 
> changes for me to successfully do the deployment of mon, osd, mds, and 
> rgw.  

So can we merge the wip-systemd branch of ceph?

This would then allow us to fix ceph-detect-init.

This would also make it clearer to ceph-deploy developers what they need
to do.

> The main thing it doesn't do is figure out which version of Ceph
> you're installing to decide whether to do systemd (post-hammer) or 
> sysvinit (hammer and earlier).  That's going to be annoying, I'm afraid...

I have not looked at the wip-systemd branch of ceph-deploy, as I knew the
SUSE version supported systemd, and I know there is no SUSE magic in that
code (I wrote it) beyond supporting systemd.

In principle, I agree it would be nice to have ceph-deploy change its
behaviour based on the release version of Ceph. I have great fears that,
under the existing code style, adding this extra dimension will cause an
explosion of branching code in ceph-deploy, which in my opinion is
already far too large in terms of LOC relative to its functionality.

> I suspect what we really want to do is abstract out the systemd behavior 
> into something that the distros opt in to so that we aren't duplicating 
> code across the suse, centos, rhel, and fedora host types...

Adding command-line options to override "default" behaviour (or having a
model, as in the MVC design pattern) might be a good way to start moving
ceph-deploy toward this objective.

My opinion is that better use of the façade pattern would contain the
explosion of conditionals and duplication that this would otherwise cause
with the current coding style, and that moving toward MVC will also help.

These two patches go a long way toward solving the issue of duplicated
code across the suse, centos, rhel, and fedora host types:

https://github.com/ceph/ceph-deploy/pull/317
https://github.com/ceph/ceph-deploy/pull/318

Pull request #317 is how I would like to see the new code base evolve.
Pull request #318 stays within the current ceph-deploy code structure and
shows variables being derived in a per-distribution way.

Combined, these two patches show how I would move forward on ceph-deploy.

This sort of façade pattern avoids the kind of code duplication that is
already present all over the ceph-deploy code base.

For now I suspect it is far easier to backport the option to use systemd
to the supported releases, hammer and earlier.

Delaying progress of Ceph because of ceph-deploy is, in my opinion, the
wrong thing to do, and I get the impression ceph-deploy is merely being
maintained rather than pushing Ceph or itself forward.

Best regards

Owen

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fedora 22 systemd and rgw
  2015-08-04 10:13     ` Owen Synge
@ 2015-08-04 12:32       ` Owen Synge
  2015-08-04 13:07         ` Sage Weil
  0 siblings, 1 reply; 23+ messages in thread
From: Owen Synge @ 2015-08-04 12:32 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 08/04/2015 12:13 PM, Owen Synge wrote:
> On 08/03/2015 09:07 PM, Sage Weil wrote:
>> On Mon, 3 Aug 2015, Owen Synge wrote:

> I will check the rgw.

It is not working due to missing:

/usr/lib/ceph-radosgw/ceph-radosgw-prestart.sh

which is a useful check tool, available in this commit:

https://github.com/SUSE/ceph/commit/92ef2ecfe0c0c0b0df4c9349310f930057202305

I assume this got lost somewhere earlier; in my opinion it should be
placed in the systemd directory.

With that and some small permission issues resolved, all is working fine.

(A core dump on failing to read the key is a bit ugly.)

Best regards

Owen

PS: For other comments more important than this detail, please see
earlier emails in this thread.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fedora 22 systemd and rgw
  2015-08-04 12:32       ` Fedora 22 systemd and rgw Owen Synge
@ 2015-08-04 13:07         ` Sage Weil
  2015-08-04 16:10           ` Owen Synge
  0 siblings, 1 reply; 23+ messages in thread
From: Sage Weil @ 2015-08-04 13:07 UTC (permalink / raw)
  To: Owen Synge; +Cc: ceph-devel

On Tue, 4 Aug 2015, Owen Synge wrote:
> On 08/04/2015 12:13 PM, Owen Synge wrote:
> > On 08/03/2015 09:07 PM, Sage Weil wrote:
> >> On Mon, 3 Aug 2015, Owen Synge wrote:
> 
> > I will check the rgw.
> 
> It is not working due to missing:
> 
> /usr/lib/ceph-radosgw/ceph-radosgw-prestart.sh
> 
> which is a useful check tool, available in this commit:
> 
> https://github.com/SUSE/ceph/commit/92ef2ecfe0c0c0b0df4c9349310f930057202305

I removed this reference from the unit file, see

	https://github.com/ceph/ceph/commit/4d10dc134b817160bab6aecb9f5c08fb2d4f08e6

mainly because I didn't have a copy of the prestart script in my tree.  
Looking at it now, though, it's not clear to me that any of those steps 
are necessary.  They might be useful in making a legacy install/config 
continue functioning, but I don't think any of the complexity is needed 
for newly created rgw instances, and I'd prefer to make the upgrade 
process look like

 - upgrade ceph
 - killall radosgw
 - [optional] rename and sanitize ceph.conf section if there are special 
configs
 - ceph-deploy rgw create $hostname

than worry about trying to keep supporting the kludgey way we used 
to deploy rgw.
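In shell terms that upgrade flow is roughly the following (the package
command and any old ceph.conf section names vary by distro and setup):

    yum upgrade ceph ceph-radosgw        # or the zypper/apt equivalent
    killall radosgw
    # optionally rename/clean the old radosgw section in ceph.conf here
    ceph-deploy rgw create $HOSTNAME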

FWIW with the wip-systemd branch I was able to deploy new rgw instances 
on fc22 without any issues...

sage

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fedora 22 systemd and rgw
  2015-08-04 13:07         ` Sage Weil
@ 2015-08-04 16:10           ` Owen Synge
  0 siblings, 0 replies; 23+ messages in thread
From: Owen Synge @ 2015-08-04 16:10 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 08/04/2015 03:07 PM, Sage Weil wrote:
> On Tue, 4 Aug 2015, Owen Synge wrote:
>> On 08/04/2015 12:13 PM, Owen Synge wrote:
>>> On 08/03/2015 09:07 PM, Sage Weil wrote:
>>>> On Mon, 3 Aug 2015, Owen Synge wrote:
>>
>>> I will check the rgw.
>>
>> It is not working due to missing:
>>
>> /usr/lib/ceph-radosgw/ceph-radosgw-prestart.sh
>>
>> which is a useful check tool, available in this commit:
>>
>> https://github.com/SUSE/ceph/commit/92ef2ecfe0c0c0b0df4c9349310f930057202305
> 
> I removed this reference from the unit file, see
> 
> 	https://github.com/ceph/ceph/commit/4d10dc134b817160bab6aecb9f5c08fb2d4f08e6
> 
> mainly because I didn't have a copy of the prestart script in my tree.  

Ah ok, I lost that during my merges.

> Looking at it now, though, it's not clear to me that any of those steps 
> are necessary.

You are correct, the code will run without the prestart script.

The prestart script is _only_ validating that the daemon should run,
because rgw often fails with errors that are confusing to end users.

In my opinion the prestart checks should mostly be implemented in C++ in
rgw itself.
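
Roughly, the kind of checks it makes look like the sketch below (this is
not the SUSE script itself; see the commit linked above for that, and the
keyring path here is assumed for illustration):

    #!/bin/sh
    # rgw prestart sketch: fail early with a readable message instead of
    # letting radosgw core-dump on a missing section or unreadable keyring
    name="client.radosgw.$1"
    keyring="/var/lib/ceph/radosgw/ceph-$1/keyring"
    if ! grep -q "^\[$name\]" /etc/ceph/ceph.conf; then
        echo "no [$name] section in /etc/ceph/ceph.conf" >&2
        exit 1
    fi
    if [ ! -r "$keyring" ]; then
        echo "keyring $keyring missing or not readable" >&2
        exit 1
    fi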


> They might be useful in making a legacy install/config 
> continue functioning, 

No, it's just validation.

> but I don't think any of the complexity is needed
> for newly created rgw instances, and I'd prefer to make the upgrade 
> process look like
> 
>  - upgrade ceph
>  - killall radosgw
>  - [optional] rename and sanitize ceph.conf section if there are special 
> configs
>  - ceph-deploy rgw create $hostname
> 
> than worry about trying to keep supporting the kludgey way we used 
> to deploy rgw.

Yes, no objections at all.

> FWIW with the wip-systemd branch I was able to deploy new rgw instances 
> on fc22 without any issues...

Great news. I hope the wip-systemd branch of ceph-deploy adds hardly any
branching; the SUSE version adds little, so there should be no
justification to do so, and it could even reduce branching in ceph-deploy.

Best regards

Owen

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2015-08-04 16:11 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-28 19:13 systemd status Sage Weil
2015-07-28 22:16 ` Travis Rhoden
2015-07-29 11:25   ` Alex Elsayed
2015-07-29 12:55     ` Sage Weil
2015-07-29 13:09       ` Wyllys Ingersoll
2015-07-29 14:08       ` Alex Elsayed
2015-07-29 14:19         ` Sage Weil
2015-07-29 14:30           ` Alex Elsayed
2015-07-29 14:53             ` Sage Weil
2015-07-29 15:17               ` Alex Elsayed
2015-07-29 16:50                 ` Vasiliy Angapov
2015-07-31 13:29                   ` Owen Synge
2015-07-30 12:45                 ` Sage Weil
2015-07-30 19:40                   ` Robert LeBlanc
2015-08-03  8:53         ` Owen Synge
2015-07-31  8:11 ` Owen Synge
2015-07-31 13:23 ` Owen Synge
2015-08-03 19:01 ` Fedora 22 systemd and ceph-deploy Owen Synge
2015-08-03 19:07   ` Sage Weil
2015-08-04 10:13     ` Owen Synge
2015-08-04 12:32       ` Fedora 22 systemd and rgw Owen Synge
2015-08-04 13:07         ` Sage Weil
2015-08-04 16:10           ` Owen Synge
