From: Loic Dachary <loic@dachary.org>
To: Travis Rhoden <trhoden@gmail.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: ceph-deploy osd destroy feature
Date: Mon, 05 Jan 2015 18:32:10 +0100	[thread overview]
Message-ID: <54AACA9A.4080205@dachary.org> (raw)
In-Reply-To: <CACkq2mr9yxmiLd028hQmsC4TLqb3FBgww-6O_pML3ubyybFo5g@mail.gmail.com>

Hi Travis,

Just one comment inline, in addition to what Sage wrote.

On 05/01/2015 18:14, Travis Rhoden wrote:
> Hi Loic and Wido,
> 
> Loic - I agree with you that it makes more sense to implement the core
> of the logic in ceph-disk where it can be re-used by other tools (like
> ceph-deploy) or by administrators directly.  There are a lot of
> conventions put in place by ceph-disk such that ceph-disk is the best
> place to undo them as part of clean-up.  I'll pursue this with other
> Ceph devs to see if I can get agreement on the best approach.
> 
> At a high-level, ceph-disk has two commands that I think could have a
> corollary -- prepare, and activate.
> 
> Prepare will format and mkfs a disk/dir as needed to make it usable by Ceph.
> Activate will put the resulting disk/dir into service by allocating an
> OSD ID, creating the cephx key, and marking the init system as needed,
> and finally starting the ceph-osd service.
> 
> It seems like there could be two opposite commands that do the following:
> 
> deactivate:
>  - set "ceph osd out"
>  - stop ceph-osd service if needed
>  - remove OSD from CRUSH map
>  - remove OSD cephx key
>  - deallocate OSD ID
>  - remove 'ready', 'active', and INIT-specific files (to Wido's point)
>  - umount device and remove mount point
> 
> destroy:
>  - zap disk (removes partition table and disk content)
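The proposed deactivate/destroy sequences can be sketched as plain CLI calls. This is an echo-only dry run under assumptions: the OSD id, mount point, and init-marker file names are illustrative, and neither ceph-disk subcommand exists yet.

```shell
#!/bin/sh
# Dry-run sketch of the proposed "deactivate" and "destroy" steps.
# run() only echoes the command; on a real cluster you would execute it.
OSD_ID=12                                   # example OSD id
OSD_DIR=/var/lib/ceph/osd/ceph-$OSD_ID      # conventional mount point
DEV=/dev/sdb                                # example data disk

run() { printf '+ %s\n' "$*"; }

# deactivate
run ceph osd out "$OSD_ID"
run service ceph stop "osd.$OSD_ID"
run ceph osd crush remove "osd.$OSD_ID"
run ceph auth del "osd.$OSD_ID"
run ceph osd rm "$OSD_ID"                   # deallocates the OSD id
run rm -f "$OSD_DIR/ready" "$OSD_DIR/active" "$OSD_DIR/sysvinit"
run umount "$OSD_DIR"
run rmdir "$OSD_DIR"

# destroy
run ceph-disk zap "$DEV"
```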
> 
> A few questions I have from this, though.  Is this granular enough?
> If all the steps listed above are done in deactivate, is it useful?
> Or are there use cases we need to cover where some of those steps need
> to be done but not all?  Deactivating in this case would be
> permanently removing the disk from the cluster.  If you are just
> moving a disk from one host to another, Ceph already supports that
> with no additional steps other than stop service, move disk, start
> service.

It is useful for test purposes. For instance, the puppet-ceph integration tests can use it to ensure the OSD is removed properly, without needing to know the details.

> Is "destroy" even necessary?  It's really just zap at that point,
> which already exists.  It only seems necessary to me if we add extra
> functionality, like the ability to do a wipe of some kind first.  If
> it is just zap, you could call zap separately or pass --zap as an
> option to deactivate.
>
> 
> And all of this would need to be able to fail somewhat gracefully, as
> you would often be dealing with dead/failed disks that may not allow
> these commands to run successfully.  That's why I'm wondering if it
> would be best to break the steps currently in "deactivate" into two
> commands -- (1) deactivate: which would deal with commands specific to
> the disk (osd out, stop service, remove marker files, umount) and (2)
> remove: which would undefine the OSD within the cluster (remove from
> CRUSH, remove cephx key, deallocate OSD ID).
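That two-command split could look roughly like the following. Another echo-only sketch; the function names, paths, and the `|| true` best-effort handling are assumptions about how graceful failure might be modeled, not an existing interface.

```shell
#!/bin/sh
# Sketch of splitting the work: "deactivate" covers disk-local steps, each
# best-effort because a dead disk may make them fail; "remove" covers
# cluster-side steps, which still work when the disk itself is gone.
OSD_ID=12
OSD_DIR=/var/lib/ceph/osd/ceph-$OSD_ID

run() { printf '+ %s\n' "$*"; }

deactivate() {
    run ceph osd out "$OSD_ID" || true
    run service ceph stop "osd.$OSD_ID" || true
    run rm -f "$OSD_DIR/ready" "$OSD_DIR/active" || true
    run umount "$OSD_DIR" || true
}

remove() {
    run ceph osd crush remove "osd.$OSD_ID"
    run ceph auth del "osd.$OSD_ID"
    run ceph osd rm "$OSD_ID"
}

deactivate
remove
```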
> 
> I'm mostly talking out loud here.  Looking for more ideas, input.  :)
> 
>  - Travis
> 
> 
> On Sun, Jan 4, 2015 at 6:07 AM, Wido den Hollander <wido@42on.com> wrote:
>> On 01/02/2015 10:31 PM, Travis Rhoden wrote:
>>> Hi everyone,
>>>
>>> There has been a long-standing request [1] to implement an OSD
>>> "destroy" capability to ceph-deploy.  A community user has submitted a
>>> pull request implementing this feature [2].  While the code needs a
>>> bit of work (there are a few things to work out before it would be
>>> ready to merge), I want to verify that the approach is sound before
>>> diving into it.
>>>
>>> As it currently stands, the new feature would allow for the following:
>>>
>>> ceph-deploy osd destroy <host> --osd-id <id>
>>>
>>> From that command, ceph-deploy would reach out to the host, do "ceph
>>> osd out", stop the ceph-osd service for the OSD, then finish by doing
>>> "ceph osd crush remove", "ceph auth del", and "ceph osd rm".  Finally,
>>> it would umount the OSD, typically in /var/lib/ceph/osd/...
>>>
>>
>> Prior to the unmount, shouldn't it also clean up the 'ready' file to
>> prevent the OSD from starting after a reboot?
>>
>> Since its key has been removed from the cluster it shouldn't matter
>> that much, but it seems a bit cleaner.
>>
>> It could even be more destructive: if you pass --zap-disk to it, it
>> could also run wipefs or something to clean the whole disk.
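A whole-disk wipe along those lines might look like this echo-only sketch; the device is a placeholder and the exact flags are an assumption to double-check before real use.

```shell
#!/bin/sh
# Echo-only sketch of an extra-destructive --zap-disk path: erase all
# filesystem signatures, then clear the GPT/MBR partition tables.
DEV=/dev/sdb                            # placeholder device

run() { printf '+ %s\n' "$*"; }

run wipefs --all "$DEV"                 # remove filesystem signatures
run sgdisk --zap-all --clear "$DEV"     # wipe partition tables
```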
>>
>>>
>>> Does this high-level approach seem sane?  Anything that is missing
>>> when trying to remove an OSD?
>>>
>>>
>>> There are a few specifics to the current PR that jump out to me as
>>> things to address.  The format of the command is a bit rough, as other
>>> "ceph-deploy osd" commands take a list of [host[:disk[:journal]]] args
>>> to specify a bunch of disks/osds to act on at once.  But this command
>>> only allows one at a time, by virtue of the --osd-id argument.  We
>>> could try to accept [host:disk] and look up the OSD ID from that, or
>>> potentially take [host:ID] as input.
>>>
>>> Additionally, what should be done with the OSD's journal during the
>>> destroy process?  Should it be left untouched?
>>>
>>> Should there be any additional barriers to performing such a
>>> destructive command?  User confirmation?
>>>
>>>
>>>  - Travis
>>>
>>> [1] http://tracker.ceph.com/issues/3480
>>> [2] https://github.com/ceph/ceph-deploy/pull/254
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>> Ceph trainer and consultant
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
> 

-- 
Loïc Dachary, Artisan Logiciel Libre

