* ceph status reporting non-existing osd
@ 2012-07-13  8:17 Andrey Korolyov
  2012-07-13 17:03 ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-13  8:17 UTC (permalink / raw)
  To: ceph-devel

Hi,

I recently reduced my test cluster from 6 to 4 osds (at ~60% usage, on
a six-node setup), and I removed a bunch of rbd objects during recovery
to avoid overfilling. Right now I am constantly receiving a warning
about a nearfull state on a non-existent osd:

   health HEALTH_WARN 1 near full osd(s)
   monmap e3: 3 mons at
{0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
election epoch 240, quorum 0,1,2 0,1,2
   osdmap e2098: 4 osds: 4 up, 4 in
    pgmap v518696: 464 pgs: 464 active+clean; 61070 MB data, 181 GB
used, 143 GB / 324 GB avail
   mdsmap e181: 1/1/1 up {0=a=up:active}

HEALTH_WARN 1 near full osd(s)
osd.4 is near full at 89%

Needless to say, osd.4 remains only in ceph.conf; it is not in the
crushmap. The reduction was done online, i.e. without restarting the
entire cluster.


* Re: ceph status reporting non-existing osd
  2012-07-13  8:17 ceph status reporting non-existing osd Andrey Korolyov
@ 2012-07-13 17:03 ` Gregory Farnum
  2012-07-13 17:09   ` Sage Weil
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-13 17:03 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: ceph-devel

On Fri, Jul 13, 2012 at 1:17 AM, Andrey Korolyov <andrey@xdel.ru> wrote:
> Right now I`m constantly receiving a warn about nearfull state on
> non-existing osd:
>
> HEALTH_WARN 1 near full osd(s)
> osd.4 is near full at 89%
>
> Needless to say, osd.4 remains only in ceph.conf, but not at crushmap.
> Reducing has been done 'on-line', e.g. without restart entire cluster.

Whoops! It looks like Sage has written some patches to fix this, but
for now you should be good if you just update your ratios to a larger
number, and then bring them back down again. :)
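
For example, something along these lines (the values are only
illustrative; pick thresholds that make sense for your disks):

$ ceph pg set_full_ratio 0.98        # temporarily raise the thresholds
$ ceph pg set_nearfull_ratio 0.95
$ ceph pg set_full_ratio 0.95        # then bring them back down
$ ceph pg set_nearfull_ratio 0.85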
-Greg


* Re: ceph status reporting non-existing osd
  2012-07-13 17:03 ` Gregory Farnum
@ 2012-07-13 17:09   ` Sage Weil
  2012-07-14 14:20     ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Sage Weil @ 2012-07-13 17:09 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Andrey Korolyov, ceph-devel

On Fri, 13 Jul 2012, Gregory Farnum wrote:
> On Fri, Jul 13, 2012 at 1:17 AM, Andrey Korolyov <andrey@xdel.ru> wrote:
> > HEALTH_WARN 1 near full osd(s)
> > osd.4 is near full at 89%
> >
> > Needless to say, osd.4 remains only in ceph.conf, but not at crushmap.
> 
> Whoops! It looks like Sage has written some patches to fix this, but
> for now you should be good if you just update your ratios to a larger
> number, and then bring them back down again. :)

Restarting ceph-mon should also do the trick.
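(On a stock sysvinit install that is typically something like
"/etc/init.d/ceph restart mon.0" on each monitor host; the exact daemon
id and init mechanism depend on your setup.)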

Thanks for the bug report!
sage


* Re: ceph status reporting non-existing osd
  2012-07-13 17:09   ` Sage Weil
@ 2012-07-14 14:20     ` Andrey Korolyov
  2012-07-16 16:12       ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-14 14:20 UTC (permalink / raw)
  To: Sage Weil; +Cc: Gregory Farnum, ceph-devel

On Fri, Jul 13, 2012 at 9:09 PM, Sage Weil <sage@inktank.com> wrote:
> On Fri, 13 Jul 2012, Gregory Farnum wrote:
>> Whoops! It looks like Sage has written some patches to fix this, but
>> for now you should be good if you just update your ratios to a larger
>> number, and then bring them back down again. :)
>
> Restarting ceph-mon should also do the trick.
>
> Thanks for the bug report!
> sage

Should I restart the mons simultaneously? Restarting them one by one
has no effect, and neither did filling the data pool up to ~95 percent
(btw, when I deleted that 50Gb file on cephfs, the mds got stuck
permanently and the reported usage stayed the same until I dropped and
recreated the data pool - hopefully it's one of the known posix-layer
bugs). I also deleted the osd.4 entry from the config and then
restarted the mons, with no effect. Any suggestions?


* Re: ceph status reporting non-existing osd
  2012-07-14 14:20     ` Andrey Korolyov
@ 2012-07-16 16:12       ` Gregory Farnum
  2012-07-16 18:42         ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-16 16:12 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

On Saturday, July 14, 2012 at 7:20 AM, Andrey Korolyov wrote:
> Should I restart mons simultaneously?
I don't think restarting will actually do the trick for you — you actually will need to set the ratios again.
  
> Restarting one by one has no
> effect, same as filling up data pool up to ~95 percent(btw, when I
> deleted this 50Gb file on cephfs, mds was stuck permanently and usage
> remained same until I dropped and recreated data pool - hope it`s one
> of known posix layer bugs). I also deleted entry from config, and then
> restarted mons, with no effect. Any suggestions?

I'm not sure what you're asking about here?  
-Greg


* Re: ceph status reporting non-existing osd
  2012-07-16 16:12       ` Gregory Farnum
@ 2012-07-16 18:42         ` Andrey Korolyov
  2012-07-16 18:48           ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-16 18:42 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Mon, Jul 16, 2012 at 8:12 PM, Gregory Farnum <greg@inktank.com> wrote:
>> Should I restart mons simultaneously?
> I don't think restarting will actually do the trick for you — you actually will need to set the ratios again.
>
> I'm not sure what you're asking about here?
> -Greg
>

Oh, sorry, I misread and thought that you were suggesting filling up
the osds. How can I set the full/nearfull ratios correctly?

$ ceph injectargs '--mon_osd_full_ratio 96'
parsed options
$ ceph injectargs '--mon_osd_near_full_ratio 94'
parsed options

$ ceph pg dump | grep 'full'
full_ratio 0.95
nearfull_ratio 0.85

Setting the parameters in ceph.conf and then restarting the mons does
not affect the ratios either.

* Re: ceph status reporting non-existing osd
  2012-07-16 18:42         ` Andrey Korolyov
@ 2012-07-16 18:48           ` Gregory Farnum
  2012-07-16 18:55             ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-16 18:48 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

"ceph pg set_full_ratio 0.95"  
"ceph pg set_nearfull_ratio 0.94"

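(You can check that the new values took effect with the same
"ceph pg dump | grep full" check as before.)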

On Monday, July 16, 2012 at 11:42 AM, Andrey Korolyov wrote:

> Oh, sorry, I have mislooked and thought that you suggested filling up
> osds. How do I can set full/nearfull ratios correctly?
>  
> $ceph injectargs '--mon_osd_full_ratio 96'
> parsed options
> $ ceph injectargs '--mon_osd_near_full_ratio 94'
> parsed options
>  
> ceph pg dump | grep 'full'
> full_ratio 0.95
> nearfull_ratio 0.85
>  
> Setting parameters in the ceph.conf and then restarting mons does not
> affect ratios either.




* Re: ceph status reporting non-existing osd
  2012-07-16 18:48           ` Gregory Farnum
@ 2012-07-16 18:55             ` Andrey Korolyov
  2012-07-18  6:09               ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-16 18:55 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Mon, Jul 16, 2012 at 10:48 PM, Gregory Farnum <greg@inktank.com> wrote:
> "ceph pg set_full_ratio 0.95"
> "ceph pg set_nearfull_ratio 0.94"
>

Thanks, that worked, but setting the values back brings the warning back.

* Re: ceph status reporting non-existing osd
  2012-07-16 18:55             ` Andrey Korolyov
@ 2012-07-18  6:09               ` Gregory Farnum
  2012-07-18  6:22                 ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-18  6:09 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

On Monday, July 16, 2012 at 11:55 AM, Andrey Korolyov wrote:
> Thanks, it worked, but setting values back result to turn warning back.  
Hrm. That shouldn't be possible if the OSD has been removed. How did you take it out? It sounds like maybe you just marked it in the OUT state (and turned it off quite quickly) without actually taking it out of the cluster?  
-Greg


* Re: ceph status reporting non-existing osd
  2012-07-18  6:09               ` Gregory Farnum
@ 2012-07-18  6:22                 ` Andrey Korolyov
  2012-07-18  7:18                   ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-18  6:22 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Wed, Jul 18, 2012 at 10:09 AM, Gregory Farnum <greg@inktank.com> wrote:
> Hrm. That shouldn't be possible if the OSD has been removed. How did you take it out? It sounds like maybe you just marked it in the OUT state (and turned it off quite quickly) without actually taking it out of the cluster?
> -Greg
>

The way I did the removal, it was definitely not like that - first I
marked the osds (4 and 5, on the same host) out, then rebuilt the
crushmap, and then killed the osd processes. As I mentioned before,
osd.4 does not exist in the crushmap and therefore it shouldn't be
reported at all (theoretically).

* Re: ceph status reporting non-existing osd
  2012-07-18  6:22                 ` Andrey Korolyov
@ 2012-07-18  7:18                   ` Gregory Farnum
  2012-07-18  7:47                     ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-18  7:18 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

On Tuesday, July 17, 2012 at 11:22 PM, Andrey Korolyov wrote:
> As I have did removal, it was definitely not like that - at first
> place, I have marked osds(4 and 5 on same host) out, then rebuilt
> crushmap and then kill osd processes. As I mentioned before, osd.4
> doest not exist in crushmap and therefore it shouldn`t be reported at
> all(theoretically).

Okay, that's what happened — marking an OSD out in the CRUSH map means all the data gets moved off it, but that doesn't remove it from all the places where it's registered in the monitor and in the map, for a couple reasons:  
1) You might want to mark an OSD out before taking it down, to allow for more orderly data movement.
2) OSDs can get marked out automatically, but the system shouldn't be able to forget about them on its own.
3) You might want to remove an OSD from the CRUSH map in the process of placing it somewhere else (perhaps you moved the physical machine to a new location).
etc.

You want to run "ceph osd rm 4 5" and that should unregister both of them from everything[1]. :)
-Greg
[1]: Except for the full lists, which have a bug in the version of code you're running — remove the OSDs, then adjust the full ratios again, and all will be well.
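
For reference, the usual removal sequence looks roughly like this
(osd.4 as the example; the stop command and the auth step are
assumptions about how the daemons and cephx are set up locally):

$ ceph osd out 4                 # stop placing data on it (already done in your case)
$ /etc/init.d/ceph stop osd.4    # stop the daemon on its host
$ ceph osd crush remove osd.4    # drop it from the crushmap
$ ceph auth del osd.4            # remove its key, if cephx is in use
$ ceph osd rm 4                  # remove it from the osdmap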


* Re: ceph status reporting non-existing osd
  2012-07-18  7:18                   ` Gregory Farnum
@ 2012-07-18  7:47                     ` Andrey Korolyov
  2012-07-18 18:30                       ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-18  7:47 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Wed, Jul 18, 2012 at 11:18 AM, Gregory Farnum <greg@inktank.com> wrote:
> On Tuesday, July 17, 2012 at 11:22 PM, Andrey Korolyov wrote:
>> On Wed, Jul 18, 2012 at 10:09 AM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote:
>> > On Monday, July 16, 2012 at 11:55 AM, Andrey Korolyov wrote:
>> > > On Mon, Jul 16, 2012 at 10:48 PM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote:
>> > > > "ceph pg set_full_ratio 0.95"
>> > > > "ceph pg set_nearfull_ratio 0.94"
>> > > >
>> > > >
>> > > > On Monday, July 16, 2012 at 11:42 AM, Andrey Korolyov wrote:
>> > > >
>> > > > > On Mon, Jul 16, 2012 at 8:12 PM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote:
>> > > > > > On Saturday, July 14, 2012 at 7:20 AM, Andrey Korolyov wrote:
>> > > > > > > On Fri, Jul 13, 2012 at 9:09 PM, Sage Weil <sage@inktank.com (mailto:sage@inktank.com)> wrote:
>> > > > > > > > On Fri, 13 Jul 2012, Gregory Farnum wrote:
>> > > > > > > > > On Fri, Jul 13, 2012 at 1:17 AM, Andrey Korolyov <andrey@xdel.ru (mailto:andrey@xdel.ru)> wrote:
>> > > > > > > > > > Hi,
>> > > > > > > > > >
>> > > > > > > > > > Recently I`ve reduced my test suite from 6 to 4 osds at ~60% usage on
>> > > > > > > > > > six-node,
>> > > > > > > > > > and I have removed a bunch of rbd objects during recovery to avoid
>> > > > > > > > > > overfill.
>> > > > > > > > > > Right now I`m constantly receiving a warn about nearfull state on
>> > > > > > > > > > non-existing osd:
>> > > > > > > > > >
>> > > > > > > > > > health HEALTH_WARN 1 near full osd(s)
>> > > > > > > > > > monmap e3: 3 mons at
>> > > > > > > > > > {0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
>> > > > > > > > > > election epoch 240, quorum 0,1,2 0,1,2
>> > > > > > > > > > osdmap e2098: 4 osds: 4 up, 4 in
>> > > > > > > > > > pgmap v518696: 464 pgs: 464 active+clean; 61070 MB data, 181 GB
>> > > > > > > > > > used, 143 GB / 324 GB avail
>> > > > > > > > > > mdsmap e181: 1/1/1 up {0=a=up:active}
>> > > > > > > > > >
>> > > > > > > > > > HEALTH_WARN 1 near full osd(s)
>> > > > > > > > > > osd.4 is near full at 89%
>> > > > > > > > > >
>> > > > > > > > > > Needless to say, osd.4 remains only in ceph.conf, but not at crushmap.
>> > > > > > > > > > Reducing has been done 'on-line', e.g. without restart entire cluster.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Whoops! It looks like Sage has written some patches to fix this, but
>> > > > > > > > > for now you should be good if you just update your ratios to a larger
>> > > > > > > > > number, and then bring them back down again. :)
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Restarting ceph-mon should also do the trick.
>> > > > > > > >
>> > > > > > > > Thanks for the bug report!
>> > > > > > > > sage
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > Should I restart mons simultaneously?
>> > > > > > I don't think restarting will actually do the trick for you — you actually will need to set the ratios again.
>> > > > > >
>> > > > > > > Restarting one by one has no
>> > > > > > > effect, same as filling up data pool up to ~95 percent(btw, when I
>> > > > > > > deleted this 50Gb file on cephfs, mds was stuck permanently and usage
>> > > > > > > remained same until I dropped and recreated data pool - hope it`s one
>> > > > > > > of known posix layer bugs). I also deleted entry from config, and then
>> > > > > > > restarted mons, with no effect. Any suggestions?
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > I'm not sure what you're asking about here?
>> > > > > > -Greg
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > Oh, sorry, I have mislooked and thought that you suggested filling up
>> > > > > osds. How do I can set full/nearfull ratios correctly?
>> > > > >
>> > > > > $ceph injectargs '--mon_osd_full_ratio 96'
>> > > > > parsed options
>> > > > > $ ceph injectargs '--mon_osd_near_full_ratio 94'
>> > > > > parsed options
>> > > > >
>> > > > > ceph pg dump | grep 'full'
>> > > > > full_ratio 0.95
>> > > > > nearfull_ratio 0.85
>> > > > >
>> > > > > Setting parameters in the ceph.conf and then restarting mons does not
>> > > > > affect ratios either.
>> > > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > Thanks, it worked, but setting values back result to turn warning back.
>> > Hrm. That shouldn't be possible if the OSD has been removed. How did you take it out? It sounds like maybe you just marked it in the OUT state (and turned it off quite quickly) without actually taking it out of the cluster?
>> > -Greg
>>
>>
>>
>> As I have did removal, it was definitely not like that - at first
>> place, I have marked osds(4 and 5 on same host) out, then rebuilt
>> crushmap and then kill osd processes. As I mentioned before, osd.4
>> doest not exist in crushmap and therefore it shouldn`t be reported at
>> all(theoretically).
>
> Okay, that's what happened — marking an OSD out in the CRUSH map means all the data gets moved off it, but that doesn't remove it from all the places where it's registered in the monitor and in the map, for a couple reasons:
> 1) You might want to mark an OSD out before taking it down, to allow for more orderly data movement.
> 2) OSDs can get marked out automatically, but the system shouldn't be able to forget about them on its own.
> 3) You might want to remove an OSD from the CRUSH map in the process of placing it somewhere else (perhaps you moved the physical machine to a new location).
> etc.
>
> You want to run "ceph osd rm 4 5" and that should unregister both of them from everything[1]. :)
> -Greg
> [1]: Except for the full lists, which have a bug in the version of code you're running — remove the OSDs, then adjust the full ratios again, and all will be well.
>

$ ceph osd rm 4
osd.4 does not exist
$ ceph -s
   health HEALTH_WARN 1 near full osd(s)
   monmap e3: 3 mons at
{0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
election epoch 58, quorum 0,1,2 0,1,2
   osdmap e2198: 4 osds: 4 up, 4 in
    pgmap v586056: 464 pgs: 464 active+clean; 66645 MB data, 231 GB
used, 95877 MB / 324 GB avail
   mdsmap e207: 1/1/1 up {0=a=up:active}

$ ceph health detail
HEALTH_WARN 1 near full osd(s)
osd.4 is near full at 89%

$ ceph osd dump
....
max_osd 4
osd.0 up   in  weight 1 up_from 2183 up_thru 2187 down_at 2172
last_clean_interval [2136,2171) 192.168.10.128:6800/4030
192.168.10.128:6801/4030 192.168.10.128:6802/4030 exists,up
68b3deec-e80a-48b7-9c29-1b98f5de4f62
osd.1 up   in  weight 1 up_from 2136 up_thru 2186 down_at 2135
last_clean_interval [2115,2134) 192.168.10.129:6800/2980
192.168.10.129:6801/2980 192.168.10.129:6802/2980 exists,up
b2a26fe9-aaa8-445f-be1f-fa7d2a283b57
osd.2 up   in  weight 1 up_from 2181 up_thru 2187 down_at 2172
last_clean_interval [2136,2171) 192.168.10.128:6803/4128
192.168.10.128:6804/4128 192.168.10.128:6805/4128 exists,up
378d367a-f7fb-4892-9ec9-db8ffdd2eb20
osd.3 up   in  weight 1 up_from 2136 up_thru 2186 down_at 2135
last_clean_interval [2115,2134) 192.168.10.129:6803/3069
192.168.10.129:6804/3069 192.168.10.129:6805/3069 exists,up
faf8eda8-55fc-4a0e-899f-47dbd32b81b8
....

* Re: ceph status reporting non-existing osd
  2012-07-18  7:47                     ` Andrey Korolyov
@ 2012-07-18 18:30                       ` Gregory Farnum
  2012-07-18 19:07                         ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-18 18:30 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

On Wed, Jul 18, 2012 at 12:47 AM, Andrey Korolyov <andrey@xdel.ru> wrote:
> $ ceph osd rm 4
> osd.4 does not exist
>
> $ ceph health detail
> HEALTH_WARN 1 near full osd(s)
> osd.4 is near full at 89%

Hrm. How did you create your new crush map? All the normal avenues of
removing an OSD from the map set a flag which the PGMap uses to delete
its records (which would prevent it reappearing in the full list), and
I can't see how setcrushmap would remove an OSD from the map (although
there might be a code path I haven't found).

* Re: ceph status reporting non-existing osd
  2012-07-18 18:30                       ` Gregory Farnum
@ 2012-07-18 19:07                         ` Andrey Korolyov
  2012-07-18 21:28                           ` Gregory Farnum
  0 siblings, 1 reply; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-18 19:07 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Wed, Jul 18, 2012 at 10:30 PM, Gregory Farnum <greg@inktank.com> wrote:
> Hrm. How did you create your new crush map? All the normal avenues of
> removing an OSD from the map set a flag which the PGMap uses to delete
> its records (which would prevent it reappearing in the full list), and
> I can't see how setcrushmap would remove an OSD from the map (although
> there might be a code path I haven't found).

Manually, by deleting the osd.4 and osd.5 entries and reweighting the
remaining nodes.

* Re: ceph status reporting non-existing osd
  2012-07-18 19:07                         ` Andrey Korolyov
@ 2012-07-18 21:28                           ` Gregory Farnum
  2012-07-19  6:17                             ` Andrey Korolyov
  0 siblings, 1 reply; 16+ messages in thread
From: Gregory Farnum @ 2012-07-18 21:28 UTC (permalink / raw)
  To: Andrey Korolyov; +Cc: Sage Weil, ceph-devel

On Wed, Jul 18, 2012 at 12:07 PM, Andrey Korolyov <andrey@xdel.ru> wrote:
> Manually, by deleting the osd.4 and osd.5 entries and reweighting the remaining nodes.

So you extracted the CRUSH map, edited it, and injected it using "ceph
osd setcrushmap"?
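
Just to be sure we're describing the same procedure, I mean roughly the
usual round-trip below (the /tmp paths are only placeholders):

$ ceph osd getcrushmap -o /tmp/crush.bin
$ crushtool -d /tmp/crush.bin -o /tmp/crush.txt
  (edit /tmp/crush.txt: drop the osd.4/osd.5 device and bucket entries,
   adjust the remaining weights)
$ crushtool -c /tmp/crush.txt -o /tmp/crush.new
$ ceph osd setcrushmap -i /tmp/crush.new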

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: ceph status reporting non-existing osd
  2012-07-18 21:28                           ` Gregory Farnum
@ 2012-07-19  6:17                             ` Andrey Korolyov
  0 siblings, 0 replies; 16+ messages in thread
From: Andrey Korolyov @ 2012-07-19  6:17 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Thu, Jul 19, 2012 at 1:28 AM, Gregory Farnum <greg@inktank.com> wrote:
> On Wed, Jul 18, 2012 at 12:07 PM, Andrey Korolyov <andrey@xdel.ru> wrote:
>> On Wed, Jul 18, 2012 at 10:30 PM, Gregory Farnum <greg@inktank.com> wrote:
>>> On Wed, Jul 18, 2012 at 12:47 AM, Andrey Korolyov <andrey@xdel.ru> wrote:
>>>> On Wed, Jul 18, 2012 at 11:18 AM, Gregory Farnum <greg@inktank.com> wrote:
>>>>> On Tuesday, July 17, 2012 at 11:22 PM, Andrey Korolyov wrote:
>>>>>> On Wed, Jul 18, 2012 at 10:09 AM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote:
>>>>>> > Hrm. That shouldn't be possible if the OSD has been removed. How did you take it out? It sounds like maybe you just marked it in the OUT state (and turned it off quite quickly) without actually taking it out of the cluster?
>>>>>> > -Greg
>>>>>>
>>>>>>
>>>>>>
>>>>>> The way I did the removal, it was definitely not like that - in the
>>>>>> first place I marked the osds (4 and 5, on the same host) out, then
>>>>>> rebuilt the crushmap and then killed the osd processes. As I mentioned
>>>>>> before, osd.4 does not exist in the crushmap and therefore it
>>>>>> shouldn't be reported at all (theoretically).
>>>>>
>>>>> Okay, that's what happened — marking an OSD out in the CRUSH map means all the data gets moved off it, but that doesn't remove it from all the places where it's registered in the monitor and in the map, for a couple reasons:
>>>>> 1) You might want to mark an OSD out before taking it down, to allow for more orderly data movement.
>>>>> 2) OSDs can get marked out automatically, but the system shouldn't be able to forget about them on its own.
>>>>> 3) You might want to remove an OSD from the CRUSH map in the process of placing it somewhere else (perhaps you moved the physical machine to a new location).
>>>>> etc.
>>>>>
>>>>> You want to run "ceph osd rm 4 5" and that should unregister both of them from everything[1]. :)
>>>>> -Greg
>>>>> [1]: Except for the full lists, which have a bug in the version of code you're running — remove the OSDs, then adjust the full ratios again, and all will be well.
>>>>>
>>>>
>>>> $ ceph osd rm 4
>>>> osd.4 does not exist
>>>> $ ceph -s
>>>>    health HEALTH_WARN 1 near full osd(s)
>>>>    monmap e3: 3 mons at
>>>> {0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
>>>> election epoch 58, quorum 0,1,2 0,1,2
>>>>    osdmap e2198: 4 osds: 4 up, 4 in
>>>>     pgmap v586056: 464 pgs: 464 active+clean; 66645 MB data, 231 GB
>>>> used, 95877 MB / 324 GB avail
>>>>    mdsmap e207: 1/1/1 up {0=a=up:active}
>>>>
>>>> $ ceph health detail
>>>> HEALTH_WARN 1 near full osd(s)
>>>> osd.4 is near full at 89%
>>>>
>>>> $ ceph osd dump
>>>> ....
>>>> max_osd 4
>>>> osd.0 up   in  weight 1 up_from 2183 up_thru 2187 down_at 2172
>>>> last_clean_interval [2136,2171) 192.168.10.128:6800/4030
>>>> 192.168.10.128:6801/4030 192.168.10.128:6802/4030 exists,up
>>>> 68b3deec-e80a-48b7-9c29-1b98f5de4f62
>>>> osd.1 up   in  weight 1 up_from 2136 up_thru 2186 down_at 2135
>>>> last_clean_interval [2115,2134) 192.168.10.129:6800/2980
>>>> 192.168.10.129:6801/2980 192.168.10.129:6802/2980 exists,up
>>>> b2a26fe9-aaa8-445f-be1f-fa7d2a283b57
>>>> osd.2 up   in  weight 1 up_from 2181 up_thru 2187 down_at 2172
>>>> last_clean_interval [2136,2171) 192.168.10.128:6803/4128
>>>> 192.168.10.128:6804/4128 192.168.10.128:6805/4128 exists,up
>>>> 378d367a-f7fb-4892-9ec9-db8ffdd2eb20
>>>> osd.3 up   in  weight 1 up_from 2136 up_thru 2186 down_at 2135
>>>> last_clean_interval [2115,2134) 192.168.10.129:6803/3069
>>>> 192.168.10.129:6804/3069 192.168.10.129:6805/3069 exists,up
>>>> faf8eda8-55fc-4a0e-899f-47dbd32b81b8
>>>> ....
>>>
>>> Hrm. How did you create your new crush map? All the normal avenues of
>>> removing an OSD from the map set a flag which the PGMap uses to delete
>>> its records (which would prevent it reappearing in the full list), and
>>> I can't see how setcrushmap would remove an OSD from the map (although
>>> there might be a code path I haven't found).
>>
>> Manually, by deleting the osd.4 and osd.5 entries and reweighting the remaining nodes.
>
> So you extracted the CRUSH map, edited it, and injected it using "ceph
> osd setcrushmap"?

Yep, exactly.
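
For the record, next time I`ll go through the normal removal path
instead of hand-editing the map, so the monitor bookkeeping gets cleaned
up as well. If I read your explanation right, that would be something
like the sketch below for each of the two osds (exact subcommand names
and the ratio command may differ per version):

$ ceph osd out 4
  (stop the osd.4 daemon on its host)
$ ceph osd crush remove osd.4
$ ceph osd rm 4
  (then bump the near-full ratio up and back down, e.g.
   ceph pg set_nearfull_ratio 0.87 and then 0.85 again, so the stale
   "osd.4 is near full" entry gets dropped)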

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-07-19  6:17 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-13  8:17 ceph status reporting non-existing osd Andrey Korolyov
2012-07-13 17:03 ` Gregory Farnum
2012-07-13 17:09   ` Sage Weil
2012-07-14 14:20     ` Andrey Korolyov
2012-07-16 16:12       ` Gregory Farnum
2012-07-16 18:42         ` Andrey Korolyov
2012-07-16 18:48           ` Gregory Farnum
2012-07-16 18:55             ` Andrey Korolyov
2012-07-18  6:09               ` Gregory Farnum
2012-07-18  6:22                 ` Andrey Korolyov
2012-07-18  7:18                   ` Gregory Farnum
2012-07-18  7:47                     ` Andrey Korolyov
2012-07-18 18:30                       ` Gregory Farnum
2012-07-18 19:07                         ` Andrey Korolyov
2012-07-18 21:28                           ` Gregory Farnum
2012-07-19  6:17                             ` Andrey Korolyov
