* monitor not starting
@ 2012-07-04 11:45 Smart Weblications GmbH - Florian Wiessner
  2012-07-04 16:25 ` Gregory Farnum
  0 siblings, 1 reply; 6+ messages in thread
From: Smart Weblications GmbH - Florian Wiessner @ 2012-07-04 11:45 UTC (permalink / raw)
  To: ceph-devel

Hi List,


Today I upgraded from 0.43 to 0.48, and now I have one monitor that no longer
starts up:

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x52f9c9]
 2: (()+0xeff0) [0x7fb08dd11ff0]
 3: (gsignal()+0x35) [0x7fb08c4f41b5]
 4: (abort()+0x180) [0x7fb08c4f6fc0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fb08cd88dc5]
 6: (()+0xcb166) [0x7fb08cd87166]
 7: (()+0xcb193) [0x7fb08cd87193]
 8: (()+0xcb28e) [0x7fb08cd8728e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940)
[0x55b310]
 10: /usr/bin/ceph-mon() [0x497317]
 11: (Monitor::init()+0xc5a) [0x4857fa]
 12: (main()+0x2789) [0x46ac79]
 13: (__libc_start_main()+0xfd) [0x7fb08c4e0c8d]
 14: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- end dump of recent events ---


How can I find out why it no longer starts up? The OSDs and MDS are running fine.
-- 

Best regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from German landlines; prices from mobile networks may differ

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: monitor not starting
  2012-07-04 11:45 monitor not starting Smart Weblications GmbH - Florian Wiessner
@ 2012-07-04 16:25 ` Gregory Farnum
  2012-07-04 17:02   ` Smart Weblications GmbH - Florian Wiessner
  0 siblings, 1 reply; 6+ messages in thread
From: Gregory Farnum @ 2012-07-04 16:25 UTC (permalink / raw)
  To: f.wiessner; +Cc: ceph-devel



On Wednesday, July 4, 2012 at 4:45 AM, Smart Weblications GmbH - Florian Wiessner wrote:

> How can I find out why it no longer starts up? The OSDs and MDS are running fine.
Is that all the output you get? There should be a line somewhere which says what the assert is, and what line number it's on. :)
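
A minimal sketch of one way to pull that line out of a monitor log, assuming the log is plain text and the assert line has the shape `<file>: <line>: FAILED assert(<expr>)` seen in Ceph logs of this era. The name `find_failed_asserts` is hypothetical, and this is only an illustration, not a Ceph tool.

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as:
#   mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
ASSERT_RE = re.compile(
    r"^(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)\s*$"
)


def find_failed_asserts(lines: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (source file, line number, asserted expression) per failed assert."""
    hits = []
    for raw in lines:
        match = ASSERT_RE.match(raw.strip())
        if match:
            hits.append(
                (match.group("file"), int(match.group("line")), match.group("expr"))
            )
    return hits
```

Passing an open log file to `find_failed_asserts` should list each assert with its source file and line number; where the monitor log lives depends on the local configuration.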

And while you're at it, is the rest of the cluster in fact working? I don't think 0.43 to 0.48 is an upgrade path we tested.

-Greg


* Re: monitor not starting
  2012-07-04 16:25 ` Gregory Farnum
@ 2012-07-04 17:02   ` Smart Weblications GmbH - Florian Wiessner
  2012-07-04 19:05     ` Gregory Farnum
  0 siblings, 1 reply; 6+ messages in thread
From: Smart Weblications GmbH - Florian Wiessner @ 2012-07-04 17:02 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 04.07.2012 18:25, Gregory Farnum wrote:
> Is that all the output you get? There should be a line somewhere which says what the assert is, and what line number it's on. :)


Is this what you are looking for:
2012-07-04 11:20:24.448430 7f423d943780  1 mon.3@-1(probing) e1 init fsid
4553d0f6-1b31-4ba5-9d97-edae55bcaab4
2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function 'bool
Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637
mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x497317]
 2: (Monitor::init()+0xc5a) [0x4857fa]
 3: (main()+0x2789) [0x46ac79]
 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d]
 5: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- begin dump of recent events ---
    -3> 2012-07-04 11:20:24.447613 7f423d943780  1 store(/data/ceph/mon) mount
    -2> 2012-07-04 11:20:24.447722 7f423d943780  0 ceph version
0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8),
process ceph-mon, pid 7436
    -1> 2012-07-04 11:20:24.448430 7f423d943780  1 mon.3@-1(probing) e1 init
fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4
     0> 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function
'bool Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637
mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x497317]
 2: (Monitor::init()+0xc5a) [0x4857fa]
 3: (main()+0x2789) [0x46ac79]
 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d]
 5: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- end dump of recent events ---
2012-07-04 11:20:24.449567 7f423d943780 -1 *** Caught signal (Aborted) **
 in thread 7f423d943780


> 
> And while you're at it, is the rest of the cluster in fact working? I don't think 0.43 to 0.48 is an upgrade path we tested.
> 

Anyway, I removed the mon, did a ceph-mon --mkfs with the 3 mons that were
still working after the upgrade, and got it up and running again.
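
A rough sketch of what such a rebuild could look like. The exact commands that were run are not given in this message. The flags below (`ceph mon getmap -o`, `ceph-mon -i ... --mkfs --monmap ... --keyring ...`) are taken from Ceph's documented manual procedure for adding a monitor and are an assumption for 0.48. The helper name `rebuild_mon_commands` and all paths are hypothetical. The function only builds argument lists and runs nothing.

```python
from typing import List, Optional


def rebuild_mon_commands(
    mon_id: str, monmap_path: str, keyring_path: Optional[str] = None
) -> List[List[str]]:
    """Build (but do not run) commands to re-create one monitor's store.

    Assumes the broken monitor's old data directory has already been
    moved aside and that the remaining monitors still form a quorum.
    """
    # Fetch the current monitor map from the surviving quorum.
    fetch_map = ["ceph", "mon", "getmap", "-o", monmap_path]
    # Initialise a fresh, empty store for this monitor from that map.
    mkfs = ["ceph-mon", "-i", mon_id, "--mkfs", "--monmap", monmap_path]
    if keyring_path is not None:
        # Only relevant when authentication is enabled on the cluster.
        mkfs += ["--keyring", keyring_path]
    return [fetch_map, mkfs]
```

Returning the commands instead of executing them leaves room to review them against the documentation for the installed version first.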

Yes, the cluster is still working after the upgrade. I also upgraded to Linux
3.4.4. It feels like ceph-fuse and the kernel ceph client are a little less
robust than in 0.43...

When I start copying from /ceph to another mount point, it seems that during
the copy, or in general during any operation, /ceph is unusable to other
processes, which then makes the client behave very sluggishly... :(

I can send you the contents of the monitor directory that did not work after
the upgrade, if you want me to.
-- 

Best regards,

Florian Wiessner


* Re: monitor not starting
  2012-07-04 17:02   ` Smart Weblications GmbH - Florian Wiessner
@ 2012-07-04 19:05     ` Gregory Farnum
  2012-07-24 15:14       ` Smart Weblications GmbH - Florian Wiessner
  0 siblings, 1 reply; 6+ messages in thread
From: Gregory Farnum @ 2012-07-04 19:05 UTC (permalink / raw)
  To: f.wiessner; +Cc: ceph-devel

On Wednesday, July 4, 2012 at 10:02 AM, Smart Weblications GmbH - Florian Wiessner wrote:
> On 04.07.2012 18:25, Gregory Farnum wrote:
> > Is that all the output you get? There should be a line somewhere which says what the assert is, and what line number it's on. :)
>
> Is this what you are looking for:
> 2012-07-04 11:20:24.448430 7f423d943780 1 mon.3@-1(probing) e1 init fsid
> 4553d0f6-1b31-4ba5-9d97-edae55bcaab4
> 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function 'bool
> Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637
> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))

Yep, that line. This means the monitor's on-disk state is inconsistent, but I can think of a number of scenarios which could have caused this, depending on how you upgraded your cluster (older monitors didn't mark on-disk whenever they deliberately went inconsistent on a catchup, which I bet is what happened here).
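
To make the failing condition concrete, here is a toy model of the asserted expression `consistent || (slurping == 1)`. It is not Ceph's code. Treating `slurping` as an on-disk marker for "deliberately inconsistent while catching up" is an assumption drawn from the explanation above, and the function name is hypothetical.

```python
def passes_startup_check(consistent: bool, slurping: int) -> bool:
    """Mirror the asserted expression: consistent || (slurping == 1)."""
    return consistent or slurping == 1


# Store is consistent: the check passes regardless of the marker.
print(passes_startup_check(True, 0))
# Store is mid-catchup and the marker was written: the check still passes.
print(passes_startup_check(False, 1))
# Store is mid-catchup but no marker was written (the suspected case for
# stores left behind by older monitors): the assert would fire.
print(passes_startup_check(False, 0))
```

Only the last combination trips the assert, which matches the theory that an older monitor left its store inconsistent without recording that it was doing so.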
  
> ceph version 0.48argonaut-125-g4e774fb
> (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
> 1: /usr/bin/ceph-mon() [0x497317]
> 2: (Monitor::init()+0xc5a) [0x4857fa]
> 3: (main()+0x2789) [0x46ac79]
> 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d]
> 5: /usr/bin/ceph-mon() [0x468309]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>  
> --- begin dump of recent events ---
> -3> 2012-07-04 11:20:24.447613 7f423d943780 1 store(/data/ceph/mon) mount
> -2> 2012-07-04 11:20:24.447722 7f423d943780 0 ceph version
> 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8),
> process ceph-mon, pid 7436
> -1> 2012-07-04 11:20:24.448430 7f423d943780 1 mon.3@-1(probing) e1 init
> fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4
> 0> 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function
> 'bool Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637
> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
>  
> ceph version 0.48argonaut-125-g4e774fb
> (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
> 1: /usr/bin/ceph-mon() [0x497317]
> 2: (Monitor::init()+0xc5a) [0x4857fa]
> 3: (main()+0x2789) [0x46ac79]
> 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d]
> 5: /usr/bin/ceph-mon() [0x468309]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>  
> --- end dump of recent events ---
> 2012-07-04 11:20:24.449567 7f423d943780 -1 *** Caught signal (Aborted) **
> in thread 7f423d943780
>  
>  
> >  
> > And while you're at it, is the rest of the cluster in fact working? I don't think 0.43 to 0.48 is an upgrade path we tested.
>  
> Anyway, I removed the mon, did a ceph-mon --mkfs with the 3 mons that were
> still working after the upgrade, and got it up and running again.
>
> Yes, the cluster is still working after the upgrade. I also upgraded to Linux
> 3.4.4. It feels like ceph-fuse and the kernel ceph client are a little less
> robust than in 0.43...
>
> When I start copying from /ceph to another mount point, it seems that during
> the copy, or in general during any operation, /ceph is unusable to other
> processes, which then makes the client behave very sluggishly... :(

Well, it shouldn't have gotten less stable since we haven't made any big changes there…but you aren't the only one reporting that things seem to be a little bit slower. We're going to have to look at that once people are back in the office after Independence Day.
  
>  
> I can send you the contents of the monitor directory that did not work after
> the upgrade, if you want me to.

No, that won't be necessary. Thanks though!  


* Re: monitor not starting
  2012-07-04 19:05     ` Gregory Farnum
@ 2012-07-24 15:14       ` Smart Weblications GmbH - Florian Wiessner
  2012-07-24 23:55         ` Sage Weil
  0 siblings, 1 reply; 6+ messages in thread
From: Smart Weblications GmbH - Florian Wiessner @ 2012-07-24 15:14 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On 04.07.2012 21:05, Gregory Farnum wrote:

> 
> Yep, that line. This means the monitor's on-disk state is inconsistent, but I can think of a number of scenarios which could have caused this, depending on how you upgraded your cluster (older monitors didn't mark on-disk whenever they deliberately went inconsistent on a catchup, which I bet is what happened here).
>   
>> ceph version 0.48argonaut-125-g4e774fb
>> (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
>> 1: /usr/bin/ceph-mon() [0x497317]
>> 2: (Monitor::init()+0xc5a) [0x4857fa]
>> 3: (main()+0x2789) [0x46ac79]
>> 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d]
>> 5: /usr/bin/ceph-mon() [0x468309]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
>> interpret this.
>>  

> 
> No, that won't be necessary. Thanks though!  

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x52f9c9]
 2: (()+0xeff0) [0x7fe93db6dff0]
 3: (gsignal()+0x35) [0x7fe93c3501b5]
 4: (abort()+0x180) [0x7fe93c352fc0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fe93cbe4dc5]
 6: (()+0xcb166) [0x7fe93cbe3166]
 7: (()+0xcb193) [0x7fe93cbe3193]
 8: (()+0xcb28e) [0x7fe93cbe328e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940)
[0x55b310]
 10: /usr/bin/ceph-mon() [0x497317]
 11: (Monitor::init()+0xc5a) [0x4857fa]
 12: (main()+0x2789) [0x46ac79]
 13: (__libc_start_main()+0xfd) [0x7fe93c33cc8d]
 14: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- end dump of recent events ---
2012-07-24 17:03:22.791401 7fd3045af780  1 mon.1@-1(probing) e1 init fsid
4553d0f6-1b31-4ba5-9d97-edae55bcaab4
2012-07-24 17:03:22.791890 7fd3045af780 -1 mon/Paxos.cc: In function 'bool
Paxos::is_consistent()' thread 7fd3045af780 time 2012-07-24 17:03:22.791528
mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x497317]
 2: (Monitor::init()+0xc5a) [0x4857fa]
 3: (main()+0x2789) [0x46ac79]
 4: (__libc_start_main()+0xfd) [0x7fd302967c8d]
 5: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.


Well, my cluster rebooted again, and now only 1 of 4 monitors is willing to start...

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x497317]
 2: (Monitor::init()+0xc5a) [0x4857fa]
 3: (main()+0x2789) [0x46ac79]
 4: (__libc_start_main()+0xfd) [0x7fd302967c8d]
 5: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- begin dump of recent events ---
    -3> 2012-07-24 17:03:22.729549 7fd3045af780  1 store(/data/ceph/mon) mount
    -2> 2012-07-24 17:03:22.729667 7fd3045af780  0 ceph version
0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8),
process ceph-mon, pid 6962
    -1> 2012-07-24 17:03:22.791401 7fd3045af780  1 mon.1@-1(probing) e1 init
fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4
     0> 2012-07-24 17:03:22.791890 7fd3045af780 -1 mon/Paxos.cc: In function
'bool Paxos::is_consistent()' thread 7fd3045af780 time 2012-07-24 17:03:22.791528
mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))


--- end dump of recent events ---
2012-07-24 17:03:22.792461 7fd3045af780 -1 *** Caught signal (Aborted) **
 in thread 7fd3045af780

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x52f9c9]
 2: (()+0xeff0) [0x7fd304198ff0]
 3: (gsignal()+0x35) [0x7fd30297b1b5]
 4: (abort()+0x180) [0x7fd30297dfc0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fd30320fdc5]
 6: (()+0xcb166) [0x7fd30320e166]
 7: (()+0xcb193) [0x7fd30320e193]
 8: (()+0xcb28e) [0x7fd30320e28e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940)
[0x55b310]
 10: /usr/bin/ceph-mon() [0x497317]
 11: (Monitor::init()+0xc5a) [0x4857fa]
 12: (main()+0x2789) [0x46ac79]
 13: (__libc_start_main()+0xfd) [0x7fd302967c8d]
 14: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- begin dump of recent events ---
     0> 2012-07-24 17:03:22.792461 7fd3045af780 -1 *** Caught signal (Aborted) **
 in thread 7fd3045af780

 ceph version 0.48argonaut-125-g4e774fb
(commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8)
 1: /usr/bin/ceph-mon() [0x52f9c9]
 2: (()+0xeff0) [0x7fd304198ff0]
 3: (gsignal()+0x35) [0x7fd30297b1b5]
 4: (abort()+0x180) [0x7fd30297dfc0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fd30320fdc5]
 6: (()+0xcb166) [0x7fd30320e166]
 7: (()+0xcb193) [0x7fd30320e193]
 8: (()+0xcb28e) [0x7fd30320e28e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940)
[0x55b310]
 10: /usr/bin/ceph-mon() [0x497317]
 11: (Monitor::init()+0xc5a) [0x4857fa]
 12: (main()+0x2789) [0x46ac79]
 13: (__libc_start_main()+0xfd) [0x7fd302967c8d]
 14: /usr/bin/ceph-mon() [0x468309]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

--- end dump of recent events ---



How can I fix this or prevent it from happening?

-- 

Best regards,

Florian Wiessner


* Re: monitor not starting
  2012-07-24 15:14       ` Smart Weblications GmbH - Florian Wiessner
@ 2012-07-24 23:55         ` Sage Weil
  0 siblings, 0 replies; 6+ messages in thread
From: Sage Weil @ 2012-07-24 23:55 UTC (permalink / raw)
  To: Smart Weblications GmbH - Florian Wiessner; +Cc: Gregory Farnum, ceph-devel

On Tue, 24 Jul 2012, Smart Weblications GmbH - Florian Wiessner wrote:
> --- end dump of recent events ---
> 2012-07-24 17:03:22.791401 7fd3045af780  1 mon.1@-1(probing) e1 init fsid
> 4553d0f6-1b31-4ba5-9d97-edae55bcaab4
> 2012-07-24 17:03:22.791890 7fd3045af780 -1 mon/Paxos.cc: In function 'bool
> Paxos::is_consistent()' thread 7fd3045af780 time 2012-07-24 17:03:22.791528
> mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))

Was this monitor previously starting with 0.48?  This looks a lot like 
older bugs that were fixed before then.

If you can attach a tarball of the mon data directory and send it to me 
directly (off-list :), I can find out exactly what is inconsistent.
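
One possible way to produce such a tarball, sketched with Python's standard `tarfile` module. The directory `/data/ceph/mon` in the usage note comes from the `store(/data/ceph/mon) mount` line in the quoted log. The function name and output path are hypothetical, and the sketch assumes the monitor process is not running while the copy is made.

```python
import tarfile
from pathlib import Path


def archive_mon_dir(mon_dir: str, out_path: str) -> str:
    """Write a gzip-compressed tarball of a monitor data directory."""
    src = Path(mon_dir)
    if not src.is_dir():
        raise NotADirectoryError(mon_dir)
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname stores entries under the directory's own name
        # instead of its full absolute path.
        tar.add(str(src), arcname=src.name)
    return out_path
```

For example, `archive_mon_dir("/data/ceph/mon", "/tmp/mon-data.tar.gz")` would write the archive to `/tmp`. Keeping the output outside the source directory avoids the archive trying to include itself.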

Thanks!
sage


