* monitor not starting @ 2012-07-04 11:45 Smart Weblications GmbH - Florian Wiessner 2012-07-04 16:25 ` Gregory Farnum 0 siblings, 1 reply; 6+ messages in thread From: Smart Weblications GmbH - Florian Wiessner @ 2012-07-04 11:45 UTC (permalink / raw) To: ceph-devel Hi List, today I upgraded from 0.43 to 0.48, and now one of my monitors no longer starts: ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x52f9c9] 2: (()+0xeff0) [0x7fb08dd11ff0] 3: (gsignal()+0x35) [0x7fb08c4f41b5] 4: (abort()+0x180) [0x7fb08c4f6fc0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fb08cd88dc5] 6: (()+0xcb166) [0x7fb08cd87166] 7: (()+0xcb193) [0x7fb08cd87193] 8: (()+0xcb28e) [0x7fb08cd8728e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x55b310] 10: /usr/bin/ceph-mon() [0x497317] 11: (Monitor::init()+0xc5a) [0x4857fa] 12: (main()+0x2789) [0x46ac79] 13: (__libc_start_main()+0xfd) [0x7fb08c4e0c8d] 14: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- end dump of recent events --- How can I find out why it no longer starts? The OSD and MDS are running fine. -- Kind regards, Florian Wiessner Smart Weblications GmbH Martinsberger Str. 1 D-95119 Naila phone: +49 9282 9638 200 fax: +49 9282 9638 205 24/7: +49 900 144 000 00 - 0.99 EUR/min* http://www.smart-weblications.de -- Registered office: Naila Managing director: Florian Wiessner Commercial register no.: HRB 3840, Amtsgericht Hof *from German landlines; prices from mobile networks may differ -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: monitor not starting 2012-07-04 11:45 monitor not starting Smart Weblications GmbH - Florian Wiessner @ 2012-07-04 16:25 ` Gregory Farnum 2012-07-04 17:02 ` Smart Weblications GmbH - Florian Wiessner 0 siblings, 1 reply; 6+ messages in thread From: Gregory Farnum @ 2012-07-04 16:25 UTC (permalink / raw) To: f.wiessner; +Cc: ceph-devel On Wednesday, July 4, 2012 at 4:45 AM, Smart Weblications GmbH - Florian Wiessner wrote: > Hi List, > > > i today upgraded from 0.43 to 0.48 and now i have one monitor which does not > want to start up anymore: > > ceph version 0.48argonaut-125-g4e774fb > (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) > 1: /usr/bin/ceph-mon() [0x52f9c9] > 2: (()+0xeff0) [0x7fb08dd11ff0] > 3: (gsignal()+0x35) [0x7fb08c4f41b5] > 4: (abort()+0x180) [0x7fb08c4f6fc0] > 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fb08cd88dc5] > 6: (()+0xcb166) [0x7fb08cd87166] > 7: (()+0xcb193) [0x7fb08cd87193] > 8: (()+0xcb28e) [0x7fb08cd8728e] > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) > [0x55b310] > 10: /usr/bin/ceph-mon() [0x497317] > 11: (Monitor::init()+0xc5a) [0x4857fa] > 12: (main()+0x2789) [0x46ac79] > 13: (__libc_start_main()+0xfd) [0x7fb08c4e0c8d] > 14: /usr/bin/ceph-mon() [0x468309] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > interpret this. > > --- end dump of recent events --- > > > How can i find out why it does not startup anymore? osd and mds is running fine.. Is that all the output you get? There should be a line somewhere which says what the assert is, and what line number it's on. :) And while you're at it, is the rest of the cluster in fact working? I don't think 0.43 to 0.48 is an upgrade path we tested. -Greg ^ permalink raw reply [flat|nested] 6+ messages in thread
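Greg's request above, to find the log line that names the failed assert and its source line, can be automated. The following is a minimal sketch and not part of Ceph: the function name `find_failed_asserts` and the regular expression are assumptions, derived only from the `FAILED assert` line format that appears in the logs posted later in this thread.

```python
import re

# Matches lines such as:
#   mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
# The pattern is an assumption based on the log excerpts in this thread.
ASSERT_RE = re.compile(
    r"(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<cond>.*)\)\s*$"
)


def find_failed_asserts(log_text):
    """Return (source_file, line_number, condition) for each failed assert."""
    hits = []
    for raw in log_text.splitlines():
        match = ASSERT_RE.search(raw)
        if match:
            hits.append(
                (match.group("file"), int(match.group("line")), match.group("cond"))
            )
    return hits


# Sample input copied from the monitor log quoted in this thread.
sample = """\
2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637
mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1))
 1: /usr/bin/ceph-mon() [0x497317]
"""

print(find_failed_asserts(sample))
```

To run it against a real monitor log, read the file's text and pass it to `find_failed_asserts`; where that log lives depends on the installation.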
* Re: monitor not starting 2012-07-04 16:25 ` Gregory Farnum @ 2012-07-04 17:02 ` Smart Weblications GmbH - Florian Wiessner 2012-07-04 19:05 ` Gregory Farnum 0 siblings, 1 reply; 6+ messages in thread From: Smart Weblications GmbH - Florian Wiessner @ 2012-07-04 17:02 UTC (permalink / raw) To: Gregory Farnum; +Cc: ceph-devel Am 04.07.2012 18:25, schrieb Gregory Farnum: > > > On Wednesday, July 4, 2012 at 4:45 AM, Smart Weblications GmbH - Florian Wiessner wrote: > >> Hi List, >> >> >> i today upgraded from 0.43 to 0.48 and now i have one monitor which does not >> want to start up anymore: >> >> ceph version 0.48argonaut-125-g4e774fb >> (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) >> 1: /usr/bin/ceph-mon() [0x52f9c9] >> 2: (()+0xeff0) [0x7fb08dd11ff0] >> 3: (gsignal()+0x35) [0x7fb08c4f41b5] >> 4: (abort()+0x180) [0x7fb08c4f6fc0] >> 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fb08cd88dc5] >> 6: (()+0xcb166) [0x7fb08cd87166] >> 7: (()+0xcb193) [0x7fb08cd87193] >> 8: (()+0xcb28e) [0x7fb08cd8728e] >> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) >> [0x55b310] >> 10: /usr/bin/ceph-mon() [0x497317] >> 11: (Monitor::init()+0xc5a) [0x4857fa] >> 12: (main()+0x2789) [0x46ac79] >> 13: (__libc_start_main()+0xfd) [0x7fb08c4e0c8d] >> 14: /usr/bin/ceph-mon() [0x468309] >> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to >> interpret this. >> >> --- end dump of recent events --- >> >> >> How can i find out why it does not startup anymore? osd and mds is running fine.. > Is that all the output you get? There should be a line somewhere which says what the assert is, and what line number it's on. 
:) Is this what you are looking for: 2012-07-04 11:20:24.448430 7f423d943780 1 mon.3@-1(probing) e1 init fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637 mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1)) ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x497317] 2: (Monitor::init()+0xc5a) [0x4857fa] 3: (main()+0x2789) [0x46ac79] 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d] 5: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- begin dump of recent events --- -3> 2012-07-04 11:20:24.447613 7f423d943780 1 store(/data/ceph/mon) mount -2> 2012-07-04 11:20:24.447722 7f423d943780 0 ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8), process ceph-mon, pid 7436 -1> 2012-07-04 11:20:24.448430 7f423d943780 1 mon.3@-1(probing) e1 init fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 0> 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637 mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1)) ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x497317] 2: (Monitor::init()+0xc5a) [0x4857fa] 3: (main()+0x2789) [0x46ac79] 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d] 5: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- end dump of recent events --- 2012-07-04 11:20:24.449567 7f423d943780 -1 *** Caught signal (Aborted) ** in thread 7f423d943780 > > And while you're at it, is the rest of the cluster in fact working? I don't think 0.43 to 0.48 is an upgrade path we tested. 
> Anyway, I removed the mon and ran ceph-mon --mkfs with the 3 mons that were still working after the upgrade, and got it up and running again. Yes, the cluster is still working after the upgrade. I also upgraded to Linux 3.4.4. It feels like ceph-fuse and the kernel ceph client are a little less robust than in 0.43. When I start copying from /ceph to another mount point, /ceph seems to become unusable to other processes during the copy (or during any operation in general), which makes the client very sluggish. :( I can send you the contents of the monitor directory from the mon that did not work after the upgrade, if you want. -- Kind regards, Florian Wiessner ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: monitor not starting 2012-07-04 17:02 ` Smart Weblications GmbH - Florian Wiessner @ 2012-07-04 19:05 ` Gregory Farnum 2012-07-24 15:14 ` Smart Weblications GmbH - Florian Wiessner 0 siblings, 1 reply; 6+ messages in thread From: Gregory Farnum @ 2012-07-04 19:05 UTC (permalink / raw) To: f.wiessner; +Cc: ceph-devel On Wednesday, July 4, 2012 at 10:02 AM, Smart Weblications GmbH - Florian Wiessner wrote: > Am 04.07.2012 18:25, schrieb Gregory Farnum: > > > > > > On Wednesday, July 4, 2012 at 4:45 AM, Smart Weblications GmbH - Florian Wiessner wrote: > > > > > Hi List, > > > > > > > > > i today upgraded from 0.43 to 0.48 and now i have one monitor which does not > > > want to start up anymore: > > > > > > ceph version 0.48argonaut-125-g4e774fb > > > (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) > > > 1: /usr/bin/ceph-mon() [0x52f9c9] > > > 2: (()+0xeff0) [0x7fb08dd11ff0] > > > 3: (gsignal()+0x35) [0x7fb08c4f41b5] > > > 4: (abort()+0x180) [0x7fb08c4f6fc0] > > > 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fb08cd88dc5] > > > 6: (()+0xcb166) [0x7fb08cd87166] > > > 7: (()+0xcb193) [0x7fb08cd87193] > > > 8: (()+0xcb28e) [0x7fb08cd8728e] > > > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) > > > [0x55b310] > > > 10: /usr/bin/ceph-mon() [0x497317] > > > 11: (Monitor::init()+0xc5a) [0x4857fa] > > > 12: (main()+0x2789) [0x46ac79] > > > 13: (__libc_start_main()+0xfd) [0x7fb08c4e0c8d] > > > 14: /usr/bin/ceph-mon() [0x468309] > > > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > > > interpret this. > > > > > > --- end dump of recent events --- > > > > > > > > > How can i find out why it does not startup anymore? osd and mds is running fine.. > > Is that all the output you get? There should be a line somewhere which says what the assert is, and what line number it's on. 
:) > > > > > Is this what you are looking for: > 2012-07-04 11:20:24.448430 7f423d943780 1 mon.3@-1(probing) e1 init fsid > 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 > 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc (http://Paxos.cc): In function 'bool > Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637 > mon/Paxos.cc (http://Paxos.cc): 1031: FAILED assert(consistent || (slurping == 1)) Yep, that line. This means the monitor's on-disk state is inconsistent, but I can think of a number of scenarios which could have caused this, depending on how you upgraded your cluster (older monitors didn't mark on-disk whenever they deliberately went inconsistent on a catchup, which I bet is what happened here). > ceph version 0.48argonaut-125-g4e774fb > (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) > 1: /usr/bin/ceph-mon() [0x497317] > 2: (Monitor::init()+0xc5a) [0x4857fa] > 3: (main()+0x2789) [0x46ac79] > 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d] > 5: /usr/bin/ceph-mon() [0x468309] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > interpret this. 
> > --- begin dump of recent events --- > -3> 2012-07-04 11:20:24.447613 7f423d943780 1 store(/data/ceph/mon) mount > -2> 2012-07-04 11:20:24.447722 7f423d943780 0 ceph version > 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8), > process ceph-mon, pid 7436 > -1> 2012-07-04 11:20:24.448430 7f423d943780 1 mon.3@-1(probing) e1 init > fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 > 0> 2012-07-04 11:20:24.448994 7f423d943780 -1 mon/Paxos.cc (http://Paxos.cc): In function > 'bool Paxos::is_consistent()' thread 7f423d943780 time 2012-07-04 11:20:24.448637 > mon/Paxos.cc (http://Paxos.cc): 1031: FAILED assert(consistent || (slurping == 1)) > > ceph version 0.48argonaut-125-g4e774fb > (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) > 1: /usr/bin/ceph-mon() [0x497317] > 2: (Monitor::init()+0xc5a) [0x4857fa] > 3: (main()+0x2789) [0x46ac79] > 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d] > 5: /usr/bin/ceph-mon() [0x468309] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > interpret this. > > --- end dump of recent events --- > 2012-07-04 11:20:24.449567 7f423d943780 -1 *** Caught signal (Aborted) ** > in thread 7f423d943780 > > > > > > And while you're at it, is the rest of the cluster in fact working? I don't think 0.43 to 0.48 is an upgrade path we tested. > > Anyway, i removed the mon and did a ceph-mon --mkfs with the 3 mons that were > still working after the upgrade and got it up and running again. > > Yes, the cluster is still working after the upgrade. Also upgraded to linux > 3.4.4 - it feels like the ceph-fuse and kernel ceph client is a little less > robust than in 0.43... > > when i start copying from /ceph to other mp, then it seems that for the copy > operation or in general for any operation, /ceph is unusable to other processes > which then makes the client behave very sluggish... 
:( Well, it shouldn't have gotten less stable, since we haven't made any big changes there, but you aren't the only one reporting that things seem to be a little bit slower. We're going to have to look at that once people are back in the office after Independence Day. > > i can send you the contents of the monitor directory where it did not work after > the upgrade if you want me to.. No, that won't be necessary. Thanks though! ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: monitor not starting 2012-07-04 19:05 ` Gregory Farnum @ 2012-07-24 15:14 ` Smart Weblications GmbH - Florian Wiessner 2012-07-24 23:55 ` Sage Weil 0 siblings, 1 reply; 6+ messages in thread From: Smart Weblications GmbH - Florian Wiessner @ 2012-07-24 15:14 UTC (permalink / raw) To: Gregory Farnum; +Cc: ceph-devel Am 04.07.2012 21:05, schrieb Gregory Farnum: > > Yep, that line. This means the monitor's on-disk state is inconsistent, but I can think of a number of scenarios which could have caused this, depending on how you upgraded your cluster (older monitors didn't mark on-disk whenever they deliberately went inconsistent on a catchup, which I bet is what happened here). > >> ceph version 0.48argonaut-125-g4e774fb >> (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) >> 1: /usr/bin/ceph-mon() [0x497317] >> 2: (Monitor::init()+0xc5a) [0x4857fa] >> 3: (main()+0x2789) [0x46ac79] >> 4: (__libc_start_main()+0xfd) [0x7f423bcfbc8d] >> 5: /usr/bin/ceph-mon() [0x468309] >> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to >> interpret this. >> > > No, that won't be necessary. Thanks though! ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x52f9c9] 2: (()+0xeff0) [0x7fe93db6dff0] 3: (gsignal()+0x35) [0x7fe93c3501b5] 4: (abort()+0x180) [0x7fe93c352fc0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fe93cbe4dc5] 6: (()+0xcb166) [0x7fe93cbe3166] 7: (()+0xcb193) [0x7fe93cbe3193] 8: (()+0xcb28e) [0x7fe93cbe328e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x55b310] 10: /usr/bin/ceph-mon() [0x497317] 11: (Monitor::init()+0xc5a) [0x4857fa] 12: (main()+0x2789) [0x46ac79] 13: (__libc_start_main()+0xfd) [0x7fe93c33cc8d] 14: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
--- end dump of recent events --- 2012-07-24 17:03:22.791401 7fd3045af780 1 mon.1@-1(probing) e1 init fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 2012-07-24 17:03:22.791890 7fd3045af780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7fd3045af780 time 2012-07-24 17:03:22.791528 mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1)) ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x497317] 2: (Monitor::init()+0xc5a) [0x4857fa] 3: (main()+0x2789) [0x46ac79] 4: (__libc_start_main()+0xfd) [0x7fd302967c8d] 5: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Well, my cluster rebooted again, and now only 1 of 4 monitors is willing to start...
--- begin dump of recent events --- -3> 2012-07-24 17:03:22.729549 7fd3045af780 1 store(/data/ceph/mon) mount -2> 2012-07-24 17:03:22.729667 7fd3045af780 0 ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8), process ceph-mon, pid 6962 -1> 2012-07-24 17:03:22.791401 7fd3045af780 1 mon.1@-1(probing) e1 init fsid 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 0> 2012-07-24 17:03:22.791890 7fd3045af780 -1 mon/Paxos.cc: In function 'bool Paxos::is_consistent()' thread 7fd3045af780 time 2012-07-24 17:03:22.791528 mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1)) --- end dump of recent events --- 2012-07-24 17:03:22.792461 7fd3045af780 -1 *** Caught signal (Aborted) ** in thread 7fd3045af780 ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x52f9c9] 2: (()+0xeff0) [0x7fd304198ff0] 3: (gsignal()+0x35) [0x7fd30297b1b5] 4: (abort()+0x180) [0x7fd30297dfc0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fd30320fdc5] 6: (()+0xcb166) [0x7fd30320e166] 7: (()+0xcb193) [0x7fd30320e193] 8: (()+0xcb28e) [0x7fd30320e28e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x55b310] 10: /usr/bin/ceph-mon() [0x497317] 11: (Monitor::init()+0xc5a) [0x4857fa] 12: (main()+0x2789) [0x46ac79] 13: (__libc_start_main()+0xfd) [0x7fd302967c8d] 14: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
--- begin dump of recent events --- 0> 2012-07-24 17:03:22.792461 7fd3045af780 -1 *** Caught signal (Aborted) ** in thread 7fd3045af780 ceph version 0.48argonaut-125-g4e774fb (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) 1: /usr/bin/ceph-mon() [0x52f9c9] 2: (()+0xeff0) [0x7fd304198ff0] 3: (gsignal()+0x35) [0x7fd30297b1b5] 4: (abort()+0x180) [0x7fd30297dfc0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fd30320fdc5] 6: (()+0xcb166) [0x7fd30320e166] 7: (()+0xcb193) [0x7fd30320e193] 8: (()+0xcb28e) [0x7fd30320e28e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x940) [0x55b310] 10: /usr/bin/ceph-mon() [0x497317] 11: (Monitor::init()+0xc5a) [0x4857fa] 12: (main()+0x2789) [0x46ac79] 13: (__libc_start_main()+0xfd) [0x7fd302967c8d] 14: /usr/bin/ceph-mon() [0x468309] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- end dump of recent events --- How can I fix this, or prevent it from happening? -- Kind regards, Florian Wiessner ^ permalink raw reply [flat|nested] 6+ messages in thread
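For context on why "1 of 4" is more serious than the first incident: Ceph monitors need a strict majority of the monitors in the monmap to form a quorum. With 4 monitors that means 3, so the earlier state (3 of 4 working) still had a quorum, while 1 of 4 does not. A small sketch of that arithmetic follows; the helper names `quorum_size` and `has_quorum` are invented for illustration and are not Ceph functions.

```python
def quorum_size(total_monitors):
    """Smallest strict majority of the monitors in the monmap."""
    return total_monitors // 2 + 1


def has_quorum(running, total_monitors):
    """True if enough monitors are up to form a majority."""
    return running >= quorum_size(total_monitors)


print(quorum_size(4))    # 3
print(has_quorum(3, 4))  # True: the state after the first incident
print(has_quorum(1, 4))  # False: the state reported in this message
```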
* Re: monitor not starting 2012-07-24 15:14 ` Smart Weblications GmbH - Florian Wiessner @ 2012-07-24 23:55 ` Sage Weil 0 siblings, 0 replies; 6+ messages in thread From: Sage Weil @ 2012-07-24 23:55 UTC (permalink / raw) To: Smart Weblications GmbH - Florian Wiessner; +Cc: Gregory Farnum, ceph-devel On Tue, 24 Jul 2012, Smart Weblications GmbH - Florian Wiessner wrote: > --- end dump of recent events --- > 2012-07-24 17:03:22.791401 7fd3045af780 1 mon.1@-1(probing) e1 init fsid > 4553d0f6-1b31-4ba5-9d97-edae55bcaab4 > 2012-07-24 17:03:22.791890 7fd3045af780 -1 mon/Paxos.cc: In function 'bool > Paxos::is_consistent()' thread 7fd3045af780 time 2012-07-24 17:03:22.791528 > mon/Paxos.cc: 1031: FAILED assert(consistent || (slurping == 1)) Was this monitor previously starting with 0.48? This looks a lot like older bugs that were fixed before then. If you can attach a tarball of the mon data directory and send it to me directly (off-list :), I can find out exactly what is inconsistent. Thanks! sage > > ceph version 0.48argonaut-125-g4e774fb > (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) > 1: /usr/bin/ceph-mon() [0x497317] > 2: (Monitor::init()+0xc5a) [0x4857fa] > 3: (main()+0x2789) [0x46ac79] > 4: (__libc_start_main()+0xfd) [0x7fd302967c8d] > 5: /usr/bin/ceph-mon() [0x468309] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > interpret this. > > > Well, again my cluster rebootet and now only 1 of 4 monitors is willing to start... > > ceph version 0.48argonaut-125-g4e774fb > (commit:4e774fbcb38fd6883232b72352512a5f8e4a66e8) > 1: /usr/bin/ceph-mon() [0x497317] > 2: (Monitor::init()+0xc5a) [0x4857fa] > 3: (main()+0x2789) [0x46ac79] > 4: (__libc_start_main()+0xfd) [0x7fd302967c8d] > 5: /usr/bin/ceph-mon() [0x468309] > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to > interpret this. 