* Seg Fault on rgw 0.61.1 with cluster in 0.61
@ 2013-05-10  8:51 Yann ROBIN
  2013-05-10  9:28 ` Yann ROBIN
  2013-05-10 16:02 ` Yehuda Sadeh
  0 siblings, 2 replies; 7+ messages in thread
From: Yann ROBIN @ 2013-05-10  8:51 UTC (permalink / raw)
  To: ceph-devel

Hi,

I tried to update rgw to 0.61.1, and now it segfaults while connecting to the 0.61 cluster.
rgw version 0.61 runs fine.

*** Caught signal (Segmentation fault) **
 in thread 7fc1fec79780
 ceph version 0.61.1 (56c4847ba82a92023700e2d4920b59cdaf23428d)
 1: /usr/bin/radosgw() [0x4f19da]
 2: (()+0xfcb0) [0x7fc1fcf0dcb0]
 3: (ceph::crypto::init(CephContext*)+0xf) [0x7fc1fdfeb2ef]
 4: (common_init_finish(CephContext*)+0x23) [0x7fc1fdfc33f3]
 5: (librados::RadosClient::connect()+0x1d) [0x7fc1fde1d48d]
 6: (RGWRados::initialize()+0x53) [0x5b5c03]
 7: (RGWStoreManager::init_storage_provider(CephContext*, bool)+0x2c9) [0x5b9b39]
 8: (main()+0x2d7) [0x4b4ed7]
 9: (__libc_start_main()+0xed) [0x7fc1fb90176d]
 10: /usr/bin/radosgw() [0x4b6db1]
2013-05-10 10:36:39.749439 7fc1fec79780 -1 *** Caught signal (Segmentation fault) **
 in thread 7fc1fec79780

-- 
Yann ROBIN
YouScribe




* RE: Seg Fault on rgw 0.61.1 with cluster in 0.61
  2013-05-10  8:51 Seg Fault on rgw 0.61.1 with cluster in 0.61 Yann ROBIN
@ 2013-05-10  9:28 ` Yann ROBIN
  2013-05-10 15:30   ` Sage Weil
  2013-05-10 16:02 ` Yehuda Sadeh
  1 sibling, 1 reply; 7+ messages in thread
From: Yann ROBIN @ 2013-05-10  9:28 UTC (permalink / raw)
  To: ceph-devel

I've downgraded rgw to 0.60 and I still have the same issue.

I returned to 0.61.1 and noticed that sometimes (about 1 time in 10) radosgw starts normally.
I installed the debug package for librados to do some debugging, and now it always works...


* RE: Seg Fault on rgw 0.61.1 with cluster in 0.61
  2013-05-10  9:28 ` Yann ROBIN
@ 2013-05-10 15:30   ` Sage Weil
  2013-05-11 10:02     ` Yann ROBIN
  0 siblings, 1 reply; 7+ messages in thread
From: Sage Weil @ 2013-05-10 15:30 UTC (permalink / raw)
  To: Yann ROBIN; +Cc: ceph-devel

On Fri, 10 May 2013, Yann ROBIN wrote:
> I've downgraded the rgw to 0.60 and I still had the same issue.

Yeah, the radosgw code is essentially identical between the two versions 
(there is a new config option but it is unused).
 
> I returned to 0.61.1 and noticed that sometimes (about 1 time in 10) radosgw starts normally.
> I installed the debug package for librados to do some debugging, and now it always works...

That is disconcerting.

I've pushed wip-rgw-crash, which prints some debug information in
ceph::crypto::init. Do you mind installing that (without the debug
packages :) and seeing if you can reproduce the problem? It should print
a couple of lines to stdout, but you need to run radosgw with the '-f'
option (which prevents fork). Hopefully the problem is reproducible in
that case.

What distro are you running?
sage
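
Since the crash is intermittent (roughly 9 starts out of 10 fail), it can
help to start the daemon in the foreground several times and count the
outcomes. The following is only a minimal sketch: the helper names
classify_run and repeat are hypothetical, the command line is a placeholder
(only the '-f' option comes from this thread), and it assumes the process
still terminates with SIGSEGV after the backtrace has been printed.

```python
import signal
import subprocess
from collections import Counter


def classify_run(cmd, timeout=10.0):
    """Run cmd once in the foreground and classify how it ended.

    A run that is still alive after `timeout` seconds is treated as a
    successful start and is killed by subprocess.run().
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "still-running"
    # On POSIX, a negative return code means "terminated by that signal".
    if proc.returncode == -signal.SIGSEGV:
        return "segfault"
    return "exit-%d" % proc.returncode


def repeat(cmd, attempts=10, timeout=10.0):
    """Run cmd `attempts` times and count each kind of outcome."""
    return Counter(classify_run(cmd, timeout) for _ in range(attempts))
```

Calling repeat(["radosgw", "-f"]) (plus whatever other arguments the local
setup needs) returns a Counter such as one with "segfault" and
"still-running" keys, which makes it easy to see whether installing or
removing a package changes the failure rate.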



* Re: Seg Fault on rgw 0.61.1 with cluster in 0.61
  2013-05-10  8:51 Seg Fault on rgw 0.61.1 with cluster in 0.61 Yann ROBIN
  2013-05-10  9:28 ` Yann ROBIN
@ 2013-05-10 16:02 ` Yehuda Sadeh
  2013-05-14 12:16   ` Faidon Liambotis
  1 sibling, 1 reply; 7+ messages in thread
From: Yehuda Sadeh @ 2013-05-10 16:02 UTC (permalink / raw)
  To: Yann ROBIN; +Cc: ceph-devel

On Fri, May 10, 2013 at 1:51 AM, Yann ROBIN <yann.robin@youscribe.com> wrote:
> Hi,
>
> I tried to update rgw to 0.61.1, and now it segfaults while connecting to the 0.61 cluster.
> rgw version 0.61 runs fine.
>
> *** Caught signal (Segmentation fault) **
>  in thread 7fc1fec79780
>  ceph version 0.61.1 (56c4847ba82a92023700e2d4920b59cdaf23428d)
>  1: /usr/bin/radosgw() [0x4f19da]
>  2: (()+0xfcb0) [0x7fc1fcf0dcb0]
>  3: (ceph::crypto::init(CephContext*)+0xf) [0x7fc1fdfeb2ef]
>  4: (common_init_finish(CephContext*)+0x23) [0x7fc1fdfc33f3]
>  5: (librados::RadosClient::connect()+0x1d) [0x7fc1fde1d48d]
>  6: (RGWRados::initialize()+0x53) [0x5b5c03]
>  7: (RGWStoreManager::init_storage_provider(CephContext*, bool)+0x2c9) [0x5b9b39]
>  8: (main()+0x2d7) [0x4b4ed7]
>  9: (__libc_start_main()+0xed) [0x7fc1fb90176d]
>  10: /usr/bin/radosgw() [0x4b6db1]
> 2013-05-10 10:36:39.749439 7fc1fec79780 -1 *** Caught signal (Segmentation fault) **
>  in thread 7fc1fec79780
>

Sounds to me like a package version mismatch. Could it be that one
of the ceph packages was on a different version (e.g., librados)?

Yehuda
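
One way to check for such a mismatch on a Debian or Ubuntu host is to
compare the installed package versions. The sketch below is illustrative
only: the helper names are hypothetical, and it assumes the input text was
produced by something like
dpkg-query -W -f='${Package} ${Version}\n' radosgw librados2 ceph-common.

```python
def parse_versions(dpkg_output):
    """Parse 'package version' lines into a dict.

    Lines without exactly two fields (for example a package that is known
    to dpkg but not installed, so its version is empty) are skipped.
    """
    versions = {}
    for line in dpkg_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            versions[parts[0]] = parts[1]
    return versions


def find_mismatches(versions, reference="radosgw",
                    others=("librados2", "ceph-common")):
    """Return the packages whose version differs from the reference package.

    A package that is missing from `versions` is reported with None.
    """
    ref = versions.get(reference)
    return {pkg: versions.get(pkg)
            for pkg in others
            if versions.get(pkg) != ref}
```

With a made-up listing in which radosgw and ceph-common are at one version
and librados2 is at an older one, find_mismatches(parse_versions(text))
returns only the librados2 entry.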


* RE: Seg Fault on rgw 0.61.1 with cluster in 0.61
  2013-05-10 15:30   ` Sage Weil
@ 2013-05-11 10:02     ` Yann ROBIN
  0 siblings, 0 replies; 7+ messages in thread
From: Yann ROBIN @ 2013-05-11 10:02 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Hi,

We're using Ubuntu 12.04.2 with kernel 3.2.0-41-virtual.
I'll try removing the dbg package and installing your branch on Monday.



* Re: Seg Fault on rgw 0.61.1 with cluster in 0.61
  2013-05-10 16:02 ` Yehuda Sadeh
@ 2013-05-14 12:16   ` Faidon Liambotis
  2013-05-16 17:35     ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Faidon Liambotis @ 2013-05-14 12:16 UTC (permalink / raw)
  To: Yehuda Sadeh; +Cc: Yann ROBIN, ceph-devel

On 05/10/13 19:02, Yehuda Sadeh wrote:
> Sounds to me like a package version mismatch. Could it be that one
> of the ceph packages was on a different version (e.g., librados)?

I attempted to install and run radosgw 0.61.1 on a system with a 0.56.4 
librados and it segfaulted with the same backtrace as the one in this 
thread.

If a newer radosgw can't work with an older librados, this should be
reflected in the package relationships -- hopefully without nasty
Breaks/Conflicts, but with a proper librados SONAME bump that will allow
coinstallability between librados2 and e.g. librados3. Or symbol versioning
could be employed to provide backwards compatibility.

This installed-but-segfaulting combination of packages shouldn't be 
allowed by apt to exist on the system. FWIW, if these were packages in 
Debian (and, presumably, Ubuntu), that would be a severity: 
serious/release critical bug.

It'd also be nice to be able to do things like mixing newer radosgw 
while also keeping the old librados2 on the system. My use case is that 
I have monitors and radosgw on the same boxes and I'd like to keep 
monitors on bobtail, while at the same time using some of the much-needed
radosgw cuttlefish features.

Regards,
Faidon
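
The reason a SONAME bump gives coinstallable packages is that, by Debian
convention, the library package name is derived from the SONAME, so a new
SONAME means both a new file name and a new package name. A minimal sketch
of that naming rule follows; the function name is hypothetical, and the
SONAMEs librados.so.2 and librados.so.3 are assumed here to correspond to
the librados2 and librados3 packages mentioned above.

```python
import re


def debian_package_name(soname):
    """Derive the conventional Debian package name from a library SONAME.

    If the library name ends in a digit, a hyphen separates it from the
    SONAME version; underscores become hyphens and the result is lowercased.
    """
    name = re.sub(r"([0-9])\.so\.", r"\1-", soname)
    name = re.sub(r"\.so(\.|$)", "", name)
    return name.replace("_", "-").lower()
```

Under this rule librados.so.2 maps to librados2 and a bumped librados.so.3
would map to librados3, so the two packages ship different files and apt
can keep both installed.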


* Re: Seg Fault on rgw 0.61.1 with cluster in 0.61
  2013-05-14 12:16   ` Faidon Liambotis
@ 2013-05-16 17:35     ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2013-05-16 17:35 UTC (permalink / raw)
  To: Faidon Liambotis; +Cc: Yehuda Sadeh, Yann ROBIN, ceph-devel

On Tue, 14 May 2013, Faidon Liambotis wrote:
> On 05/10/13 19:02, Yehuda Sadeh wrote:
> > Sounds to me like a package version mismatch. Could it be that one
> > of the ceph packages was on a different version (e.g., librados)?
> 
> I attempted to install and run radosgw 0.61.1 on a system with a 0.56.4
> librados and it segfaulted with the same backtrace as the one in this thread.
> 
> If a newer radosgw can't work with an older librados, this should be reflected
> in the package relationships -- hopefully without nasty Breaks/Conflicts, but
> with a proper librados SONAME bump that will allow coinstallability between
> librados2 and e.g. librados3. Or symbol versioning could be employed to
> provide backwards compatibility.
> 
> This installed-but-segfaulting combination of packages shouldn't be allowed by
> apt to exist on the system. FWIW, if these were packages in Debian (and,
> presumably, Ubuntu), that would be a severity: serious/release critical bug.

I believe this is actually a problem with radosgw statically linking some
of the same code that librados includes, and not with the librados ABI
changes.  We need to fix that somehow. In the meantime, though, setting
the radosgw package to require a matching librados2 ought to do the trick.

> It'd also be nice to be able to do things like mixing newer radosgw while also
> keeping the old librados2 on the system. My use case is that I have monitors
> and radosgw on the same boxes and I'd like to keep monitors on bobtail, while
> at the same time using some of the much-needed radosgw cuttlefish features.

ceph-common needs to match the librados2 version, but ceph (which contains
ceph-mon) does not, so you should be able to have different ceph-mon and
radosgw versions if we do the above.

sage
