* ceph python crash dumps
@ 2017-01-03 21:39 Wyllys Ingersoll
  2017-01-04  0:22 ` Josh Durgin
  0 siblings, 1 reply; 3+ messages in thread
From: Wyllys Ingersoll @ 2017-01-03 21:39 UTC
  To: Ceph Development

I have a Python-based WSGI application running inside Apache that uses the
Ceph Python bindings to perform some operations.  Recently it's been
throwing exceptions when trying to make Ceph connections, but I can't
figure out what is really causing the issue.

Running ceph 10.2.5 with latest python-ceph packages.

Anyone seen this sort of thing before or have any idea how to prevent it?

The following error appears in the Apache error logs:

common/ceph_crypto.cc: In function 'void ceph::crypto::init(CephContext*)'
thread 7fb1ca8c4700 time 2017-01-03 16:25:31.334606
common/ceph_crypto.cc: 77: FAILED assert(crypto_context != __null)
 ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
 1: (()+0x169b7b) [0x7fb1b1b82b7b]
 2: (()+0x1b88c0) [0x7fb1b1bd18c0]
 3: (()+0x184615) [0x7fb1b1b9d615]
 4: (()+0x181800) [0x7fb1b1b9a800]
 5: (()+0x95f2d) [0x7fb1b1aaef2d]
 6: (rados_connect()+0x1c) [0x7fb1b1a7dadc]
 7: (()+0x29070) [0x7fb1bb1a4070]
 8: (PyEval_EvalFrameEx()+0x4d4e) [0x7fb1d9fb324e]
 9: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 10: (PyEval_EvalFrameEx()+0x48d8) [0x7fb1d9fb2dd8]
 11: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 12: (PyEval_EvalFrameEx()+0x48d8) [0x7fb1d9fb2dd8]
 13: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
 14: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
 15: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 16: (()+0x1c37a5) [0x7fb1d9fe97a5]
 17: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 18: (()+0xbb7bd) [0x7fb1d9ee17bd]
 19: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 20: (()+0x13467f) [0x7fb1d9f5a67f]
 21: (()+0x13268f) [0x7fb1d9f5868f]
 22: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 23: (PyEval_EvalFrameEx()+0x2316) [0x7fb1d9fb0816]
 24: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 25: (()+0x1c37a5) [0x7fb1d9fe97a5]
 26: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 27: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
 28: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 29: (()+0x1c37a5) [0x7fb1d9fe97a5]
 30: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 31: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
 32: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 33: (()+0x1c37a5) [0x7fb1d9fe97a5]
 34: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 35: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
 36: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 37: (()+0x1c37a5) [0x7fb1d9fe97a5]
 38: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 39: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
 40: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 41: (()+0x1c37a5) [0x7fb1d9fe97a5]
 42: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 43: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
 44: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 45: (()+0x1c37a5) [0x7fb1d9fe97a5]
 46: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 47: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
 48: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
 49: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
 50: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
 51: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
 52: (()+0x1c36d0) [0x7fb1d9fe96d0]
 53: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 54: (()+0xbb7bd) [0x7fb1d9ee17bd]
 55: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 56: (()+0x1347e5) [0x7fb1d9f5a7e5]
 57: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
 58: (PyEval_CallObjectWithKeywords()+0x47) [0x7fb1d9fce577]
 59: (()+0x17644) [0x7fb1da3a1644]
 60: (()+0x1dba8) [0x7fb1da3a7ba8]
 61: (()+0x8184) [0x7fb1de916184]
 62: (clone()+0x6d) [0x7fb1de64337d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
to interpret this.


* Re: ceph python crash dumps
  2017-01-03 21:39 ceph python crash dumps Wyllys Ingersoll
@ 2017-01-04  0:22 ` Josh Durgin
  2017-01-04 15:20   ` Wyllys Ingersoll
  0 siblings, 1 reply; 3+ messages in thread
From: Josh Durgin @ 2017-01-04  0:22 UTC
  To: Wyllys Ingersoll, Ceph Development

On 01/03/2017 01:39 PM, Wyllys Ingersoll wrote:
> I have a python based WSGI application running inside Apache that uses the
> ceph python bindings to perform some operations.  Recently it's been
> throwing exceptions when trying to make ceph connections, but I can't
> figure out what is really causing the issue here.
>
> Running ceph 10.2.5 with latest python-ceph packages.

This looks like http://tracker.ceph.com/issues/14115 - in your case
likely due to a bug in librados.

> Anyone seen this sort of thing before or have any idea how to prevent it?

You might be able to avoid it by using a single Rados instance per
process, to avoid re-initializing crypto_context within librados.
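
A minimal sketch of the single-instance approach, assuming the python-rados
bindings and a config file at /etc/ceph/ceph.conf. The helper name
get_cluster and the config path are illustrative and not taken from the
original report. The handle is created lazily, so with a forking server
such as Apache each worker process ends up with its own instance:

```python
import threading

_lock = threading.Lock()
_cluster = None  # one shared handle per process


def _default_factory():
    # Imported lazily so the module loads even where python-rados is absent.
    import rados

    # Assumption: the config path below matches your deployment.
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    return cluster


def get_cluster(factory=_default_factory):
    """Return the process-wide Rados handle, creating it on first use."""
    global _cluster
    if _cluster is None:
        with _lock:
            # Re-check under the lock: another thread may have won the race.
            if _cluster is None:
                _cluster = factory()
    return _cluster
```

Request handlers would then call get_cluster() instead of constructing
rados.Rados() themselves, and would not call shutdown() on the shared
handle.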

If you have a short script that reproduces it, or can run it under
valgrind's memcheck tool, we can see if your case would be fixed by

https://github.com/ceph/ceph/pull/12624

or if there are other leaks to fix.

Josh

> The following error appears in the Apache error logs:
>
> common/ceph_crypto.cc: In function 'void ceph::crypto::init(CephContext*)'
> thread 7fb1ca8c4700 time 2017-01-03 16:25:31.334606
> common/ceph_crypto.cc: 77: FAILED assert(crypto_context != __null)
>  ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>  1: (()+0x169b7b) [0x7fb1b1b82b7b]
>  2: (()+0x1b88c0) [0x7fb1b1bd18c0]
>  3: (()+0x184615) [0x7fb1b1b9d615]
>  4: (()+0x181800) [0x7fb1b1b9a800]
>  5: (()+0x95f2d) [0x7fb1b1aaef2d]
>  6: (rados_connect()+0x1c) [0x7fb1b1a7dadc]



* Re: ceph python crash dumps
  2017-01-04  0:22 ` Josh Durgin
@ 2017-01-04 15:20   ` Wyllys Ingersoll
  0 siblings, 0 replies; 3+ messages in thread
From: Wyllys Ingersoll @ 2017-01-04 15:20 UTC
  To: Josh Durgin; +Cc: Ceph Development

I'm pretty certain that is the same bug that we are hitting.
Unfortunately, I don't have a simple script to recreate it.  It happens
as part of a WSGI application in Apache running with 30 threads and
multiple processes.  If I tune it down to fewer threads and processes,
it becomes harder to trigger.
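
A rough starting point for a standalone reproducer might look like the
sketch below. It only mimics the pattern described above (many threads
each opening and closing their own connection) and has not been confirmed
to trigger the assert. The names connect_once and hammer, the config
path, and the iteration count are assumptions made for illustration:

```python
import threading


def connect_once(conffile="/etc/ceph/ceph.conf"):
    # Imported lazily so the module loads even where python-rados is absent.
    import rados

    # Each call builds and tears down its own Rados handle, which is the
    # pattern suspected of re-initializing crypto_context.
    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    cluster.shutdown()


def hammer(connect=connect_once, threads=30, iterations=100):
    """Run `connect` repeatedly from several threads; return Python errors.

    A failed assert inside librados aborts the whole process, so it will
    not show up in the returned list.
    """
    errors = []

    def worker():
        for _ in range(iterations):
            try:
                connect()
            except Exception as exc:
                errors.append(exc)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return errors
```

Calling hammer() from a small script, optionally under valgrind's
memcheck tool, would be one way to try this outside Apache.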

If this fix is accepted and integrated, please backport it to the
next Jewel update (10.2.6).

-Wyllys Ingersoll
 Keeper Technology, LLC

On Tue, Jan 3, 2017 at 7:22 PM, Josh Durgin <jdurgin@redhat.com> wrote:
> On 01/03/2017 01:39 PM, Wyllys Ingersoll wrote:
>>
>> I have a python based WSGI application running inside Apache that uses the
>> ceph python bindings to perform some operations.  Recently it's been
>> throwing exceptions when trying to make ceph connections, but I can't
>> figure out what is really causing the issue here.
>>
>> Running ceph 10.2.5 with latest python-ceph packages.
>
>
> This looks like http://tracker.ceph.com/issues/14115 - in your case
> likely due to a bug in librados.
>
>> Anyone seen this sort of thing before or have any idea how to prevent it?
>
>
> You might be able to avoid it by using a single Rados instance per
> process, to avoid re-initializing crypto_context within librados.
>
> If you have a short script that reproduces it, or can run under
> valgrind's memcheck tool we can see if your case would be fixed by
>
> https://github.com/ceph/ceph/pull/12624
>
> or if there are other leaks to fix.
>
> Josh
>
>
>> The following error appears in the Apache error logs:
>>
>> common/ceph_crypto.cc: In function 'void ceph::crypto::init(CephContext*)'
>> thread 7fb1ca8c4700 time 2017-01-03 16:25:31.334606
>> common/ceph_crypto.cc: 77: FAILED assert(crypto_context != __null)
>>  ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>>  1: (()+0x169b7b) [0x7fb1b1b82b7b]
>>  2: (()+0x1b88c0) [0x7fb1b1bd18c0]
>>  3: (()+0x184615) [0x7fb1b1b9d615]
>>  4: (()+0x181800) [0x7fb1b1b9a800]
>>  5: (()+0x95f2d) [0x7fb1b1aaef2d]
>>  6: (rados_connect()+0x1c) [0x7fb1b1a7dadc]
>
>

