* ceph python crash dumps
@ 2017-01-03 21:39 Wyllys Ingersoll
2017-01-04 0:22 ` Josh Durgin
From: Wyllys Ingersoll @ 2017-01-03 21:39 UTC (permalink / raw)
To: Ceph Development
I have a Python-based WSGI application running inside Apache that uses the
Ceph Python bindings to perform some operations. Recently it's been
throwing exceptions when trying to make Ceph connections, but I can't
figure out what is really causing the issue.
Running Ceph 10.2.5 with the latest python-ceph packages.
Has anyone seen this sort of thing before, or have any idea how to prevent it?
The following error appears in the Apache error logs:
common/ceph_crypto.cc: In function 'void ceph::crypto::init(CephContext*)'
thread 7fb1ca8c4700 time 2017-01-03 16:25:31.334606
common/ceph_crypto.cc: 77: FAILED assert(crypto_context != __null)
ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
1: (()+0x169b7b) [0x7fb1b1b82b7b]
2: (()+0x1b88c0) [0x7fb1b1bd18c0]
3: (()+0x184615) [0x7fb1b1b9d615]
4: (()+0x181800) [0x7fb1b1b9a800]
5: (()+0x95f2d) [0x7fb1b1aaef2d]
6: (rados_connect()+0x1c) [0x7fb1b1a7dadc]
7: (()+0x29070) [0x7fb1bb1a4070]
8: (PyEval_EvalFrameEx()+0x4d4e) [0x7fb1d9fb324e]
9: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
10: (PyEval_EvalFrameEx()+0x48d8) [0x7fb1d9fb2dd8]
11: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
12: (PyEval_EvalFrameEx()+0x48d8) [0x7fb1d9fb2dd8]
13: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
14: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
15: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
16: (()+0x1c37a5) [0x7fb1d9fe97a5]
17: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
18: (()+0xbb7bd) [0x7fb1d9ee17bd]
19: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
20: (()+0x13467f) [0x7fb1d9f5a67f]
21: (()+0x13268f) [0x7fb1d9f5868f]
22: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
23: (PyEval_EvalFrameEx()+0x2316) [0x7fb1d9fb0816]
24: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
25: (()+0x1c37a5) [0x7fb1d9fe97a5]
26: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
27: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
28: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
29: (()+0x1c37a5) [0x7fb1d9fe97a5]
30: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
31: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
32: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
33: (()+0x1c37a5) [0x7fb1d9fe97a5]
34: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
35: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
36: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
37: (()+0x1c37a5) [0x7fb1d9fe97a5]
38: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
39: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
40: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
41: (()+0x1c37a5) [0x7fb1d9fe97a5]
42: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
43: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
44: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
45: (()+0x1c37a5) [0x7fb1d9fe97a5]
46: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
47: (PyEval_EvalFrameEx()+0xeb1) [0x7fb1d9faf3b1]
48: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
49: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
50: (PyEval_EvalFrameEx()+0x4b59) [0x7fb1d9fb3059]
51: (PyEval_EvalCodeEx()+0x80d) [0x7fb1d9fb454d]
52: (()+0x1c36d0) [0x7fb1d9fe96d0]
53: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
54: (()+0xbb7bd) [0x7fb1d9ee17bd]
55: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
56: (()+0x1347e5) [0x7fb1d9f5a7e5]
57: (PyObject_Call()+0x43) [0x7fb1d9f55d43]
58: (PyEval_CallObjectWithKeywords()+0x47) [0x7fb1d9fce577]
59: (()+0x17644) [0x7fb1da3a1644]
60: (()+0x1dba8) [0x7fb1da3a7ba8]
61: (()+0x8184) [0x7fb1de916184]
62: (clone()+0x6d) [0x7fb1de64337d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
to interpret this.
* Re: ceph python crash dumps
2017-01-03 21:39 ceph python crash dumps Wyllys Ingersoll
@ 2017-01-04 0:22 ` Josh Durgin
2017-01-04 15:20 ` Wyllys Ingersoll
From: Josh Durgin @ 2017-01-04 0:22 UTC (permalink / raw)
To: Wyllys Ingersoll, Ceph Development
On 01/03/2017 01:39 PM, Wyllys Ingersoll wrote:
> I have a Python-based WSGI application running inside Apache that uses the
> Ceph Python bindings to perform some operations. Recently it's been
> throwing exceptions when trying to make Ceph connections, but I can't
> figure out what is really causing the issue.
>
> Running ceph 10.2.5 with latest python-ceph packages.
This looks like http://tracker.ceph.com/issues/14115 - in your case
likely due to a bug in librados.
> Anyone seen this sort of thing before or have any idea how to prevent it?
You might be able to avoid it by using a single Rados instance per
process, to avoid re-initializing crypto_context within librados.
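One way to read the "single Rados instance per process" suggestion is a lazily created, lock-protected handle that every request thread shares. The following is only a sketch under assumptions: `get_cluster`, `_default_factory`, and the `/etc/ceph/ceph.conf` path are illustrative names not taken from this thread, and the PID check (so that a forked worker builds its own handle) is an added precaution rather than something confirmed here. The only `rados` binding calls used are `rados.Rados(conffile=...)` and `connect()`.

```python
import os
import threading

_lock = threading.Lock()
_cluster = None      # the one shared handle for this process
_owner_pid = None    # PID of the process that created it


def _default_factory():
    # Imported lazily so this module still loads where python-rados is absent.
    import rados
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")  # assumed path
    cluster.connect()
    return cluster


def get_cluster(factory=_default_factory):
    """Return the shared cluster handle, creating it on first use.

    `factory` is a hypothetical hook so the handle can be swapped out in tests.
    """
    global _cluster, _owner_pid
    with _lock:
        # Create the handle once per process; a forked child (different PID)
        # gets a fresh one instead of reusing the parent's.
        if _cluster is None or _owner_pid != os.getpid():
            _cluster = factory()
            _owner_pid = os.getpid()
        return _cluster
```

Request handlers would then call `get_cluster()` instead of constructing a new `rados.Rados` object per request, so `rados_connect()` runs at most once in each Apache worker process.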
If you have a short script that reproduces it, or can run it under
valgrind's memcheck tool, we can see whether your case would be fixed by
https://github.com/ceph/ceph/pull/12624
or if there are other leaks to fix.
Josh
> The following error appears in the Apache error logs:
>
> common/ceph_crypto.cc: In function 'void ceph::crypto::init(CephContext*)'
> thread 7fb1ca8c4700 time 2017-01-03 16:25:31.334606
> common/ceph_crypto.cc: 77: FAILED assert(crypto_context != __null)
> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
> 1: (()+0x169b7b) [0x7fb1b1b82b7b]
> 2: (()+0x1b88c0) [0x7fb1b1bd18c0]
> 3: (()+0x184615) [0x7fb1b1b9d615]
> 4: (()+0x181800) [0x7fb1b1b9a800]
> 5: (()+0x95f2d) [0x7fb1b1aaef2d]
> 6: (rados_connect()+0x1c) [0x7fb1b1a7dadc]
* Re: ceph python crash dumps
2017-01-04 0:22 ` Josh Durgin
@ 2017-01-04 15:20 ` Wyllys Ingersoll
From: Wyllys Ingersoll @ 2017-01-04 15:20 UTC (permalink / raw)
To: Josh Durgin; +Cc: Ceph Development
I'm pretty certain that is the same bug that we are hitting;
unfortunately, I don't have a simple script to recreate it. It happens
as part of a WSGI application in Apache running with 30 threads and
multiple processes. If I tune it down to fewer threads and processes,
it becomes harder to trigger.
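A standalone stress harness along the following lines might approximate the load described above: many threads repeatedly opening and closing cluster handles. This is a hypothetical sketch, not a confirmed reproducer: `churn`, `stress`, and `make_rados` are made-up names, the conffile path is assumed, and nothing in this thread establishes that this pattern triggers the assert. Note also that a failed assert inside librados aborts the whole process, so it would not show up in the returned error list.

```python
import threading


def churn(make_cluster, iterations):
    """Open and close a fresh cluster handle `iterations` times."""
    for _ in range(iterations):
        cluster = make_cluster()
        cluster.connect()
        cluster.shutdown()


def stress(make_cluster, threads=30, iterations=100):
    """Run churn() in `threads` threads; return exceptions raised by workers."""
    errors = []

    def worker():
        try:
            churn(make_cluster, iterations)
        except Exception as exc:  # collect Python-level failures only
            errors.append(exc)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return errors


def make_rados():
    # Imported lazily so the harness loads where python-rados is absent.
    import rados
    return rados.Rados(conffile="/etc/ceph/ceph.conf")  # assumed path
```

Calling `stress(make_rados)` from a small script, possibly under valgrind's memcheck tool, would exercise the connect path with 30 threads in one process; running several copies at once would approximate the multi-process case.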
If this fix is accepted and integrated, please backport it to the
next Jewel update (10.2.6).
-Wyllys Ingersoll
Keeper Technology, LLC