* deadlocks in rbd unmap and map
@ 2016-07-08 1:28 Patrick McLean
2016-07-08 11:40 ` Ilya Dryomov
0 siblings, 1 reply; 15+ messages in thread
From: Patrick McLean @ 2016-07-08 1:28 UTC (permalink / raw)
To: ceph-devel
This is on linus git master as of 2016/07/01
These appear to be two separate deadlocks, one on a map operation,
and one on an unmap operation. We can reproduce these pretty
regularly, but it seems like there is some sort of race condition, as
it happens nowhere near every time.
We are currently working to reproduce this on a kernel with lockdep
enabled and with better debugging information. Please let me know
if I can provide any more information.
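Since the stuck tasks below sit in uninterruptible (D) sleep, the kernel's hung-task detector (CONFIG_DETECT_HUNG_TASK) can report such hangs automatically. A minimal probe, assuming the standard sysctl paths; the 120-second timeout in the comments is an illustrative value, not something taken from this report:

```shell
#!/bin/sh
# Check whether the hung-task detector is available; the sysctl writes
# (commented out) need root and are only a sketch of how to tune it.
f=/proc/sys/kernel/hung_task_timeout_secs
if [ -r "$f" ]; then
    echo "hung-task detector present, timeout=$(cat "$f")s"
    # sysctl -w kernel.hung_task_timeout_secs=120  # warn after 120s in D state
    # sysctl -w kernel.hung_task_warnings=-1       # do not limit the number of reports
else
    echo "hung-task detector not enabled in this kernel"
fi
```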
# grep -C 40 rbd /proc/*/stack
/proc/14109/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
/proc/14109/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
/proc/14109/stack:[<ffffffffa01561ee>] rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
/proc/14109/stack:[<ffffffffa0156aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
/proc/14109/stack:[<ffffffffa0157450>] rbd_watch_cb+0x0/0xa0 [rbd]
/proc/14109/stack:[<ffffffffa0157586>] rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
/proc/14109/stack:[<ffffffffa015640e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
/proc/14109/stack:[<ffffffffa015828f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
/proc/14109/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
/proc/14109/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
/proc/14109/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
/proc/14109/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
/proc/14109/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
/proc/14109/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
/proc/14109/stack-[<ffffffffffffffff>] 0xffffffffffffffff
--
/proc/29744/stack-[<ffffffff813c7c63>] call_rwsem_down_write_failed+0x13/0x20
/proc/29744/stack:[<ffffffffa01572dd>] rbd_dev_refresh+0x1d/0xf0 [rbd]
/proc/29744/stack:[<ffffffffa0157413>] rbd_watch_errcb+0x33/0x70 [rbd]
/proc/29744/stack-[<ffffffffa0126a2e>] do_watch_error+0x2e/0x40 [libceph]
/proc/29744/stack-[<ffffffff8111d935>] process_one_work+0x145/0x3c0
/proc/29744/stack-[<ffffffff8111dbfa>] worker_thread+0x4a/0x470
/proc/29744/stack-[<ffffffff8111dbb0>] worker_thread+0x0/0x470
/proc/29744/stack-[<ffffffff81122e4d>] kthread+0xbd/0xe0
/proc/29744/stack-[<ffffffff8181717f>] ret_from_fork+0x1f/0x40
/proc/29744/stack-[<ffffffff81122d90>] kthread+0x0/0xe0
/proc/29744/stack-[<ffffffffffffffff>] 0xffffffffffffffff
--
/proc/3426/stack-[<ffffffff8115ea31>] try_to_del_timer_sync+0x41/0x60
/proc/3426/stack-[<ffffffff8115ea94>] del_timer_sync+0x44/0x50
/proc/3426/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
/proc/3426/stack-[<ffffffff8111b35f>] flush_workqueue+0x12f/0x540
/proc/3426/stack:[<ffffffffa015376b>] do_rbd_remove.isra.25+0xfb/0x190 [rbd]
/proc/3426/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
/proc/3426/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
/proc/3426/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
/proc/3426/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
/proc/3426/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
/proc/3426/stack-[<ffffffffffffffff>] 0xffffffffffffffff
# ps aux | egrep '(14109|29744|3426)'
root 3426 0.0 0.0 181256 10488 ? Dl Jul06 0:00 rbd unmap /dev/rbd0
root 14109 0.0 0.0 246704 10228 ? Sl Jul03 0:01 rbd map --pool XXXX XXXXXXX
root 29744 0.0 0.0 0 0 ? D Jul05 0:00 [kworker/u16:2]
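All three PIDs above are in, or blocked behind, uninterruptible sleep, which is what the D in the STAT column means. A throwaway sketch for pulling D-state tasks out of "ps aux" output, demonstrated on sample lines modeled on this report rather than on live output:

```shell
#!/bin/sh
# Print the PID and command of tasks whose state starts with D.
# In "ps aux" output, PID is field 2 and STAT is field 8.
sample='root      3426  0.0  0.0 181256 10488 ?  Dl  Jul06  0:00 rbd unmap /dev/rbd0
root     14109  0.0  0.0 246704 10228 ?  Sl  Jul03  0:01 rbd map --pool pool img
root     29744  0.0  0.0      0     0 ?  D   Jul05  0:00 [kworker/u16:2]'
printf '%s\n' "$sample" |
    awk '$8 ~ /^D/ {cmd = $11; for (i = 12; i <= NF; i++) cmd = cmd " " $i; print $2, cmd}'
# On a live system: ps aux | awk '$8 ~ /^D/'
```

On the sample data this selects 3426 (the rbd unmap) and 29744 (the kworker) but not 14109, whose state is Sl.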
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-08 1:28 deadlocks in rbd unmap and map Patrick McLean
@ 2016-07-08 11:40 ` Ilya Dryomov
2016-07-08 17:51 ` Patrick McLean
0 siblings, 1 reply; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-08 11:40 UTC (permalink / raw)
To: Patrick McLean; +Cc: Ceph Development
On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
> This is on linus git master as of 2016/07/01
>
> These appear to be two separate deadlocks, one on a map operation,
> and one on an unmap operation. We can reproduce these pretty
> regularly, but it seems like there is some sort of race condition, as
> it happens nowhere near every time.
>
> We are currently working to reproduce this on a kernel with lockdep
> enabled and with better debugging information. Please let me know
> if I can provide any more information.
I can see what happened from the traces, so no need to bother with
lockdep. Figuring out how and why is trickier...
It's a single deadlock between "rbd map" and a kworker thread, a later
"rbd unmap" is just a victim.
Are you mapping the same image more than once?
Thanks,
Ilya
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-08 11:40 ` Ilya Dryomov
@ 2016-07-08 17:51 ` Patrick McLean
2016-07-08 18:46 ` Ilya Dryomov
0 siblings, 1 reply; 15+ messages in thread
From: Patrick McLean @ 2016-07-08 17:51 UTC (permalink / raw)
To: Ilya Dryomov; +Cc: Ceph Development
On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>> This is on linus git master as of 2016/07/01
>>
>> These appear to be two separate deadlocks, one on a map operation,
>> and one on an unmap operation. We can reproduce these pretty
>> regularly, but it seems like there is some sort of race condition, as
>> it happens nowhere near every time.
>>
>
> It's a single deadlock between "rbd map" and a kworker thread, a later
> "rbd unmap" is just a victim.
>
> Are you mapping the same image more than once?
>
We shouldn't be, there is a higher-level locking system that is
supposed to prevent that.
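One way to double-check from the client side is to enumerate the kernel's rbd devices in sysfs. A sketch assuming the standard /sys/bus/rbd layout; any duplicated pool/name line would mean the same image is mapped twice:

```shell
#!/bin/sh
# List mapped rbd images as pool/name and print only duplicated pairs.
# Prints nothing when no image is mapped more than once (or none is mapped).
for d in /sys/bus/rbd/devices/*; do
    [ -e "$d" ] || continue    # glob did not match: no rbd devices present
    echo "$(cat "$d/pool")/$(cat "$d/name")"
done | sort | uniq -d
```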
Thanks,
Patrick
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-08 17:51 ` Patrick McLean
@ 2016-07-08 18:46 ` Ilya Dryomov
2016-07-08 20:30 ` Patrick McLean
0 siblings, 1 reply; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-08 18:46 UTC (permalink / raw)
To: Patrick McLean; +Cc: Ceph Development
On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>> This is on linus git master as of 2016/07/01
>>>
>>> These appear to be two separate deadlocks, one on a map operation,
>>> and one on an unmap operation. We can reproduce these pretty
>>> regularly, but it seems like there is some sort of race condition, as
>>> it happens nowhere near every time.
>>>
>
>>
>> It's a single deadlock between "rbd map" and a kworker thread, a later
>> "rbd unmap" is just a victim.
>>
>> Are you mapping the same image more than once?
>>
>
> We shouldn't be, there is a higher-level locking system that is
> supposed to prevent that.
It's actually allowed, I'm just trying to get an idea of what was going
on.
I spoke too soon. The trace of pid 14109 is inconsistent - the entries
in the middle don't make sense. Do you have a different set of traces
or are they all the same?
Were there other images mapped at the time 14109 was exec'ed? Other
concurrent "rbd map" processes?
What was/is the state of the cluster? Can you provide the output of
ceph -s? Any stuck PGs?
Thanks,
Ilya
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-08 18:46 ` Ilya Dryomov
@ 2016-07-08 20:30 ` Patrick McLean
2016-07-09 14:30 ` Ilya Dryomov
0 siblings, 1 reply; 15+ messages in thread
From: Patrick McLean @ 2016-07-08 20:30 UTC (permalink / raw)
To: Ilya Dryomov
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Fri, Jul 8, 2016 at 11:46 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
> On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>> This is on linus git master as of 2016/07/01
>>>>
>>>> These appear to be two separate deadlocks, one on a map operation,
>>>> and one on an unmap operation. We can reproduce these pretty
>>>> regularly, but it seems like there is some sort of race condition, as
>>>> it happens nowhere near every time.
>>>>
>>
>>>
>>> It's a single deadlock between "rbd map" and a kworker thread, a later
>>> "rbd unmap" is just a victim.
>>>
>>> Are you mapping the same image more than once?
>>>
>>
>> We shouldn't be, there is a higher-level locking system that is
>> supposed to prevent that.
>
> It's actually allowed, I'm just trying to get an idea of what was going
> on.
>
> I spoke too soon. The trace of pid 14109 is inconsistent - the entries
> in the middle don't make sense. Do you have a different set of traces
> or are they all the same?
Here is another backtrace. It seems that once it happens to a
particular image, it always does the same thing. We have not
figured out what triggers the first hang.
# grep -C 40 rbd /proc/*/stack
/proc/5295/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
/proc/5295/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
/proc/5295/stack:[<ffffffffa01621ee>] rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
/proc/5295/stack:[<ffffffffa0162aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
/proc/5295/stack:[<ffffffffa0163450>] rbd_watch_cb+0x0/0xa0 [rbd]
/proc/5295/stack:[<ffffffffa0163586>] rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
/proc/5295/stack:[<ffffffffa016240e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
/proc/5295/stack:[<ffffffffa016428f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
/proc/5295/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
/proc/5295/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
/proc/5295/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
/proc/5295/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
/proc/5295/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
/proc/5295/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
/proc/5295/stack-[<ffffffffffffffff>] 0xffffffffffffffff
# ps aux | grep 5295
root 5295 0.0 0.0 246704 10352 pts/4 Sl+ 20:20 0:00 rbd map --pool XXXXXXX XXXXXXXX
> Were there other images mapped at the time 14109 was exec'ed? Other
> concurrent "rbd map" processes?
There weren't when I got this particular backtrace; previous times
this happened may or may not have had other images mapped.
> What was/is the state of the cluster? Can you provide the output of
> ceph -s? Any stuck PGs?
Here is the output of "ceph -s", the cluster is healthy.
# ceph -s
cluster XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
health HEALTH_OK
monmap e1: 3 mons at {cephmon1.XXXXXXXXXXXXXXXXXXXXX=XXX.XXX.XXX.XXX:6789/0,cephmon1.XXXXXXXXXXXXXXXXXXXXX=XXX.XXX.XXX.XXX:6789/0,cephmon1.XXXXXXXXXXXXXXXXXXXXX=XXX.XXX.XXX.XXX:6789/0}
election epoch 48, quorum 0,1,2 cephmon1.XXXXXXXXXXXXXXXXXXXXX,cephmon1.XXXXXXXXXXXXXXXXXXXXX,cephmon1.XXXXXXXXXXXXXXXXXXXXX
osdmap e40242: 48 osds: 48 up, 48 in
flags sortbitwise
pgmap v1904602: 4160 pgs, 3 pools, 273 GB data, 867 kobjects
4065 GB used, 170 TB / 174 TB avail
4160 active+clean
client io 9872 kB/s rd, 12650 op/s
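When driving load tests like this from scripts, the health line of captured "ceph -s" output can be checked mechanically. A sketch over a sample that mirrors the status above (the fsid is a placeholder):

```shell
#!/bin/sh
# Extract the health status word from captured "ceph -s" output.
status='    cluster 00000000-0000-0000-0000-000000000000
     health HEALTH_OK
     osdmap e40242: 48 osds: 48 up, 48 in'
health=$(printf '%s\n' "$status" | awk '$1 == "health" {print $2; exit}')
echo "cluster health: $health"
```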
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-08 20:30 ` Patrick McLean
@ 2016-07-09 14:30 ` Ilya Dryomov
2016-07-11 19:12 ` Victor Payno
2016-07-13 0:38 ` Patrick McLean
0 siblings, 2 replies; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-09 14:30 UTC (permalink / raw)
To: Patrick McLean
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Fri, Jul 8, 2016 at 10:30 PM, Patrick McLean <patrickm@gaikai.com> wrote:
> On Fri, Jul 8, 2016 at 11:46 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>> On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>> This is on linus git master as of 2016/07/01
>>>>>
>>>>> These appear to be two separate deadlocks, one on a map operation,
>>>>> and one on an unmap operation. We can reproduce these pretty
>>>>> regularly, but it seems like there is some sort of race condition, as
>>>>> it happens nowhere near every time.
>>>>>
>>>
>>>>
>>>> It's a single deadlock between "rbd map" and a kworker thread, a later
>>>> "rbd unmap" is just a victim.
>>>>
>>>> Are you mapping the same image more than once?
>>>>
>>>
>>> We shouldn't be, there is a higher-level locking system that is
>>> supposed to prevent that.
>>
>> It's actually allowed, I'm just trying to get an idea of what was going
>> on.
>>
>> I spoke too soon. The trace of pid 14109 is inconsistent - the entries
>> in the middle don't make sense. Do you have a different set of traces
>> or are they all the same?
>
> Here is another backtrace. It seems that once it happens to a
> particular image, it always does the same thing. We have not
> figured out what triggers the first hang.
>
> # grep -C 40 rbd /proc/*/stack
> /proc/5295/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
> /proc/5295/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
> /proc/5295/stack:[<ffffffffa01621ee>] rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
> /proc/5295/stack:[<ffffffffa0162aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
> /proc/5295/stack:[<ffffffffa0163450>] rbd_watch_cb+0x0/0xa0 [rbd]
> /proc/5295/stack:[<ffffffffa0163586>] rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
> /proc/5295/stack:[<ffffffffa016240e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
> /proc/5295/stack:[<ffffffffa016428f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
> /proc/5295/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
> /proc/5295/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
> /proc/5295/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
> /proc/5295/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
> /proc/5295/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
> /proc/5295/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> /proc/5295/stack-[<ffffffffffffffff>] 0xffffffffffffffff
>
> # ps aux | grep 5295
> root 5295 0.0 0.0 246704 10352 pts/4 Sl+ 20:20 0:00 rbd map --pool XXXXXXX XXXXXXXX
Anything in dmesg? It looks like rbd map is waiting for a reply to one
of the header read-in requests and is never woken up. rbd map holds
a semaphore, blocking the kworker, which blocks rbd unmap.
Can you set "debug ms = 1" on all OSDs and try to reproduce?
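For reference, "debug ms = 1" can be injected at runtime (on Jewel-era releases, something like "ceph tell osd.* injectargs '--debug-ms 1'") or set persistently; a minimal ceph.conf fragment of the usual form:

```
[osd]
    debug ms = 1
```

Either way, expect a large increase in OSD log volume while it is set.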
I'd also like to see the content of /sys/kernel/debug/ceph/*/osdc when
it happens again.
How long does it take to hit it on average?
Thanks,
Ilya
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-09 14:30 ` Ilya Dryomov
@ 2016-07-11 19:12 ` Victor Payno
2016-07-13 0:38 ` Patrick McLean
1 sibling, 0 replies; 15+ messages in thread
From: Victor Payno @ 2016-07-11 19:12 UTC (permalink / raw)
To: Ilya Dryomov
Cc: Patrick McLean, Ceph Development, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On the ceph clients we haven't seen any ceph/rbd-related dmesg output other than the following.
[28623.056936] libceph: mon0 10.9.228.50:6789 session established
[28623.060888] libceph: client773380 fsid e88d2684-47c1-5a64-a275-6e375d11b557
[28623.131952] rbd: rbd0: added with size 0x2540be400
[28624.040816] EXT4-fs (rbd0): mounted filesystem without journal. Opts: (null)
[26123.339863] libceph: osd1 10.9.228.101:6804 socket closed (con state OPEN)
We can enable "debug ms=1" on the OSD servers.
We'll get you the output next time we are able to reproduce it (this
weekend's load test failed to reproduce the error with kernel
4.7.0-rc6-vanilla-ams-3-00134-gcfae7e3-dirty).
It doesn't happen when we have a single instance of map/unmap. It
seems to happen 1 or more hours into the load test.
On Sat, Jul 9, 2016 at 7:30 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
> On Fri, Jul 8, 2016 at 10:30 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>> On Fri, Jul 8, 2016 at 11:46 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>> On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>>> This is on linus git master as of 2016/07/01
>>>>>>
>>>>>> These appear to be two separate deadlocks, one on a map operation,
>>>>>> and one on an unmap operation. We can reproduce these pretty
>>>>>> regularly, but it seems like there is some sort of race condition, as
>>>>>> it happens nowhere near every time.
>>>>>>
>>>>
>>>>>
>>>>> It's a single deadlock between "rbd map" and a kworker thread, a later
>>>>> "rbd unmap" is just a victim.
>>>>>
>>>>> Are you mapping the same image more than once?
>>>>>
>>>>
>>>> We shouldn't be, there is a higher-level locking system that is
>>>> supposed to prevent that.
>>>
>>> It's actually allowed, I'm just trying to get an idea of what was going
>>> on.
>>>
>>> I spoke too soon. The trace of pid 14109 is inconsistent - the entries
>>> in the middle don't make sense. Do you have a different set of traces
>>> or are they all the same?
>>
>> Here is another backtrace. It seems that once it happens to a
>> particular image, it always does the same thing. We have not
>> figured out what triggers the first hang.
>>
>> # grep -C 40 rbd /proc/*/stack
>> /proc/5295/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
>> /proc/5295/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
>> /proc/5295/stack:[<ffffffffa01621ee>] rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
>> /proc/5295/stack:[<ffffffffa0162aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
>> /proc/5295/stack:[<ffffffffa0163450>] rbd_watch_cb+0x0/0xa0 [rbd]
>> /proc/5295/stack:[<ffffffffa0163586>] rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
>> /proc/5295/stack:[<ffffffffa016240e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
>> /proc/5295/stack:[<ffffffffa016428f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
>> /proc/5295/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
>> /proc/5295/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
>> /proc/5295/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
>> /proc/5295/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
>> /proc/5295/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
>> /proc/5295/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
>> /proc/5295/stack-[<ffffffffffffffff>] 0xffffffffffffffff
>>
>> # ps aux | grep 5295
>> root 5295 0.0 0.0 246704 10352 pts/4 Sl+ 20:20 0:00 rbd map --pool XXXXXXX XXXXXXXX
>
> Anything in dmesg? It looks like rbd map is waiting for a reply to one
> of the header read-in requests and is never woken up. rbd map holds
> a semaphore, blocking the kworker, which blocks rbd unmap.
>
> Can you set "debug ms = 1" on all OSDs and try to reproduce?
> I'd also like to see the content of /sys/kernel/debug/ceph/*/osdc when
> it happens again.
>
> How long does it take to hit it on average?
>
> Thanks,
>
> Ilya
--
Victor Payno
ビクター·ペイン
Sr. Release Engineer
シニアリリースエンジニア
Gaikai, a Sony Computer Entertainment Company ∆○×□
ガイカイ、ソニー・コンピュータエンタテインメント傘下会社
65 Enterprise
Aliso Viejo, CA 92656 USA
Web: www.gaikai.com
Email: vpayno@gaikai.com
Phone: (949) 330-6850
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-09 14:30 ` Ilya Dryomov
2016-07-11 19:12 ` Victor Payno
@ 2016-07-13 0:38 ` Patrick McLean
2016-07-13 7:38 ` Ilya Dryomov
1 sibling, 1 reply; 15+ messages in thread
From: Patrick McLean @ 2016-07-13 0:38 UTC (permalink / raw)
To: Ilya Dryomov
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Sat, Jul 9, 2016 at 7:30 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
> On Fri, Jul 8, 2016 at 10:30 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>> On Fri, Jul 8, 2016 at 11:46 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>> On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>>> This is on linus git master as of 2016/07/01
>>>>>>
>>>>>> These appear to be two separate deadlocks, one on a map operation,
>>>>>> and one on an unmap operation. We can reproduce these pretty
>>>>>> regularly, but it seems like there is some sort of race condition, as
>>>>>> it happens nowhere near every time.
>>>>>>
>>>>
>>>>>
>>>>> It's a single deadlock between "rbd map" and a kworker thread, a later
>>>>> "rbd unmap" is just a victim.
>>>>>
>>>>> Are you mapping the same image more than once?
>>>>>
>>>>
>>>> We shouldn't be, there is a higher-level locking system that is
>>>> supposed to prevent that.
>>>
>>> It's actually allowed, I'm just trying to get an idea of what was going
>>> on.
>>>
>>> I spoke too soon. The trace of pid 14109 is inconsistent - the entries
>>> in the middle don't make sense. Do you have a different set of traces
>>> or are they all the same?
>>
>> Here is another backtrace. It seems that once it happens to a
>> particular image, it always does the same thing. We have not
>> figured out what triggers the first hang.
>>
>> # grep -C 40 rbd /proc/*/stack
>> /proc/5295/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
>> /proc/5295/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
>> /proc/5295/stack:[<ffffffffa01621ee>] rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
>> /proc/5295/stack:[<ffffffffa0162aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
>> /proc/5295/stack:[<ffffffffa0163450>] rbd_watch_cb+0x0/0xa0 [rbd]
>> /proc/5295/stack:[<ffffffffa0163586>] rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
>> /proc/5295/stack:[<ffffffffa016240e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
>> /proc/5295/stack:[<ffffffffa016428f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
>> /proc/5295/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
>> /proc/5295/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
>> /proc/5295/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
>> /proc/5295/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
>> /proc/5295/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
>> /proc/5295/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
>> /proc/5295/stack-[<ffffffffffffffff>] 0xffffffffffffffff
>>
>> # ps aux | grep 5295
>> root 5295 0.0 0.0 246704 10352 pts/4 Sl+ 20:20 0:00 rbd map --pool XXXXXXX XXXXXXXX
>
> Anything in dmesg? It looks like rbd map is waiting for a reply to one
> of the header read-in requests and is never woken up. rbd map holds
> a semaphore, blocking the kworker, which blocks rbd unmap.
>
> Can you set "debug ms = 1" on all OSDs and try to reproduce?
> I'd also like to see the content of /sys/kernel/debug/ceph/*/osdc when
> it happens again.
>
Here is the contents of the osdc debug file on a machine that had been
in that state for 3 days.
# cat /sys/kernel/debug/ceph/*/osdc
REQUESTS 1 homeless 0
7 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x400011 1 0'0 call
LINGER REQUESTS
1 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x24 0 WC/0
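In these dumps, everything between the REQUESTS header and the LINGER REQUESTS header is an in-flight request, keyed by tid. A throwaway awk sketch for listing them; the field positions are assumed from the dump above, and on a live machine the input would be /sys/kernel/debug/ceph/*/osdc:

```shell
#!/bin/sh
# Count non-linger requests in an osdc dump; each such line starts with a tid.
awk '/^REQUESTS/ {inreq = 1; next}
     /^LINGER/   {inreq = 0}
     inreq       {n++; print "in-flight tid " $1 " on " $2 " (" $6 ")"}
     END         {print n + 0 " in-flight request(s)"}' <<'EOF'
REQUESTS 1 homeless 0
7 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x400011 1 0'0 call
LINGER REQUESTS
1 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x24 0 WC/0
EOF
```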
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-13 0:38 ` Patrick McLean
@ 2016-07-13 7:38 ` Ilya Dryomov
[not found] ` <CAE2NFgUxwvS_LnsoNp+zTK+4e+DSF24i==sPW2ARcZ9LJn1Otg@mail.gmail.com>
2016-07-26 11:27 ` Ilya Dryomov
0 siblings, 2 replies; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-13 7:38 UTC (permalink / raw)
To: Patrick McLean
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Wed, Jul 13, 2016 at 2:38 AM, Patrick McLean <patrickm@gaikai.com> wrote:
> On Sat, Jul 9, 2016 at 7:30 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>> On Fri, Jul 8, 2016 at 10:30 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>> On Fri, Jul 8, 2016 at 11:46 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>> On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>>>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>>>> This is on linus git master as of 2016/07/01
>>>>>>>
>>>>>>> These appear to be two separate deadlocks, one on a map operation,
>>>>>>> and one on an unmap operation. We can reproduce these pretty
>>>>>>> regularly, but it seems like there is some sort of race condition, as
>>>>>>> it happens nowhere near every time.
>>>>>>>
>>>>>
>>>>>>
>>>>>> It's a single deadlock between "rbd map" and a kworker thread, a later
>>>>>> "rbd unmap" is just a victim.
>>>>>>
>>>>>> Are you mapping the same image more than once?
>>>>>>
>>>>>
>>>>> We shouldn't be, there is a higher-level locking system that is
>>>>> supposed to prevent that.
>>>>
>>>> It's actually allowed, I'm just trying to get an idea of what was going
>>>> on.
>>>>
>>>> I spoke too soon. The trace of pid 14109 is inconsistent - the entries
>>>> in the middle don't make sense. Do you have a different set of traces
>>>> or are they all the same?
>>>
>>> Here is another backtrace. It seems that once it happens to a
>>> particular image, it always does the same thing. We have not
>>> figured out what triggers the first hang.
>>>
>>> # grep -C 40 rbd /proc/*/stack
>>> /proc/5295/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
>>> /proc/5295/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
>>> /proc/5295/stack:[<ffffffffa01621ee>] rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
>>> /proc/5295/stack:[<ffffffffa0162aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
>>> /proc/5295/stack:[<ffffffffa0163450>] rbd_watch_cb+0x0/0xa0 [rbd]
>>> /proc/5295/stack:[<ffffffffa0163586>] rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
>>> /proc/5295/stack:[<ffffffffa016240e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
>>> /proc/5295/stack:[<ffffffffa016428f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
>>> /proc/5295/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
>>> /proc/5295/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
>>> /proc/5295/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
>>> /proc/5295/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
>>> /proc/5295/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
>>> /proc/5295/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
>>> /proc/5295/stack-[<ffffffffffffffff>] 0xffffffffffffffff
>>>
>>> # ps aux | grep 5295
>>> root 5295 0.0 0.0 246704 10352 pts/4 Sl+ 20:20 0:00 rbd map --pool XXXXXXX XXXXXXXX
>>
>> Anything in dmesg? It looks like rbd map is waiting for a reply to one
>> of the header read-in requests and is never woken up. rbd map holds
>> a semaphore, blocking the kworker, which blocks rbd unmap.
>>
>> Can you set "debug ms = 1" on all OSDs and try to reproduce?
>> I'd also like to see the content of /sys/kernel/debug/ceph/*/osdc when
>> it happens again.
>>
>
> Here is the contents of the osdc debug file on a machine that had been
> in that state for 3 days.
>
> # cat /sys/kernel/debug/ceph/*/osdc
> REQUESTS 1 homeless 0
> 7 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x400011 1 0'0 call
> LINGER REQUESTS
> 1 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x24 0 WC/0
OK, so it's waiting for a reply. Do you have a debug ms = 1 log for
osd38?
Thanks,
Ilya
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
[not found] ` <CAE2NFgUxwvS_LnsoNp+zTK+4e+DSF24i==sPW2ARcZ9LJn1Otg@mail.gmail.com>
@ 2016-07-20 13:14 ` Ilya Dryomov
0 siblings, 0 replies; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-20 13:14 UTC (permalink / raw)
To: Patrick McLean
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Wed, Jul 20, 2016 at 1:23 AM, Patrick McLean <patrickm@gaikai.com> wrote:
> We got this on our rbd clients this morning. It is not actually a
> panic, but networking seems to have died on those boxes, and since they
> are netbooted, they were effectively dead.
>
> It looks like a crash in rbd_watch_cb:
>
> [68479.925931] Call Trace:
> [68479.928632] [<ffffffff81140f2e>] ? wq_worker_sleeping+0xe/0x90
> [68479.934793] [<ffffffff819738bc>] __schedule+0x50c/0xb90
> [68479.940349] [<ffffffff81445db5>] ? put_io_context_active+0xa5/0xc0
> [68479.946866] [<ffffffff81973f7c>] schedule+0x3c/0x90
> [68479.952077] [<ffffffff81126554>] do_exit+0x7b4/0xc60
> [68479.957373] [<ffffffff8109891c>] oops_end+0x9c/0xd0
> [68479.962579] [<ffffffff81098d8b>] die+0x4b/0x70
> [68479.967357] [<ffffffff81095d15>] do_general_protection+0xe5/0x1b0
> [68479.973780] [<ffffffff8197b988>] general_protection+0x28/0x30
> [68479.979857] [<ffffffffa01be161>] ? rbd_watch_cb+0x21/0x100 [rbd]
> [68479.986202] [<ffffffff81173f8f>] ? up_read+0x1f/0x40
> [68479.991506] [<ffffffffa00c3c09>] do_watch_notify+0x99/0x170 [libceph]
> [68479.998279] [<ffffffff8114003a>] process_one_work+0x1da/0x660
> [68480.004352] [<ffffffff8113ffac>] ? process_one_work+0x14c/0x660
> [68480.010601] [<ffffffff8114050e>] worker_thread+0x4e/0x490
> [68480.016326] [<ffffffff811404c0>] ? process_one_work+0x660/0x660
> [68480.022578] [<ffffffff811404c0>] ? process_one_work+0x660/0x660
> [68480.028823] [<ffffffff81146c51>] kthread+0x101/0x120
> [68480.034149] [<ffffffff81979baf>] ret_from_fork+0x1f/0x40
> [68480.039796] [<ffffffff81146b50>] ? kthread_create_on_node+0x250/0x250
The attached log starts with [68479.452369]. Do you have an earlier
chunk? I'm specifically looking for whether the rbd_watch_cb() splat
was the first one.
Any luck reproducing the hang with logging enabled?
Thanks,
Ilya
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-13 7:38 ` Ilya Dryomov
[not found] ` <CAE2NFgUxwvS_LnsoNp+zTK+4e+DSF24i==sPW2ARcZ9LJn1Otg@mail.gmail.com>
@ 2016-07-26 11:27 ` Ilya Dryomov
[not found] ` <CAE2NFgXj0PpX0uN6RGNuXquwkmBncr7KHANaSK6c8SAVTWUVLQ@mail.gmail.com>
1 sibling, 1 reply; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-26 11:27 UTC (permalink / raw)
To: Patrick McLean
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Wed, Jul 13, 2016 at 9:38 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
> On Wed, Jul 13, 2016 at 2:38 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>> On Sat, Jul 9, 2016 at 7:30 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>> On Fri, Jul 8, 2016 at 10:30 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>> On Fri, Jul 8, 2016 at 11:46 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>>> On Fri, Jul 8, 2016 at 7:51 PM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>>> On Fri, Jul 8, 2016 at 4:40 AM, Ilya Dryomov <idryomov@gmail.com> wrote:
>>>>>>> On Fri, Jul 8, 2016 at 3:28 AM, Patrick McLean <patrickm@gaikai.com> wrote:
>>>>>>>> This is on linus git master as of 2016/07/01
>>>>>>>>
>>>>>>>> These appear to be two separate deadlocks, one on a map operation,
>>>>>>>> and one on an unmap operation. We can reproduce these pretty
>>>>>>>> regularly, but it seems like there is some sort of race condition, as
>>>>>>>> it happens nowhere near every time.
>>>>>>>>
>>>>>>
>>>>>>>
>>>>>>> It's a single deadlock between "rbd map" and a kworker thread, a later
>>>>>>> "rbd unmap" is just a victim.
>>>>>>>
>>>>>>> Are you mapping the same image more than once?
>>>>>>>
>>>>>>
>>>>>> We shouldn't be, there is a higher-level locking system that is
>>>>>> supposed to prevent that.
>>>>>
>>>>> It's actually allowed, I'm just trying to get an idea of what was going
>>>>> on.
>>>>>
>>>>> I spoke too soon. The trace of pid 14109 is inconsistent - the entries
>>>>> in the middle don't make sense. Do you have a different set of traces
>>>>> or are they all the same?
>>>>
>>>> Here is another backtrace. It seems that once it happens to a
>>>> particular image, it always does the same thing. We have not
>>>> figured out what triggers the first hang.
>>>>
>>>> # grep -C 40 rbd /proc/*/stack
>>>> /proc/5295/stack-[<ffffffff8111c7c4>] __queue_work+0x144/0x420
>>>> /proc/5295/stack-[<ffffffff8112c1c0>] default_wake_function+0x0/0x10
>>>> /proc/5295/stack:[<ffffffffa01621ee>]
>>>> rbd_obj_method_sync.constprop.48+0x1be/0x290 [rbd]
>>>> /proc/5295/stack:[<ffffffffa0162aea>] rbd_dev_header_info+0x15a/0x930 [rbd]
>>>> /proc/5295/stack:[<ffffffffa0163450>] rbd_watch_cb+0x0/0xa0 [rbd]
>>>> /proc/5295/stack:[<ffffffffa0163586>]
>>>> rbd_dev_image_probe.part.42+0x96/0x910 [rbd]
>>>> /proc/5295/stack:[<ffffffffa016240e>] rbd_dev_image_id+0x14e/0x1b0 [rbd]
>>>> /proc/5295/stack:[<ffffffffa016428f>] do_rbd_add.isra.43+0x48f/0xbb0 [rbd]
>>>> /proc/5295/stack-[<ffffffff8123ba27>] __kmalloc+0x27/0x170
>>>> /proc/5295/stack-[<ffffffff812b7f3a>] kernfs_fop_write+0x10a/0x190
>>>> /proc/5295/stack-[<ffffffff8124dc63>] __vfs_write+0x23/0x120
>>>> /proc/5295/stack-[<ffffffff8124e8f3>] vfs_write+0xb3/0x1a0
>>>> /proc/5295/stack-[<ffffffff8124fbd2>] SyS_write+0x42/0xa0
>>>> /proc/5295/stack-[<ffffffff81816f72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
>>>> /proc/5295/stack-[<ffffffffffffffff>] 0xffffffffffffffff
>>>>
>>>> # ps aux | grep 5295
>>>> root 5295 0.0 0.0 246704 10352 pts/4 Sl+ 20:20 0:00 rbd
>>>> map --pool XXXXXXX XXXXXXXX
>>>
>>> Anything in dmesg? It looks like rbd map is waiting for a reply to one
>>> of the header read-in requests and is never woken up. rbd map holds
>>> a semaphore, blocking the kworker, which blocks rbd unmap.
>>>
>>> Can you set "debug ms = 1" on all OSDs and try to reproduce?
>>> I'd also like to see the content of /sys/kernel/debug/ceph/*/osdc when
>>> it happens again.
>>>
>>
>> Here is the contents of the osdc debug file on a machine that had been
>> in that state for 3 days.
>>
>> # cat /sys/kernel/debug/ceph/*/osdc
>> REQUESTS 1 homeless 0
>> 7 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38
>> rbd_header.bea4a16ebd6a9a 0x400011 1 0'0 call
>> LINGER REQUESTS
>> 1 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38
>> rbd_header.bea4a16ebd6a9a 0x24 0 WC/0
>
> OK, so it's waiting for a reply. Do you have a debug ms = 1 log for
> osd38?
I've filed http://tracker.ceph.com/issues/16630.
Thanks,
Ilya
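For readers following along, the osdc dumps in this thread can be split into fields with a short script like the following (a sketch: the column layout, tid / osd / pg / acting set / up set / object name, is inferred from the dumps themselves, not from a documented format):

```python
# Sketch: split a /sys/kernel/debug/ceph/*/osdc dump into fields.
# Column layout (tid, osd, pg, acting set, up set, object, ...) is
# inferred from the dumps in this thread; wrapped lines are skipped.

def parse_osdc(text):
    """Return (requests, lingers) as lists of dicts."""
    requests, lingers = [], []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("REQUESTS"):
            current = requests
            continue
        if line.startswith("LINGER"):
            current = lingers
            continue
        fields = line.split()
        if current is None or len(fields) < 6:
            continue  # blank line or a wrapped continuation line
        current.append({
            "tid": int(fields[0]),
            "osd": fields[1],
            "pg": fields[2],
            "object": fields[5],
        })
    return requests, lingers

sample = """\
REQUESTS 1 homeless 0
7 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x400011 1 0'0 call
LINGER REQUESTS
1 osd38 4.27a8d388 [38,16,47]/38 [38,16,47]/38 rbd_header.bea4a16ebd6a9a 0x24 0 WC/0
"""
reqs, lingers = parse_osdc(sample)
print(reqs[0]["osd"], reqs[0]["object"])  # → osd38 rbd_header.bea4a16ebd6a9a
```

In the dump above, the lone REQUESTS entry pinned on an `rbd_header.*` object with a `call` op is exactly the header read-in that never got its reply.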
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
[not found] ` <CAE2NFgXj0PpX0uN6RGNuXquwkmBncr7KHANaSK6c8SAVTWUVLQ@mail.gmail.com>
@ 2016-07-26 18:11 ` Ilya Dryomov
[not found] ` <CALXUmNTwDOKZm=S6mkA4=HzM8H=pHWSuX1VwJZg3dtTRd9e8Tw@mail.gmail.com>
0 siblings, 1 reply; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-26 18:11 UTC (permalink / raw)
To: Patrick McLean
Cc: Ceph Development, Victor Payno, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
On Tue, Jul 26, 2016 at 7:58 PM, Patrick McLean <patrickm@gaikai.com> wrote:
> Hi Ilya,
>
> We discovered this weekend that enabling lockdep in the kernel makes the
> issue go away. We are working on reproducing without lockdep, and isolating
> the issue in the OSD logs. We should have OSD debug logs this week.
I'm going to need the "cat /sys/kernel/debug/ceph/*/osdc" output, the
osd log for the osd from that output, and the output of "echo w" and
"echo t" to /proc/sysrq-trigger.
Thanks,
Ilya
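Gathering all of that in one go looks roughly like the following (a sketch: the debugfs path and sysrq behaviour are assumed from a stock kernel; run as root on the client, then pull the task dumps out of dmesg):

```shell
#!/bin/sh
# Sketch: collect the debug data requested above on the client.

collect_osdc() {
    # Print every in-flight-request dump under the ceph debugfs dir.
    dir=${1:-/sys/kernel/debug/ceph}
    for f in "$dir"/*/osdc; do
        [ -e "$f" ] || continue
        echo "=== $f ==="
        cat "$f"
    done
}

collect_osdc

# "echo w" dumps blocked tasks and "echo t" dumps all task states to
# the kernel log; both need CONFIG_MAGIC_SYSRQ and root, so they are
# skipped when /proc/sysrq-trigger is not writable.
if [ -w /proc/sysrq-trigger ]; then
    echo w > /proc/sysrq-trigger
    echo t > /proc/sysrq-trigger
fi
```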
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
[not found] ` <CALXUmNTwDOKZm=S6mkA4=HzM8H=pHWSuX1VwJZg3dtTRd9e8Tw@mail.gmail.com>
@ 2016-07-26 22:31 ` Victor Payno
[not found] ` <CALXUmNS0dtX5T3++8JgtoWoBHVGB-Bgh+BZpqYiARtMG3qJVCA@mail.gmail.com>
0 siblings, 1 reply; 15+ messages in thread
From: Victor Payno @ 2016-07-26 22:31 UTC (permalink / raw)
To: Ilya Dryomov
Cc: Patrick McLean, Ceph Development, Cliff Pajaro,
Roderick Colenbrander, Rae Yip, Frank Reed
We'll have to run the test again for the OSD log data. Forgot to make
sure that the ceph log partition wasn't full.
client: /sys/kernel/debug/ceph/e88d2684-47c1-5a64-a275-6e375d11b557.client1242818/osdc
REQUESTS 1 homeless 0
11475 osd2 37.a819967 [2]/2 [2]/2
rbd_header.d960431b30e2f 0x400019 3 0'0 call
LINGER REQUESTS
91 osd0 37.851f81e1 [0]/0 [0]/0
rbd_header.d94f26cf2eafd 0x24 0 WC/0
93 osd0 37.98ca7eab [0]/0 [0]/0
rbd_header.d94b96a21ce28 0x24 0 WC/0
106 osd0 37.9720d758 [0]/0 [0]/0
rbd_header.d94f53a7da731 0x24 0 WC/0
104 osd1 37.de8088a1 [1]/1 [1]/1
rbd_header.d94f52e0f9b4d 0x24 0 WC/0
105 osd1 37.db9af301 [1]/1 [1]/1
rbd_header.d94f1ed40c97 0x24 0 WC/0
14 osd2 37.a819967 [2]/2 [2]/2
rbd_header.d960431b30e2f 0x24 2 WC/0
96 osd2 37.8fb9befc [2]/2 [2]/2
rbd_header.d94f03028192c 0x24 0 WC/0
82 osd3 37.370c3798 [3]/3 [3]/3
rbd_header.d94da25e9605b 0x24 0 WC/0
87 osd4 37.9c510a15 [4]/4 [4]/4
rbd_header.d94f079de7a55 0x24 0 WC/0
85 osd5 37.832091ad [5]/5 [5]/5
rbd_header.d94f22d15792a 0x24 0 WC/0
94 osd5 37.344d5f3 [5]/5 [5]/5 rbd_header.d94f6b30f4a
0x24 0 WC/0
103 osd5 37.4cb8bb74 [5]/5 [5]/5
rbd_header.d94ef2bac496b 0x24 0 WC/0
77 osd6 37.7c480437 [6]/6 [6]/6
rbd_header.d94c057a9331d 0x24 1 WC/0
88 osd7 37.58634cdc [7]/7 [7]/7
rbd_header.d94da55fd8a34 0x24 1 WC/0
98 osd7 37.a61c68b [7]/7 [7]/7
rbd_header.d94ce7d11bf7f 0x24 0 WC/0
rbd image '3f370dbabff91bbb7ff23ae7a96e5cb414cac3408013cefed6d4b627b5eed9c7':
size 9536 MB in 2385 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.d960431b30e2f
format: 2
features: layering, striping
flags:
stripe unit: 4096 kB
stripe count: 1
osdmaptool: osdmap file '/tmp/osdmap'
object '3f370dbabff91bbb7ff23ae7a96e5cb414cac3408013cefed6d4b627b5eed9c7'
-> 37.c6 -> [7]
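The osdmaptool lines above come from mapping the object name through the pool's placement, roughly like this (a sketch: the `ceph osd getmap` step and pool id 37 are assumptions based on the output above, and both commands need a live cluster, so this script only prints them):

```shell
#!/bin/sh
# Sketch: reproduce the object -> PG -> OSD mapping shown above.
# Needs a live cluster; commands are printed rather than executed.
OBJ=3f370dbabff91bbb7ff23ae7a96e5cb414cac3408013cefed6d4b627b5eed9c7

# Grab the current osdmap from the monitors, then map the object name
# through pool 37's placement rules:
echo "ceph osd getmap -o /tmp/osdmap"
echo "osdmaptool /tmp/osdmap --test-map-object $OBJ --pool 37"
```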
--
Victor Payno
ビクター·ペイン
Sr. Release Engineer
シニアリリースエンジニア
Gaikai, a Sony Computer Entertainment Company ∆○×□
ガイカイ、ソニー・コンピュータエンタテインメント傘下会社
65 Enterprise
Aliso Viejo, CA 92656 USA
Web: www.gaikai.com
Email: vpayno@gaikai.com
Phone: (949) 330-6850
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
[not found] ` <CALXUmNSkR92CYWX4DU37RuiwHm-nubYxq0egQPpnBqN-7=VTwg@mail.gmail.com>
@ 2016-07-29 23:11 ` Victor Payno
2016-07-30 14:32 ` Ilya Dryomov
0 siblings, 1 reply; 15+ messages in thread
From: Victor Payno @ 2016-07-29 23:11 UTC (permalink / raw)
To: Cliff Pajaro
Cc: Ilya Dryomov, Patrick McLean, Ceph Development,
Roderick Colenbrander, Rae Yip, Frank Reed
Thanks for the info on the kernel's limit of 510 snapshots per rbd.
We're also wondering what the rbd metadata limits might be. Are the
metadata key and value size limitations listed anywhere?
We're planning on using key names with 64 characters (the same string
as the snapshot name) with a JSON string payload in the value field.
So on an rbd with 100 snapshots, we would also have 100
metadata key/value pairs. The value/data would probably be at least
100 characters per key.
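As a back-of-the-envelope check of that plan (the numbers come from the paragraph above; any per-pair omap overhead is deliberately not counted here):

```python
# Sketch: raw size of the proposed per-image metadata
# (100 pairs of 64-char keys and ~100-char JSON values).
KEY_LEN = 64      # key is the snapshot name
VALUE_LEN = 100   # "at least 100 characters" per value
NUM_PAIRS = 100   # one pair per snapshot

raw_bytes = NUM_PAIRS * (KEY_LEN + VALUE_LEN)
print(raw_bytes)  # → 16400, i.e. ~16 KB of raw key/value data per image
```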
On Wed, Jul 27, 2016 at 12:26 PM, Victor Payno <vpayno@gaikai.com> wrote:
> Yes, it has 697 snapshots. The overall average snapshot count per rbd is
> 355. The lowest number is 3 and the highest number is 708.
>
> Interestingly enough we can't map that rbd anymore.
>
> $ rbd info
> test-rbd/3f370dbabff91bbb7ff23ae7a96e5cb414cac3408013cefed6d4b627b5eed9c7-willnotmap
> rbd image
> '3f370dbabff91bbb7ff23ae7a96e5cb414cac3408013cefed6d4b627b5eed9c7-willnotmap':
> size 9536 MB in 2385 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.11e601379f78e7
> format: 2
> features: layering, striping
> flags:
> stripe unit: 4096 kB
> stripe count: 1
>
>
> When we try to map it, it hangs here:
>
> setsockopt(3, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0
> open("/sys/bus/rbd/add_single_major", O_WRONLY) = 4
> write(4, "192.168.2.63:6789,192.168.3.63:6789,"..., 147
>
>
> Other rbds map fine.
>
> The cluster health is HEALTH_OK.
>
>
>
>
> From Cliff:
>
> Sifting through the osd2's log for "d960431b30e2f" I see this:
>
> 2016-07-27 00:47:47.199252 7f1ac7dbb700 1 -- 10.9.228.101:6814/25892 <==
> client.1252427 10.9.72.23:0/273648291 13 ==== osd_op(client.1252427.0:19
> rbd_header.d960431b30e2f [call lock.lock] 37.a819967
> ondisk+write+known_if_redirected e107453) v6 ==== 210+0+51 (4053684278 0
> 659001106) 0x55c9ec362000 con 0x55c9f0101600
> 2016-07-27 00:47:47.199515 7f1ad3539700 1 -- 10.9.228.101:6814/25892 -->
> 10.9.72.23:0/273648291 -- osd_op_reply(19 rbd_header.d960431b30e2f [call
> lock.lock] v0'0 uv0 ondisk = -16 ((16) Device or resource busy)) v6 -- ?+0
> 0x55c9ee4b8b00 con 0x55c9f0101600
>
> On Tue, Jul 26, 2016 at 8:59 PM, Cliff Pajaro <cpajaro@gaikai.com> wrote:
>>
>> Sifting through the osd2's log for "d960431b30e2f" I see this:
>>
>>> 2016-07-27 00:47:47.199252 7f1ac7dbb700 1 -- 10.9.228.101:6814/25892 <==
>>> client.1252427 10.9.72.23:0/273648291 13 ==== osd_op(client.1252427.0:19
>>> rbd_header.d960431b30e2f [call lock.lock] 37.a819967
>>> ondisk+write+known_if_redirected e107453) v6 ==== 210+0+51 (4053684278 0
>>> 659001106) 0x55c9ec362000 con 0x55c9f0101600
>>> 2016-07-27 00:47:47.199515 7f1ad3539700 1 -- 10.9.228.101:6814/25892 -->
>>> 10.9.72.23:0/273648291 -- osd_op_reply(19 rbd_header.d960431b30e2f [call
>>> lock.lock] v0'0 uv0 ondisk = -16 ((16) Device or resource busy)) v6 -- ?+0
>>> 0x55c9ee4b8b00 con 0x55c9f0101600
>>
>>
>>
>> On Tue, Jul 26, 2016 at 6:48 PM, Victor Payno <vpayno@gaikai.com> wrote:
>>>
>>> REQUESTS 1 homeless 0
>>> 322452 osd2 37.a819967 [2]/2 [2]/2
>>> rbd_header.d960431b30e2f 0x400019 3 0'0 call
>>> LINGER REQUESTS
>>> 167 osd2 37.a819967 [2]/2 [2]/2
>>> rbd_header.d960431b30e2f 0x24 2 WC/0
>>>
>>>
>>> ceph-osd.2.log.gz has been attached/uploaded to here:
>>> http://tracker.ceph.com/issues/16630
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: deadlocks in rbd unmap and map
2016-07-29 23:11 ` Victor Payno
@ 2016-07-30 14:32 ` Ilya Dryomov
0 siblings, 0 replies; 15+ messages in thread
From: Ilya Dryomov @ 2016-07-30 14:32 UTC (permalink / raw)
To: Victor Payno
Cc: Cliff Pajaro, Patrick McLean, Ceph Development,
Roderick Colenbrander, Rae Yip, Frank Reed
On Sat, Jul 30, 2016 at 1:11 AM, Victor Payno <vpayno@gaikai.com> wrote:
> Thanks for the info on the kernel's limit of 510 snapshots per rbd.
>
> We're also wondering what the rbd metadata limits might be. Are the
> metadata key and value size limitations listed anywhere?
>
> We're planning on using key names with 64 characters (the same string
> as the snapshot name) with a JSON string payload in the value field.
>
> So on an rbd with 100 snapshots, we would also have 100
> metadata key/value pairs. The value/data would probably be at least
> 100 characters per key.
Those key/value numbers are reasonable. I don't think there are any
particular librbd-imposed limits: image metadata is stored as omap on
the header object, so anything that could be stored in omap is fine.
The kernel client doesn't care: unlike the snapshot info, image
metadata is never accessed.
Thanks,
Ilya
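In CLI terms, the plan maps onto `rbd image-meta`, and since the pairs live as omap on the header object they are also visible with `rados listomapvals` (a sketch: pool/image names here are hypothetical, and the commands need a live cluster, so they are only printed):

```shell
#!/bin/sh
# Sketch: store and inspect image metadata as discussed above.
# Needs a live cluster; commands are printed rather than executed.
POOL=test-rbd
IMAGE=myimage                                                      # hypothetical image name
KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef  # 64-char stand-in key

run() { echo "+ $*"; }   # change to: run() { "$@"; } on a real cluster

# Store one key/value pair; the key matches the snapshot name.
run rbd image-meta set "$POOL/$IMAGE" "$KEY" '{"state":"ok"}'
run rbd image-meta list "$POOL/$IMAGE"

# The header object is rbd_header.<id>, where <id> is the suffix of
# block_name_prefix from 'rbd info' (example id from this thread):
run rados -p "$POOL" listomapvals rbd_header.d960431b30e2f
```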
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread
Thread overview: 15+ messages
2016-07-08 1:28 deadlocks in rbd unmap and map Patrick McLean
2016-07-08 11:40 ` Ilya Dryomov
2016-07-08 17:51 ` Patrick McLean
2016-07-08 18:46 ` Ilya Dryomov
2016-07-08 20:30 ` Patrick McLean
2016-07-09 14:30 ` Ilya Dryomov
2016-07-11 19:12 ` Victor Payno
2016-07-13 0:38 ` Patrick McLean
2016-07-13 7:38 ` Ilya Dryomov
[not found] ` <CAE2NFgUxwvS_LnsoNp+zTK+4e+DSF24i==sPW2ARcZ9LJn1Otg@mail.gmail.com>
2016-07-20 13:14 ` Ilya Dryomov
2016-07-26 11:27 ` Ilya Dryomov
[not found] ` <CAE2NFgXj0PpX0uN6RGNuXquwkmBncr7KHANaSK6c8SAVTWUVLQ@mail.gmail.com>
2016-07-26 18:11 ` Ilya Dryomov
[not found] ` <CALXUmNTwDOKZm=S6mkA4=HzM8H=pHWSuX1VwJZg3dtTRd9e8Tw@mail.gmail.com>
2016-07-26 22:31 ` Victor Payno
[not found] ` <CALXUmNS0dtX5T3++8JgtoWoBHVGB-Bgh+BZpqYiARtMG3qJVCA@mail.gmail.com>
[not found] ` <CAFXAJ64a9KqF5A+f6BZsD_LtGUniqya+Wz0oMtcaLhxrnW1TFA@mail.gmail.com>
[not found] ` <CALXUmNSkR92CYWX4DU37RuiwHm-nubYxq0egQPpnBqN-7=VTwg@mail.gmail.com>
2016-07-29 23:11 ` Victor Payno
2016-07-30 14:32 ` Ilya Dryomov