All of lore.kernel.org
 help / color / mirror / Atom feed
* bcache causes RCU stalls/bcache_gc hogs CPU
@ 2015-04-14 14:52 Nikolay Borisov
  2015-04-14 20:03 ` Eric Wheeler
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2015-04-14 14:52 UTC (permalink / raw)
  To: linux-bcache

Hello list,


I'm currently testing bcache with the following setup:

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT

sda                   8:0    0   1.8T  0 disk
├─sda1                8:1    0     2M  0 part
├─sda2                8:2    0   191M  0 part
└─sda3                8:3    0   1.8T  0 part
   └─main-os (dm-0)  254:0    0   1.8T  0 lvm  /
sdb                   8:16   0 223.1G  0 disk
├─sdb1                8:17   0 188.2M  0 part /boot
└─sdb2                8:18   0 222.9G  0 part
   ├─main-ssd (dm-1) 254:1    0    40G  0 lvm
   │ └─bcache0       253:0    0   182G  0 disk /sdb
   └─main-db (dm-2)  254:2    0   182G  0 lvm
     └─bcache0       253:0    0   182G  0 disk /sdb

So a 40gig ssd (main-ssd, lvm2 volume) backed by a 180gig hdd (main-db, lvm 2 volume), using the writeback cache policy. Every other setting is at its default. I'm running the 4.0-rc6 (!CONFIG_PREEMPT). After running fio (using a 30gb file) with a mix of sequential and random i/o and I'm getting the following RCU warn:

======================================================
INFO: rcu_sched self-detected stall on CPU
         4: (2099 ticks this GP) idle=fcf/140000000000001/0 softirq=1031582/1031582 fqs=2100
INFO: rcu_sched detected stalls on CPUs/tasks:
         4: (2099 ticks this GP) idle=fcf/140000000000001/0 softirq=1031582/1031582 fqs=2100
         (detected by 16, t=2104 jiffies, g=2176431, c=2176430, q=3098)
Task dump for CPU 4:
bcache_gc	R  running task    12728 18115      2 0x00000008
  ffff880079e85720 fffffffffffffffc ffff88046c180e20 fffffffffffffffc
  ffffffff81091693 fffffffffffffffc ffff88086aa3d000 ffff88046c180000
  ffff88086aa3d000 ffff88046c180000 ffff88086aa3d060 ffff88086aa3d000
Call Trace:
  [<ffffffff81091693>] ? __wake_up+0x53/0x70
  [<ffffffffa01103d4>] ? bch_btree_gc+0x2f4/0x560 [bcache]
  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
  [<ffffffffa0110678>] ? bch_gc_thread+0x38/0x120 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffff81070a9e>] ? kthread+0xce/0xf0
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff815b8818>] ? ret_from_fork+0x58/0x90
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
          (t=2228 jiffies g=2176431 c=2176430 q=3161)
Task dump for CPU 4:
bcache_gc	R  running task    12728 18115      2 0x00000008
  0000000000000005 ffff88046fc83ca8 ffffffff8107720b 0000000000000004
  ffffffff8183d040 ffff88046fc83cc8 ffffffff810772af ffff88046fc83cc8
  ffffffff8183d100 ffff88046fc83cf8 ffffffff810a5101 ffff88046fc94500
Call Trace:
  <IRQ>  [<ffffffff8107720b>] sched_show_task+0xcb/0x130
  [<ffffffff810772af>] dump_cpu_task+0x3f/0x50
  [<ffffffff810a5101>] rcu_dump_cpu_stacks+0x91/0xd0
  [<ffffffff810a68cf>] rcu_check_callbacks+0x65f/0xc30
  [<ffffffff81080ecc>] ? account_process_tick+0x6c/0x170
  [<ffffffff810acf29>] update_process_times+0x39/0x70
  [<ffffffff810beba0>] tick_sched_handle+0x40/0x50
  [<ffffffff810bedb2>] tick_sched_timer+0x52/0xa0
  [<ffffffff810afa16>] __run_hrtimer+0x86/0x1d0
  [<ffffffff810bed60>] ? tick_nohz_handler+0xc0/0xc0
  [<ffffffff810afd92>] hrtimer_interrupt+0x102/0x240
  [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
  [<ffffffff81032e79>] local_apic_timer_interrupt+0x39/0x60
  [<ffffffff815bb355>] smp_apic_timer_interrupt+0x45/0x59
  [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
  [<ffffffff815b972d>] apic_timer_interrupt+0x6d/0x80
  <EOI>  [<ffffffffa01117c5>] ? __bch_extent_invalid+0xa5/0xd0 [bcache]
  [<ffffffffa0111721>] ? __bch_extent_invalid+0x1/0xd0 [bcache]
  [<ffffffffa0111802>] ? bch_extent_invalid+0x12/0x20 [bcache]
  [<ffffffffa011183d>] bch_extent_bad+0x2d/0x1c0 [bcache]
  [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
  [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
  [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
  [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
  [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
  [<ffffffff81091693>] ? __wake_up+0x53/0x70
  [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
  [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffff81070a9e>] kthread+0xce/0xf0
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff815b8818>] ret_from_fork+0x58/0x90

Naturally, checking /sys/fs/bcache/b9bcddd1-7a9a-4f2f-88e6-cb5bef6abcf2/internal/btree_gc_max_duration_ms shows: 31593  Clearly at some point the GC overhead becomes so large that it causes RCU grace period stalls. I'm puzzled since bch_btree_gc_finish(...) is not listed and this is the only function that pertains to bcache gc AND executes code in RCU critical read section.

In addition to that I also observed that the after this RCU stall warn occurs the bcache_gc thread hogs the machine at 100% rendering it unusable. I managed to get 2 call stack dumps via magic sysrq as follows:

=============
NMI backtrace for cpu 4
CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
RIP: 0010:[<ffffffffa01098cb>]  [<ffffffffa01098cb>] bch_btree_iter_next_filter+0xb/0x50 [bcache]
RSP: 0018:ffff880868027bd8  EFLAGS: 00000202
RAX: 0000000000000001 RBX: 0000000000002034 RCX: 000000000000000a
RDX: ffffffffa0109920 RSI: ffff88086aa3dcd0 RDI: ffff880868027c08
RBP: ffff880868027bf8 R08: 0000000000000001 R09: 0000000000000001
R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
R13: ffff880868027c08 R14: ffff880868027cf8 R15: ffff880868027dd8
FS:  0000000000000000(0000) GS:ffff88046fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
Stack:
  0000000000002034 ffff88086aa3dcd0 ffff880868027c08 ffff880868027cf8
  ffff880868027c78 ffffffffa0109c80 0000000000000004 0000000000000001
  ffff8807740101c0 ffff88077402a9f8 ffff88086aa3dc00 ffff88086aa3d000
Call Trace:
  [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
  [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
  [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
  [<ffffffff81091693>] ? __wake_up+0x53/0x70
  [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
  [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffff81070a9e>] kthread+0xce/0xf0
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff815b8818>] ret_from_fork+0x58/0x90
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
Code: ff 48 89 d7 4c 29 cf eb a7 48 29 f2 48 89 d6 e9 18 ff ff ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 <0f> 1f 44 00 00 48 89 fb 49 89 f4 49 89 d6 0f 1f 80 00 00 00 00


===========================================

NMI backtrace for cpu 4
CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
RIP: 0010:[<ffffffffa0111916>]  [<ffffffffa0111916>] bch_extent_bad+0x106/0x1c0 [bcache]
RSP: 0018:ffff880868027ba8  EFLAGS: 00000202
RAX: 000000000000bd2a RBX: ffff88077400a550 RCX: 000000000000000a
RDX: 0000000000000004 RSI: 00000000fc390004 RDI: ffff88046c180000
RBP: ffff880868027bb8 R08: 0000000000000001 R09: 0000000000000000
R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
R13: ffff88077400a550 R14: ffffffffa0109920 R15: ffff880868027dd8
FS:  0000000000000000(0000) GS:ffff88046fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
Stack:
  ffff880868027c08 ffff88086aa3dcd0 ffff880868027bc8 ffffffffa010992a
  ffff880868027bf8 ffffffffa01098f9 00000000000014a6 ffff88086aa3dcd0
  ffff880868027c08 ffff880868027cf8 ffff880868027c78 ffffffffa0109c80
Call Trace:
  [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
  [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
  [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
  [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
  [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
  [<ffffffff81091693>] ? __wake_up+0x53/0x70
  [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
  [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
  [<ffffffff81070a9e>] kthread+0xce/0xf0
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff815b8818>] ret_from_fork+0x58/0x90
  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
Code: 33 25 ff 0f 00 00 48 8b 94 c7 40 0c 00 00 48 89 f0 48 8b 92 d8 0a 00 00 48 c1 e8 08 4c 21 d0 48 d3 e8 48 8d 04 40 0f b6 54 82 06 <40> 28 f2 80 fa 80 0f 87 7e 00 00 00 0f b6 d2 83 fa 60 76 66 31


In the mean time I'm running the stable 4.0.0 where I observe better results ( no bcache_gc thread hog but still the occasional stall warn)

Regards,
Nikolay

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bcache causes RCU stalls/bcache_gc hogs CPU
  2015-04-14 14:52 bcache causes RCU stalls/bcache_gc hogs CPU Nikolay Borisov
@ 2015-04-14 20:03 ` Eric Wheeler
  2015-04-15  8:41   ` Nikolay Borisov
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wheeler @ 2015-04-14 20:03 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-bcache

[-- Attachment #1: Type: TEXT/PLAIN, Size: 10834 bytes --]

Apply all of the attached patches to your kernel and try again.

I wish somebody would apply these upstream and get it into the official 
kernel. I have been carrying all of these patches with me for some time 
and they definitely make bcache more stable.

-Eric


--
Eric Wheeler, President           eWheeler, Inc. dba Global Linux Security
888-LINUX26 (888-546-8926)        Fax: 503-716-3878           PO Box 25107
www.GlobalLinuxSecurity.pro       Linux since 1996!     Portland, OR 97298

On Tue, 14 Apr 2015, Nikolay Borisov wrote:

> Hello list,
> 
> 
> I'm currently testing bcache with the following setup:
> 
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> 
> sda                   8:0    0   1.8T  0 disk
> ??sda1                8:1    0     2M  0 part
> ??sda2                8:2    0   191M  0 part
> ??sda3                8:3    0   1.8T  0 part
>   ??main-os (dm-0)  254:0    0   1.8T  0 lvm  /
> sdb                   8:16   0 223.1G  0 disk
> ??sdb1                8:17   0 188.2M  0 part /boot
> ??sdb2                8:18   0 222.9G  0 part
>   ??main-ssd (dm-1) 254:1    0    40G  0 lvm
>   ? ??bcache0       253:0    0   182G  0 disk /sdb
>   ??main-db (dm-2)  254:2    0   182G  0 lvm
>     ??bcache0       253:0    0   182G  0 disk /sdb
> 
> So a 40gig ssd (main-ssd, lvm2 volume) backed by a 180gig hdd (main-db, lvm 2
> volume), using the writeback cache policy. Every other setting is at its
> default. I'm running the 4.0-rc6 (!CONFIG_PREEMPT). After running fio (using a
> 30gb file) with a mix of sequential and random i/o and I'm getting the
> following RCU warn:
> 
> ======================================================
> INFO: rcu_sched self-detected stall on CPU
>         4: (2099 ticks this GP) idle=fcf/140000000000001/0
> softirq=1031582/1031582 fqs=2100
> INFO: rcu_sched detected stalls on CPUs/tasks:
>         4: (2099 ticks this GP) idle=fcf/140000000000001/0
> softirq=1031582/1031582 fqs=2100
>         (detected by 16, t=2104 jiffies, g=2176431, c=2176430, q=3098)
> Task dump for CPU 4:
> bcache_gc	R  running task    12728 18115      2 0x00000008
>  ffff880079e85720 fffffffffffffffc ffff88046c180e20 fffffffffffffffc
>  ffffffff81091693 fffffffffffffffc ffff88086aa3d000 ffff88046c180000
>  ffff88086aa3d000 ffff88046c180000 ffff88086aa3d060 ffff88086aa3d000
> Call Trace:
>  [<ffffffff81091693>] ? __wake_up+0x53/0x70
>  [<ffffffffa01103d4>] ? bch_btree_gc+0x2f4/0x560 [bcache]
>  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>  [<ffffffffa0110678>] ? bch_gc_thread+0x38/0x120 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffff81070a9e>] ? kthread+0xce/0xf0
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>  [<ffffffff815b8818>] ? ret_from_fork+0x58/0x90
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>          (t=2228 jiffies g=2176431 c=2176430 q=3161)
> Task dump for CPU 4:
> bcache_gc	R  running task    12728 18115      2 0x00000008
>  0000000000000005 ffff88046fc83ca8 ffffffff8107720b 0000000000000004
>  ffffffff8183d040 ffff88046fc83cc8 ffffffff810772af ffff88046fc83cc8
>  ffffffff8183d100 ffff88046fc83cf8 ffffffff810a5101 ffff88046fc94500
> Call Trace:
>  <IRQ>  [<ffffffff8107720b>] sched_show_task+0xcb/0x130
>  [<ffffffff810772af>] dump_cpu_task+0x3f/0x50
>  [<ffffffff810a5101>] rcu_dump_cpu_stacks+0x91/0xd0
>  [<ffffffff810a68cf>] rcu_check_callbacks+0x65f/0xc30
>  [<ffffffff81080ecc>] ? account_process_tick+0x6c/0x170
>  [<ffffffff810acf29>] update_process_times+0x39/0x70
>  [<ffffffff810beba0>] tick_sched_handle+0x40/0x50
>  [<ffffffff810bedb2>] tick_sched_timer+0x52/0xa0
>  [<ffffffff810afa16>] __run_hrtimer+0x86/0x1d0
>  [<ffffffff810bed60>] ? tick_nohz_handler+0xc0/0xc0
>  [<ffffffff810afd92>] hrtimer_interrupt+0x102/0x240
>  [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
>  [<ffffffff81032e79>] local_apic_timer_interrupt+0x39/0x60
>  [<ffffffff815bb355>] smp_apic_timer_interrupt+0x45/0x59
>  [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
>  [<ffffffff815b972d>] apic_timer_interrupt+0x6d/0x80
>  <EOI>  [<ffffffffa01117c5>] ? __bch_extent_invalid+0xa5/0xd0 [bcache]
>  [<ffffffffa0111721>] ? __bch_extent_invalid+0x1/0xd0 [bcache]
>  [<ffffffffa0111802>] ? bch_extent_invalid+0x12/0x20 [bcache]
>  [<ffffffffa011183d>] bch_extent_bad+0x2d/0x1c0 [bcache]
>  [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
>  [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
>  [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
>  [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
>  [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
>  [<ffffffff81091693>] ? __wake_up+0x53/0x70
>  [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
>  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>  [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffff81070a9e>] kthread+0xce/0xf0
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>  [<ffffffff815b8818>] ret_from_fork+0x58/0x90
> 
> Naturally, checking
> /sys/fs/bcache/b9bcddd1-7a9a-4f2f-88e6-cb5bef6abcf2/internal/btree_gc_max_duration_ms
> shows: 31593  Clearly at some point the GC overhead becomes so large that it
> causes RCU grace period stalls. I'm puzzled since bch_btree_gc_finish(...) is
> not listed and this is the only function that pertains to bcache gc AND
> executes code in RCU critical read section.
> 
> In addition to that I also observed that the after this RCU stall warn occurs
> the bcache_gc thread hogs the machine at 100% rendering it unusable. I managed
> to get 2 call stack dumps via magic sysrq as follows:
> 
> =============
> NMI backtrace for cpu 4
> CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
> Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
> task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
> RIP: 0010:[<ffffffffa01098cb>]  [<ffffffffa01098cb>]
> bch_btree_iter_next_filter+0xb/0x50 [bcache]
> RSP: 0018:ffff880868027bd8  EFLAGS: 00000202
> RAX: 0000000000000001 RBX: 0000000000002034 RCX: 000000000000000a
> RDX: ffffffffa0109920 RSI: ffff88086aa3dcd0 RDI: ffff880868027c08
> RBP: ffff880868027bf8 R08: 0000000000000001 R09: 0000000000000001
> R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
> R13: ffff880868027c08 R14: ffff880868027cf8 R15: ffff880868027dd8
> FS:  0000000000000000(0000) GS:ffff88046fc80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
> Stack:
>  0000000000002034 ffff88086aa3dcd0 ffff880868027c08 ffff880868027cf8
>  ffff880868027c78 ffffffffa0109c80 0000000000000004 0000000000000001
>  ffff8807740101c0 ffff88077402a9f8 ffff88086aa3dc00 ffff88086aa3d000
> Call Trace:
>  [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
>  [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
>  [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
>  [<ffffffff81091693>] ? __wake_up+0x53/0x70
>  [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
>  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>  [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffff81070a9e>] kthread+0xce/0xf0
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>  [<ffffffff815b8818>] ret_from_fork+0x58/0x90
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> Code: ff 48 89 d7 4c 29 cf eb a7 48 29 f2 48 89 d6 e9 18 ff ff ff 66 66 66 2e
> 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 <0f> 1f 44 00 00 48
> 89 fb 49 89 f4 49 89 d6 0f 1f 80 00 00 00 00
> 
> 
> ===========================================
> 
> NMI backtrace for cpu 4
> CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
> Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
> task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
> RIP: 0010:[<ffffffffa0111916>]  [<ffffffffa0111916>]
> bch_extent_bad+0x106/0x1c0 [bcache]
> RSP: 0018:ffff880868027ba8  EFLAGS: 00000202
> RAX: 000000000000bd2a RBX: ffff88077400a550 RCX: 000000000000000a
> RDX: 0000000000000004 RSI: 00000000fc390004 RDI: ffff88046c180000
> RBP: ffff880868027bb8 R08: 0000000000000001 R09: 0000000000000000
> R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
> R13: ffff88077400a550 R14: ffffffffa0109920 R15: ffff880868027dd8
> FS:  0000000000000000(0000) GS:ffff88046fc80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
> Stack:
>  ffff880868027c08 ffff88086aa3dcd0 ffff880868027bc8 ffffffffa010992a
>  ffff880868027bf8 ffffffffa01098f9 00000000000014a6 ffff88086aa3dcd0
>  ffff880868027c08 ffff880868027cf8 ffff880868027c78 ffffffffa0109c80
> Call Trace:
>  [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
>  [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
>  [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
>  [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
>  [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
>  [<ffffffff81091693>] ? __wake_up+0x53/0x70
>  [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
>  [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>  [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>  [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>  [<ffffffff81070a9e>] kthread+0xce/0xf0
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>  [<ffffffff815b8818>] ret_from_fork+0x58/0x90
>  [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> Code: 33 25 ff 0f 00 00 48 8b 94 c7 40 0c 00 00 48 89 f0 48 8b 92 d8 0a 00 00
> 48 c1 e8 08 4c 21 d0 48 d3 e8 48 8d 04 40 0f b6 54 82 06 <40> 28 f2 80 fa 80
> 0f 87 7e 00 00 00 0f b6 d2 83 fa 60 76 66 31
> 
> 
> In the mean time I'm running the stable 4.0.0 where I observe better results (
> no bcache_gc thread hog but still the occasional stall warn)
> 
> Regards,
> Nikolay
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: TEXT/x-diff; name=bcache-unregister-reboot-notifier.patch, Size: 921 bytes --]

From: Zheng Liu <wenqing.lz@taobao.com>

In bcache_init() function it forgot to unregister reboot notifier if
bcache fails to unregister a block device.  This commit fixes this.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Joshua Schmid <jschmid@suse.com>
---
 drivers/md/bcache/super.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 4dd2bb7..fdbb211 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2100,8 +2100,10 @@ static int __init bcache_init(void)
 	closure_debug_init();
 
 	bcache_major = register_blkdev(0, "bcache");
-	if (bcache_major < 0)
+	if (bcache_major < 0) {
+		unregister_reboot_notifier(&reboot);
 		return bcache_major;
+	}
 
 	if (!(bcache_wq = create_workqueue("bcache")) ||
 	    !(bcache_kobj = kobject_create_and_add("bcache", fs_kobj)) ||

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: Type: TEXT/x-diff; name=bcache-fix-memleak-bch_cached_dev_run.patch, Size: 989 bytes --]


From: Joshua Schmid <jschmid@suse.com>
Subject: [PATCH] fix a leak in bch_cached_dev_run()
Newsgroups: gmane.linux.kernel.bcache.devel
Date: 2015-02-03 11:24:06 GMT (3 weeks, 2 days, 11 hours and 43 minutes ago)

From: Al Viro <viro@ZenIV.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Joshua Schmid <jschmid@suse.com>
---
 drivers/md/bcache/super.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 8c2d657..53f1512 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -880,8 +880,11 @@ void bch_cached_dev_run(struct cached_dev *dc)
 	buf[SB_LABEL_SIZE] = '\0';
 	env[2] = kasprintf(GFP_KERNEL, "CACHED_LABEL=%s", buf);

-	if (atomic_xchg(&dc->running, 1))
+	if (atomic_xchg(&dc->running, 1)) {
+		kfree(env[1]);
+		kfree(env[2]);
 		return;
+	}

 	if (!d->c &&
 	    BDEV_STATE(&dc->sb) != BDEV_STATE_NONE) {
-- 
2.1.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: Type: TEXT/x-diff; name=bcache-attach-detach-cleanup.patch, Size: 5427 bytes --]

From: Joshua Schmid <jschmid@suse.com>
Subject: [PATCH] bcache: [BUG] clear BCACHE_DEV_UNLINK_DONE flag when attaching a backing device
Newsgroups: gmane.linux.kernel.bcache.devel
Date: 2015-02-03 11:18:01 GMT (3 weeks, 2 days, 11 hours and 45 minutes ago)

From: Zheng Liu <wenqing.lz@taobao.com>

This bug can be reproduced by the following script:

  #!/bin/bash

  bcache_sysfs="/sys/fs/bcache"

  function clear_cache()
  {
  	if [ ! -e $bcache_sysfs ]; then
  		echo "no bcache sysfs"
  		exit
  	fi

  	cset_uuid=$(ls -l $bcache_sysfs|head -n 2|tail -n 1|awk '{print $9}')
  	sudo sh -c "echo $cset_uuid > /sys/block/sdb/sdb1/bcache/detach"
  	sleep 5
  	sudo sh -c "echo $cset_uuid > /sys/block/sdb/sdb1/bcache/attach"
  }

  for ((i=0;i<10;i++)); do
  	clear_cache
  done

The warning messages look like below:
[  275.948611] ------------[ cut here ]------------
[  275.963840] WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xb8/0xd0() (Tainted: P        W 
---------------   )
[  275.979253] Hardware name: Tecal RH2285
[  275.994106] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:09.0/0000:08:00.0/host4/target4:2:1/4:2:1:0/block/sdb/sdb1/bcache/cache'
[  276.024105] Modules linked in: bcache tcp_diag inet_diag ipmi_devintf ipmi_si ipmi_msghandler
bonding 8021q garp stp llc ipv6 ext3 jbd loop sg iomemory_vsl(P) bnx2 microcode serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 jbd2 mbcache megaraid_sas
pata_acpi ata_generic ata_piix dm_mod [last unloaded: scsi_wait_scan]
[  276.072643] Pid: 2765, comm: sh Tainted: P        W  ---------------    2.6.32 #1
[  276.089315] Call Trace:
[  276.105801]  [<ffffffff81070fe7>] ? warn_slowpath_common+0x87/0xc0
[  276.122650]  [<ffffffff810710d6>] ? warn_slowpath_fmt+0x46/0x50
[  276.139361]  [<ffffffff81205c08>] ? sysfs_add_one+0xb8/0xd0
[  276.156012]  [<ffffffff8120609b>] ? sysfs_do_create_link+0x12b/0x170
[  276.172682]  [<ffffffff81206113>] ? sysfs_create_link+0x13/0x20
[  276.189282]  [<ffffffffa03bda21>] ? bcache_device_link+0xc1/0x110 [bcache]
[  276.205993]  [<ffffffffa03bfa08>] ? bch_cached_dev_attach+0x478/0x4f0 [bcache]
[  276.222794]  [<ffffffffa03c4a17>] ? bch_cached_dev_store+0x627/0x780 [bcache]
[  276.239680]  [<ffffffff8116783a>] ? alloc_pages_current+0xaa/0x110
[  276.256594]  [<ffffffff81203b15>] ? sysfs_write_file+0xe5/0x170
[  276.273364]  [<ffffffff811887b8>] ? vfs_write+0xb8/0x1a0
[  276.290133]  [<ffffffff811890b1>] ? sys_write+0x51/0x90
[  276.306368]  [<ffffffff8100c072>] ? system_call_fastpath+0x16/0x1b
[  276.322301] ---[ end trace 9f5d4fcdd0c3edfb ]---
[  276.338241] ------------[ cut here ]------------
[  276.354109] WARNING: at /home/wenqing.lz/bcache/bcache/super.c:720
bcache_device_link+0xdf/0x110 [bcache]() (Tainted: P        W  ---------------   )
[  276.386017] Hardware name: Tecal RH2285
[  276.401430] Couldn't create device <-> cache set symlinks
[  276.401759] Modules linked in: bcache tcp_diag inet_diag ipmi_devintf ipmi_si ipmi_msghandler
bonding 8021q garp stp llc ipv6 ext3 jbd loop sg iomemory_vsl(P) bnx2 microcode serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 jbd2 mbcache megaraid_sas
pata_acpi ata_generic ata_piix dm_mod [last unloaded: scsi_wait_scan]
[  276.465477] Pid: 2765, comm: sh Tainted: P        W  ---------------    2.6.32 #1
[  276.482169] Call Trace:
[  276.498610]  [<ffffffff81070fe7>] ? warn_slowpath_common+0x87/0xc0
[  276.515405]  [<ffffffff810710d6>] ? warn_slowpath_fmt+0x46/0x50
[  276.532059]  [<ffffffffa03bda3f>] ? bcache_device_link+0xdf/0x110 [bcache]
[  276.548808]  [<ffffffffa03bfa08>] ? bch_cached_dev_attach+0x478/0x4f0 [bcache]
[  276.565569]  [<ffffffffa03c4a17>] ? bch_cached_dev_store+0x627/0x780 [bcache]
[  276.582418]  [<ffffffff8116783a>] ? alloc_pages_current+0xaa/0x110
[  276.599341]  [<ffffffff81203b15>] ? sysfs_write_file+0xe5/0x170
[  276.616142]  [<ffffffff811887b8>] ? vfs_write+0xb8/0x1a0
[  276.632607]  [<ffffffff811890b1>] ? sys_write+0x51/0x90
[  276.648671]  [<ffffffff8100c072>] ? system_call_fastpath+0x16/0x1b
[  276.664756] ---[ end trace 9f5d4fcdd0c3edfc ]---

We forget to clear BCACHE_DEV_UNLINK_DONE flag in bcache_device_attach()
function when we attach a backing device first time.  After detaching this
backing device, this flag will be true and sysfs_remove_link() isn't called in
bcache_device_unlink().  Then when we attach this backing device again,
sysfs_create_link() will return EEXIST error in bcache_device_link().

So the fix is trival and we clear this flag in bcache_device_link().

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Joshua Schmid <jschmid@suse.com>
---
 drivers/md/bcache/super.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 4dd2bb7..f624ae8 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -708,6 +708,8 @@ static void bcache_device_link(struct bcache_device *d, struct cache_set *c,
 	WARN(sysfs_create_link(&d->kobj, &c->kobj, "cache") ||
 	     sysfs_create_link(&c->kobj, &d->kobj, d->name),
 	     "Couldn't create device <-> cache set symlinks");
+
+	clear_bit(BCACHE_DEV_UNLINK_DONE, &d->flags);
 }

 static void bcache_device_detach(struct bcache_device *d)
-- 
2.1.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #5: Type: TEXT/x-diff; name=bcache-rcu-sched-bugfix.patch, Size: 2722 bytes --]

	
From:	Zheng Liu <gnehzuil.liu@gmail.com>
To:	linux-bcache@vger.kernel.org
Cc:	Zheng Liu <wenqing.lz@taobao.com>, Joshua Schmid <jschmid@suse.com>, Zhu Yanhai <zhu.yanhai@gmail.com>, Kent Overstreet <kmo@daterainc.com>
Subject:	[PATCH v2] bcache: fix a livelock in btree lock
Date:	Wed, 25 Feb 2015 20:32:09 +0800 (02/25/2015 04:32:09 AM)
From: Zheng Liu <wenqing.lz@taobao.com>

This commit tries to fix a livelock in bcache.  This livelock might
happen when we causes a huge number of cache misses simultaneously.

When we get a cache miss, bcache will execute the following path.

->cached_dev_make_request()
  ->cached_dev_read()
    ->cached_lookup()
      ->bch->btree_map_keys()
        ->btree_root()  <------------------------
          ->bch_btree_map_keys_recurse()        |
            ->cache_lookup_fn()                 |
              ->cached_dev_cache_miss()         |
                ->bch_btree_insert_check_key() -|
                  [If btree->seq is not equal to seq + 1, we should return
                   EINTR and traverse btree again.]

In bch_btree_insert_check_key() function we first need to check upgrade
flag (op->lock == -1), and when this flag is true we need to release
read btree->lock and try to take write btree->lock.  During taking and
releasing this write lock, btree->seq will be monotone increased in
order to prevent other threads modify this in cache miss (see btree.h:74).
But if there are some cache misses caused by some requested, we could
meet a livelock because btree->seq is always changed by others.  Thus no
one can make progress.

This commit will try to take write btree->lock if it encounters a race
when we traverse btree.  Although it sacrifice the scalability but we
can ensure that only one can modify the btree.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Joshua Schmid <jschmid@suse.com>
Cc: Joshua Schmid <jschmid@suse.com>
Cc: Zhu Yanhai <zhu.yanhai@gmail.com>
Cc: Kent Overstreet <kmo@daterainc.com>
---
changelog:
v2: fix a bug that stops all concurrency writes unconditionally.

 drivers/md/bcache/btree.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 218f21a..43829d9 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2163,8 +2163,10 @@ int bch_btree_insert_check_key(struct btree *b, struct btree_op *op,
 		rw_lock(true, b, b->level);
 
 		if (b->key.ptr[0] != btree_ptr ||
-		    b->seq != seq + 1)
+                   b->seq != seq + 1) {
+                       op->lock = b->level;
 			goto out;
+               }
 	}
 
 	SET_KEY_PTRS(check_key, 1);

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: Type: TEXT/x-diff; name=bcache-cond_resched.patch, Size: 832 bytes --]

From f0e6320a7874af434575f37a11ec6e4992cef790 Mon Sep 17 00:00:00 2001
From: Kent Overstreet <kmo@daterainc.com>
Date: Sat, 1 Nov 2014 13:44:47 -0700
Subject: [PATCH 1/5] bcache: Add a cond_resched() call to gc
Git-commit: f0e6320a7874af434575f37a11ec6e4992cef790
Patch-mainline: Submitted
References: bnc#910440

Change-id: Id4f18c533b80ddb40df94ed0bb5e2a236a4bc325
Signed-off-by: Takashi Iwai <tiwai@suse.de>

---
 drivers/md/bcache/btree.c | 1 +
  1 file changed, 1 insertion(+)

--- a/drivers/md/bcache/btree.c	2014-11-03 16:51:01.720000000 -0800
+++ b/drivers/md/bcache/btree.c	2014-11-03 16:51:26.456000000 -0800
@@ -1741,6 +1741,7 @@
 	do {
 		ret = btree_root(gc_root, c, &op, &writes, &stats);
 		closure_sync(&writes);
+		cond_resched();
 
 		if (ret && ret != -EAGAIN)
 			pr_warn("gc failed!");

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: bcache causes RCU stalls/bcache_gc hogs CPU
  2015-04-14 20:03 ` Eric Wheeler
@ 2015-04-15  8:41   ` Nikolay Borisov
  2015-04-15 19:49     ` Eric Wheeler
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2015-04-15  8:41 UTC (permalink / raw)
  To: Eric Wheeler; +Cc: linux-bcache

Thanks for the patches, I've applied them to 4.0 and am in the process 
of testing that.

Do you happen to know in which (if any) repo do those patches live and 
if there is a way to "reliably" (e.g. a repo where they are being 
applied) track them or have you just collected them from misc postings 
to the mailing list?

Regards,
Nikolay

On 04/14/2015 11:03 PM, Eric Wheeler wrote:
> Apply all of the attached patches to your kernel and try again.
>
> I wish somebody would apply these upstream and get it into the official
> kernel. I have been carrying all of these patches with me for some time
> and they definitely make bcache more stable.
>
> -Eric
>
>
> --
> Eric Wheeler, President           eWheeler, Inc. dba Global Linux Security
> 888-LINUX26 (888-546-8926)        Fax: 503-716-3878           PO Box 25107
> www.GlobalLinuxSecurity.pro       Linux since 1996!     Portland, OR 97298
>
> On Tue, 14 Apr 2015, Nikolay Borisov wrote:
>
>> Hello list,
>>
>>
>> I'm currently testing bcache with the following setup:
>>
>> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
>>
>> sda                   8:0    0   1.8T  0 disk
>> ??sda1                8:1    0     2M  0 part
>> ??sda2                8:2    0   191M  0 part
>> ??sda3                8:3    0   1.8T  0 part
>>    ??main-os (dm-0)  254:0    0   1.8T  0 lvm  /
>> sdb                   8:16   0 223.1G  0 disk
>> ??sdb1                8:17   0 188.2M  0 part /boot
>> ??sdb2                8:18   0 222.9G  0 part
>>    ??main-ssd (dm-1) 254:1    0    40G  0 lvm
>>    ? ??bcache0       253:0    0   182G  0 disk /sdb
>>    ??main-db (dm-2)  254:2    0   182G  0 lvm
>>      ??bcache0       253:0    0   182G  0 disk /sdb
>>
>> So a 40gig ssd (main-ssd, lvm2 volume) backed by a 180gig hdd (main-db, lvm 2
>> volume), using the writeback cache policy. Every other setting is at its
>> default. I'm running the 4.0-rc6 (!CONFIG_PREEMPT). After running fio (using a
>> 30gb file) with a mix of sequential and random i/o and I'm getting the
>> following RCU warn:
>>
>> ======================================================
>> INFO: rcu_sched self-detected stall on CPU
>>          4: (2099 ticks this GP) idle=fcf/140000000000001/0
>> softirq=1031582/1031582 fqs=2100
>> INFO: rcu_sched detected stalls on CPUs/tasks:
>>          4: (2099 ticks this GP) idle=fcf/140000000000001/0
>> softirq=1031582/1031582 fqs=2100
>>          (detected by 16, t=2104 jiffies, g=2176431, c=2176430, q=3098)
>> Task dump for CPU 4:
>> bcache_gc	R  running task    12728 18115      2 0x00000008
>>   ffff880079e85720 fffffffffffffffc ffff88046c180e20 fffffffffffffffc
>>   ffffffff81091693 fffffffffffffffc ffff88086aa3d000 ffff88046c180000
>>   ffff88086aa3d000 ffff88046c180000 ffff88086aa3d060 ffff88086aa3d000
>> Call Trace:
>>   [<ffffffff81091693>] ? __wake_up+0x53/0x70
>>   [<ffffffffa01103d4>] ? bch_btree_gc+0x2f4/0x560 [bcache]
>>   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>>   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>>   [<ffffffffa0110678>] ? bch_gc_thread+0x38/0x120 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffff81070a9e>] ? kthread+0xce/0xf0
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>>   [<ffffffff815b8818>] ? ret_from_fork+0x58/0x90
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>>           (t=2228 jiffies g=2176431 c=2176430 q=3161)
>> Task dump for CPU 4:
>> bcache_gc	R  running task    12728 18115      2 0x00000008
>>   0000000000000005 ffff88046fc83ca8 ffffffff8107720b 0000000000000004
>>   ffffffff8183d040 ffff88046fc83cc8 ffffffff810772af ffff88046fc83cc8
>>   ffffffff8183d100 ffff88046fc83cf8 ffffffff810a5101 ffff88046fc94500
>> Call Trace:
>>   <IRQ>  [<ffffffff8107720b>] sched_show_task+0xcb/0x130
>>   [<ffffffff810772af>] dump_cpu_task+0x3f/0x50
>>   [<ffffffff810a5101>] rcu_dump_cpu_stacks+0x91/0xd0
>>   [<ffffffff810a68cf>] rcu_check_callbacks+0x65f/0xc30
>>   [<ffffffff81080ecc>] ? account_process_tick+0x6c/0x170
>>   [<ffffffff810acf29>] update_process_times+0x39/0x70
>>   [<ffffffff810beba0>] tick_sched_handle+0x40/0x50
>>   [<ffffffff810bedb2>] tick_sched_timer+0x52/0xa0
>>   [<ffffffff810afa16>] __run_hrtimer+0x86/0x1d0
>>   [<ffffffff810bed60>] ? tick_nohz_handler+0xc0/0xc0
>>   [<ffffffff810afd92>] hrtimer_interrupt+0x102/0x240
>>   [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
>>   [<ffffffff81032e79>] local_apic_timer_interrupt+0x39/0x60
>>   [<ffffffff815bb355>] smp_apic_timer_interrupt+0x45/0x59
>>   [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
>>   [<ffffffff815b972d>] apic_timer_interrupt+0x6d/0x80
>>   <EOI>  [<ffffffffa01117c5>] ? __bch_extent_invalid+0xa5/0xd0 [bcache]
>>   [<ffffffffa0111721>] ? __bch_extent_invalid+0x1/0xd0 [bcache]
>>   [<ffffffffa0111802>] ? bch_extent_invalid+0x12/0x20 [bcache]
>>   [<ffffffffa011183d>] bch_extent_bad+0x2d/0x1c0 [bcache]
>>   [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
>>   [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
>>   [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
>>   [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
>>   [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
>>   [<ffffffff81091693>] ? __wake_up+0x53/0x70
>>   [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
>>   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>>   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>>   [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffff81070a9e>] kthread+0xce/0xf0
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>>   [<ffffffff815b8818>] ret_from_fork+0x58/0x90
>>
>> Naturally, checking
>> /sys/fs/bcache/b9bcddd1-7a9a-4f2f-88e6-cb5bef6abcf2/internal/btree_gc_max_duration_ms
>> shows: 31593  Clearly at some point the GC overhead becomes so large that it
>> causes RCU grace period stalls. I'm puzzled since bch_btree_gc_finish(...) is
>> not listed and this is the only function that pertains to bcache gc AND
>> executes code in RCU critical read section.
>>
>> In addition to that I also observed that the after this RCU stall warn occurs
>> the bcache_gc thread hogs the machine at 100% rendering it unusable. I managed
>> to get 2 call stack dumps via magic sysrq as follows:
>>
>> =============
>> NMI backtrace for cpu 4
>> CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
>> Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
>> task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
>> RIP: 0010:[<ffffffffa01098cb>]  [<ffffffffa01098cb>]
>> bch_btree_iter_next_filter+0xb/0x50 [bcache]
>> RSP: 0018:ffff880868027bd8  EFLAGS: 00000202
>> RAX: 0000000000000001 RBX: 0000000000002034 RCX: 000000000000000a
>> RDX: ffffffffa0109920 RSI: ffff88086aa3dcd0 RDI: ffff880868027c08
>> RBP: ffff880868027bf8 R08: 0000000000000001 R09: 0000000000000001
>> R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
>> R13: ffff880868027c08 R14: ffff880868027cf8 R15: ffff880868027dd8
>> FS:  0000000000000000(0000) GS:ffff88046fc80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
>> Stack:
>>   0000000000002034 ffff88086aa3dcd0 ffff880868027c08 ffff880868027cf8
>>   ffff880868027c78 ffffffffa0109c80 0000000000000004 0000000000000001
>>   ffff8807740101c0 ffff88077402a9f8 ffff88086aa3dc00 ffff88086aa3d000
>> Call Trace:
>>   [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
>>   [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
>>   [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
>>   [<ffffffff81091693>] ? __wake_up+0x53/0x70
>>   [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
>>   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>>   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>>   [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffff81070a9e>] kthread+0xce/0xf0
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>>   [<ffffffff815b8818>] ret_from_fork+0x58/0x90
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>> Code: ff 48 89 d7 4c 29 cf eb a7 48 29 f2 48 89 d6 e9 18 ff ff ff 66 66 66 2e
>> 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 <0f> 1f 44 00 00 48
>> 89 fb 49 89 f4 49 89 d6 0f 1f 80 00 00 00 00
>>
>>
>> ===========================================
>>
>> NMI backtrace for cpu 4
>> CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
>> Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
>> task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
>> RIP: 0010:[<ffffffffa0111916>]  [<ffffffffa0111916>]
>> bch_extent_bad+0x106/0x1c0 [bcache]
>> RSP: 0018:ffff880868027ba8  EFLAGS: 00000202
>> RAX: 000000000000bd2a RBX: ffff88077400a550 RCX: 000000000000000a
>> RDX: 0000000000000004 RSI: 00000000fc390004 RDI: ffff88046c180000
>> RBP: ffff880868027bb8 R08: 0000000000000001 R09: 0000000000000000
>> R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
>> R13: ffff88077400a550 R14: ffffffffa0109920 R15: ffff880868027dd8
>> FS:  0000000000000000(0000) GS:ffff88046fc80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
>> Stack:
>>   ffff880868027c08 ffff88086aa3dcd0 ffff880868027bc8 ffffffffa010992a
>>   ffff880868027bf8 ffffffffa01098f9 00000000000014a6 ffff88086aa3dcd0
>>   ffff880868027c08 ffff880868027cf8 ffff880868027c78 ffffffffa0109c80
>> Call Trace:
>>   [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
>>   [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
>>   [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
>>   [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
>>   [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
>>   [<ffffffff81091693>] ? __wake_up+0x53/0x70
>>   [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
>>   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
>>   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
>>   [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
>>   [<ffffffff81070a9e>] kthread+0xce/0xf0
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>>   [<ffffffff815b8818>] ret_from_fork+0x58/0x90
>>   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
>> Code: 33 25 ff 0f 00 00 48 8b 94 c7 40 0c 00 00 48 89 f0 48 8b 92 d8 0a 00 00
>> 48 c1 e8 08 4c 21 d0 48 d3 e8 48 8d 04 40 0f b6 54 82 06 <40> 28 f2 80 fa 80
>> 0f 87 7e 00 00 00 0f b6 d2 83 fa 60 76 66 31
>>
>>
>> In the mean time I'm running the stable 4.0.0 where I observe better results (
>> no bcache_gc thread hog but still the occasional stall warn)
>>
>> Regards,
>> Nikolay
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bcache causes RCU stalls/bcache_gc hogs CPU
  2015-04-15  8:41   ` Nikolay Borisov
@ 2015-04-15 19:49     ` Eric Wheeler
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Wheeler @ 2015-04-15 19:49 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-bcache

On Wed, 15 Apr 2015, Nikolay Borisov wrote:

> Thanks for the patches, I've applied them to 4.0 and am in the process of
> testing that.
> 
> Do you happen to know in which (if any) repo do those patches live and if
> there is a way to "reliably" (e.g. a repo where they are being applied) track
> them or have you just collected them from misc postings to the mailing list?

I keep meaning to create a git repository of my own with all of the 
patches, but that is not publically available yet. For the moment, I have 
been collecting all of the patches as they show up on the mailing list. I 
review them to make sure that they make sense, then add them to my list of 
bcache fixes. The specific fix that you are looking for is probably the 
rcu_schedpatch, but the other ones are good for stability as well.

-Eric

 
> Regards,
> Nikolay
> 
> On 04/14/2015 11:03 PM, Eric Wheeler wrote:
> > Apply all of the attached patches to your kernel and try again.
> > 
> > I wish somebody would apply these upstream and get it into the official
> > kernel. I have been carrying all of these patches with me for some time
> > and they definitely make bcache more stable.
> > 
> > -Eric
> > 
> > 
> > --
> > Eric Wheeler, President           eWheeler, Inc. dba Global Linux Security
> > 888-LINUX26 (888-546-8926)        Fax: 503-716-3878           PO Box 25107
> > www.GlobalLinuxSecurity.pro       Linux since 1996!     Portland, OR 97298
> > 
> > On Tue, 14 Apr 2015, Nikolay Borisov wrote:
> > 
> > > Hello list,
> > > 
> > > 
> > > I'm currently testing bcache with the following setup:
> > > 
> > > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> > > 
> > > sda                   8:0    0   1.8T  0 disk
> > > ??sda1                8:1    0     2M  0 part
> > > ??sda2                8:2    0   191M  0 part
> > > ??sda3                8:3    0   1.8T  0 part
> > >    ??main-os (dm-0)  254:0    0   1.8T  0 lvm  /
> > > sdb                   8:16   0 223.1G  0 disk
> > > ??sdb1                8:17   0 188.2M  0 part /boot
> > > ??sdb2                8:18   0 222.9G  0 part
> > >    ??main-ssd (dm-1) 254:1    0    40G  0 lvm
> > >    ? ??bcache0       253:0    0   182G  0 disk /sdb
> > >    ??main-db (dm-2)  254:2    0   182G  0 lvm
> > >      ??bcache0       253:0    0   182G  0 disk /sdb
> > > 
> > > So a 40gig ssd (main-ssd, lvm2 volume) backed by a 180gig hdd (main-db,
> > > lvm 2
> > > volume), using the writeback cache policy. Every other setting is at its
> > > default. I'm running the 4.0-rc6 (!CONFIG_PREEMPT). After running fio
> > > (using a
> > > 30gb file) with a mix of sequential and random i/o and I'm getting the
> > > following RCU warn:
> > > 
> > > ======================================================
> > > INFO: rcu_sched self-detected stall on CPU
> > >          4: (2099 ticks this GP) idle=fcf/140000000000001/0
> > > softirq=1031582/1031582 fqs=2100
> > > INFO: rcu_sched detected stalls on CPUs/tasks:
> > >          4: (2099 ticks this GP) idle=fcf/140000000000001/0
> > > softirq=1031582/1031582 fqs=2100
> > >          (detected by 16, t=2104 jiffies, g=2176431, c=2176430, q=3098)
> > > Task dump for CPU 4:
> > > bcache_gc	R  running task    12728 18115      2 0x00000008
> > >   ffff880079e85720 fffffffffffffffc ffff88046c180e20 fffffffffffffffc
> > >   ffffffff81091693 fffffffffffffffc ffff88086aa3d000 ffff88046c180000
> > >   ffff88086aa3d000 ffff88046c180000 ffff88086aa3d060 ffff88086aa3d000
> > > Call Trace:
> > >   [<ffffffff81091693>] ? __wake_up+0x53/0x70
> > >   [<ffffffffa01103d4>] ? bch_btree_gc+0x2f4/0x560 [bcache]
> > >   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
> > >   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
> > >   [<ffffffffa0110678>] ? bch_gc_thread+0x38/0x120 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffff81070a9e>] ? kthread+0xce/0xf0
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > >   [<ffffffff815b8818>] ? ret_from_fork+0x58/0x90
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > >           (t=2228 jiffies g=2176431 c=2176430 q=3161)
> > > Task dump for CPU 4:
> > > bcache_gc	R  running task    12728 18115      2 0x00000008
> > >   0000000000000005 ffff88046fc83ca8 ffffffff8107720b 0000000000000004
> > >   ffffffff8183d040 ffff88046fc83cc8 ffffffff810772af ffff88046fc83cc8
> > >   ffffffff8183d100 ffff88046fc83cf8 ffffffff810a5101 ffff88046fc94500
> > > Call Trace:
> > >   <IRQ>  [<ffffffff8107720b>] sched_show_task+0xcb/0x130
> > >   [<ffffffff810772af>] dump_cpu_task+0x3f/0x50
> > >   [<ffffffff810a5101>] rcu_dump_cpu_stacks+0x91/0xd0
> > >   [<ffffffff810a68cf>] rcu_check_callbacks+0x65f/0xc30
> > >   [<ffffffff81080ecc>] ? account_process_tick+0x6c/0x170
> > >   [<ffffffff810acf29>] update_process_times+0x39/0x70
> > >   [<ffffffff810beba0>] tick_sched_handle+0x40/0x50
> > >   [<ffffffff810bedb2>] tick_sched_timer+0x52/0xa0
> > >   [<ffffffff810afa16>] __run_hrtimer+0x86/0x1d0
> > >   [<ffffffff810bed60>] ? tick_nohz_handler+0xc0/0xc0
> > >   [<ffffffff810afd92>] hrtimer_interrupt+0x102/0x240
> > >   [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
> > >   [<ffffffff81032e79>] local_apic_timer_interrupt+0x39/0x60
> > >   [<ffffffff815bb355>] smp_apic_timer_interrupt+0x45/0x59
> > >   [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache]
> > >   [<ffffffff815b972d>] apic_timer_interrupt+0x6d/0x80
> > >   <EOI>  [<ffffffffa01117c5>] ? __bch_extent_invalid+0xa5/0xd0 [bcache]
> > >   [<ffffffffa0111721>] ? __bch_extent_invalid+0x1/0xd0 [bcache]
> > >   [<ffffffffa0111802>] ? bch_extent_invalid+0x12/0x20 [bcache]
> > >   [<ffffffffa011183d>] bch_extent_bad+0x2d/0x1c0 [bcache]
> > >   [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
> > >   [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
> > >   [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
> > >   [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
> > >   [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
> > >   [<ffffffff81091693>] ? __wake_up+0x53/0x70
> > >   [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
> > >   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
> > >   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
> > >   [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffff81070a9e>] kthread+0xce/0xf0
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > >   [<ffffffff815b8818>] ret_from_fork+0x58/0x90
> > > 
> > > Naturally, checking
> > > /sys/fs/bcache/b9bcddd1-7a9a-4f2f-88e6-cb5bef6abcf2/internal/btree_gc_max_duration_ms
> > > shows: 31593  Clearly at some point the GC overhead becomes so large that
> > > it
> > > causes RCU grace period stalls. I'm puzzled since bch_btree_gc_finish(...)
> > > is
> > > not listed and this is the only function that pertains to bcache gc AND
> > > executes code in RCU critical read section.
> > > 
> > > In addition to that I also observed that the after this RCU stall warn
> > > occurs
> > > the bcache_gc thread hogs the machine at 100% rendering it unusable. I
> > > managed
> > > to get 2 call stack dumps via magic sysrq as follows:
> > > 
> > > =============
> > > NMI backtrace for cpu 4
> > > CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
> > > Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
> > > task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
> > > RIP: 0010:[<ffffffffa01098cb>]  [<ffffffffa01098cb>]
> > > bch_btree_iter_next_filter+0xb/0x50 [bcache]
> > > RSP: 0018:ffff880868027bd8  EFLAGS: 00000202
> > > RAX: 0000000000000001 RBX: 0000000000002034 RCX: 000000000000000a
> > > RDX: ffffffffa0109920 RSI: ffff88086aa3dcd0 RDI: ffff880868027c08
> > > RBP: ffff880868027bf8 R08: 0000000000000001 R09: 0000000000000001
> > > R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
> > > R13: ffff880868027c08 R14: ffff880868027cf8 R15: ffff880868027dd8
> > > FS:  0000000000000000(0000) GS:ffff88046fc80000(0000)
> > > knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
> > > Stack:
> > >   0000000000002034 ffff88086aa3dcd0 ffff880868027c08 ffff880868027cf8
> > >   ffff880868027c78 ffffffffa0109c80 0000000000000004 0000000000000001
> > >   ffff8807740101c0 ffff88077402a9f8 ffff88086aa3dc00 ffff88086aa3d000
> > > Call Trace:
> > >   [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
> > >   [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
> > >   [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
> > >   [<ffffffff81091693>] ? __wake_up+0x53/0x70
> > >   [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
> > >   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
> > >   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
> > >   [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffff81070a9e>] kthread+0xce/0xf0
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > >   [<ffffffff815b8818>] ret_from_fork+0x58/0x90
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > > Code: ff 48 89 d7 4c 29 cf eb a7 48 29 f2 48 89 d6 e9 18 ff ff ff 66 66 66
> > > 2e
> > > 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 <0f> 1f 44 00 00
> > > 48
> > > 89 fb 49 89 f4 49 89 d6 0f 1f 80 00 00 00 00
> > > 
> > > 
> > > ===========================================
> > > 
> > > NMI backtrace for cpu 4
> > > CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4
> > > Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
> > > task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000
> > > RIP: 0010:[<ffffffffa0111916>]  [<ffffffffa0111916>]
> > > bch_extent_bad+0x106/0x1c0 [bcache]
> > > RSP: 0018:ffff880868027ba8  EFLAGS: 00000202
> > > RAX: 000000000000bd2a RBX: ffff88077400a550 RCX: 000000000000000a
> > > RDX: 0000000000000004 RSI: 00000000fc390004 RDI: ffff88046c180000
> > > RBP: ffff880868027bb8 R08: 0000000000000001 R09: 0000000000000000
> > > R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0
> > > R13: ffff88077400a550 R14: ffffffffa0109920 R15: ffff880868027dd8
> > > FS:  0000000000000000(0000) GS:ffff88046fc80000(0000)
> > > knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0
> > > Stack:
> > >   ffff880868027c08 ffff88086aa3dcd0 ffff880868027bc8 ffffffffa010992a
> > >   ffff880868027bf8 ffffffffa01098f9 00000000000014a6 ffff88086aa3dcd0
> > >   ffff880868027c08 ffff880868027cf8 ffff880868027c78 ffffffffa0109c80
> > > Call Trace:
> > >   [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache]
> > >   [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache]
> > >   [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache]
> > >   [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache]
> > >   [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache]
> > >   [<ffffffff81091693>] ? __wake_up+0x53/0x70
> > >   [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache]
> > >   [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0
> > >   [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20
> > >   [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache]
> > >   [<ffffffff81070a9e>] kthread+0xce/0xf0
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > >   [<ffffffff815b8818>] ret_from_fork+0x58/0x90
> > >   [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70
> > > Code: 33 25 ff 0f 00 00 48 8b 94 c7 40 0c 00 00 48 89 f0 48 8b 92 d8 0a 00
> > > 00
> > > 48 c1 e8 08 4c 21 d0 48 d3 e8 48 8d 04 40 0f b6 54 82 06 <40> 28 f2 80 fa
> > > 80
> > > 0f 87 7e 00 00 00 0f b6 d2 83 fa 60 76 66 31
> > > 
> > > 
> > > In the mean time I'm running the stable 4.0.0 where I observe better
> > > results (
> > > no bcache_gc thread hog but still the occasional stall warn)
> > > 
> > > Regards,
> > > Nikolay
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-04-15 19:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-14 14:52 bcache causes RCU stalls/bcache_gc hogs CPU Nikolay Borisov
2015-04-14 20:03 ` Eric Wheeler
2015-04-15  8:41   ` Nikolay Borisov
2015-04-15 19:49     ` Eric Wheeler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.