* lock, Workqueue: bcache bch_data_insert_keys [bcache]
From: Konstantin Shalygin @ 2017-07-07 15:06 UTC
  To: linux-bcache

Hello. I hit a lockup (only a hard reset got the machine to reboot) when I
started the weekly rsync backup of my laptop's '/home'. After the reboot I
started the same task again and it ran without any issues.

I run bcache over md raid5.

[k0ste@home ~]$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda           8:0    0   5.5T  0 disk
└─md0         9:0    0    15T  0 raid5
   └─bcache0 254:0    0    15T  0 disk  /srv/raid
sdb           8:16   0   5.5T  0 disk
└─md0         9:0    0    15T  0 raid5
   └─bcache0 254:0    0    15T  0 disk  /srv/raid
sdc           8:32   0   5.5T  0 disk
└─md0         9:0    0    15T  0 raid5
   └─bcache0 254:0    0    15T  0 disk  /srv/raid
sdd           8:48   1  14.3G  0 disk
└─sdd1        8:49   1  14.3G  0 part  /boot
sde           8:64   0   5.5T  0 disk
└─md0         9:0    0    15T  0 raid5
   └─bcache0 254:0    0    15T  0 disk  /srv/raid
sdf           8:80   0 119.2G  0 disk
└─sdf1        8:81   0 119.2G  0 part  /
nvme0n1     259:0    0 119.2G  0 disk
└─bcache0   254:0    0    15T  0 disk  /srv/raid

dmesg:


> Jul 07 19:59:59 home kernel: INFO: task bcache:86 blocked for more 
> than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: bcache          D    0    86      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys 
> [bcache]
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 19:59:59 home kernel:  rescuer_thread+0x20c/0x390
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? cancel_delayed_work_sync+0x20/0x20
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 19:59:59 home kernel: INFO: task xfsaild/bcache0:5174 blocked 
> for more than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: xfsaild/bcache0 D    0  5174      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel:  schedule_timeout+0x224/0x3c0
> Jul 07 19:59:59 home kernel:  ? check_preempt_curr+0x27/0x80
> Jul 07 19:59:59 home kernel:  ? ttwu_do_wakeup.isra.16+0x1e/0x170
> Jul 07 19:59:59 home kernel:  wait_for_common+0xb9/0x180
> Jul 07 19:59:59 home kernel:  ? wait_for_common+0xb9/0x180
> Jul 07 19:59:59 home kernel:  ? wake_up_q+0x80/0x80
> Jul 07 19:59:59 home kernel:  wait_for_completion+0x1d/0x20
> Jul 07 19:59:59 home kernel:  flush_work+0x13b/0x1c0
> Jul 07 19:59:59 home kernel:  ? flush_workqueue_prep_pwqs+0x190/0x190
> Jul 07 19:59:59 home kernel:  xlog_cil_force_lsn+0x7c/0x200 [xfs]
> Jul 07 19:59:59 home kernel:  ? try_to_del_timer_sync+0x53/0x80
> Jul 07 19:59:59 home kernel:  _xfs_log_force+0x85/0x290 [xfs]
> Jul 07 19:59:59 home kernel:  ? del_timer_sync+0x50/0x50
> Jul 07 19:59:59 home kernel:  ? xfsaild+0x16a/0x7c0 [xfs]
> Jul 07 19:59:59 home kernel:  xfs_log_force+0x2c/0x90 [xfs]
> Jul 07 19:59:59 home kernel:  xfsaild+0x16a/0x7c0 [xfs]
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 19:59:59 home kernel: INFO: task kworker/0:1:84671 blocked for 
> more than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: kworker/0:1     D    0 84671      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys 
> [bcache]
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 19:59:59 home kernel: INFO: task kworker/0:3:84673 blocked for 
> more than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: kworker/0:3     D    0 84673      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys 
> [bcache]
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 19:59:59 home kernel: INFO: task kworker/2:2:87164 blocked for 
> more than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: kworker/2:2     D    0 87164      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Workqueue: xfs-sync/bcache0 
> xfs_log_worker [xfs]
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  ? find_next_bit+0xb/0x10
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel:  schedule_timeout+0x224/0x3c0
> Jul 07 19:59:59 home kernel:  wait_for_common+0xb9/0x180
> Jul 07 19:59:59 home kernel:  ? wait_for_common+0xb9/0x180
> Jul 07 19:59:59 home kernel:  ? wake_up_q+0x80/0x80
> Jul 07 19:59:59 home kernel:  wait_for_completion+0x1d/0x20
> Jul 07 19:59:59 home kernel:  flush_work+0x13b/0x1c0
> Jul 07 19:59:59 home kernel:  ? flush_workqueue_prep_pwqs+0x190/0x190
> Jul 07 19:59:59 home kernel:  xlog_cil_force_lsn+0x7c/0x200 [xfs]
> Jul 07 19:59:59 home kernel:  ? find_next_bit+0xb/0x10
> Jul 07 19:59:59 home kernel:  _xfs_log_force+0x85/0x290 [xfs]
> Jul 07 19:59:59 home kernel:  ? __switch_to+0x270/0x490
> Jul 07 19:59:59 home kernel:  ? xfs_log_worker+0x34/0x100 [xfs]
> Jul 07 19:59:59 home kernel:  xfs_log_force+0x2c/0x90 [xfs]
> Jul 07 19:59:59 home kernel:  xfs_log_worker+0x34/0x100 [xfs]
> Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 19:59:59 home kernel: INFO: task kworker/1:28:89441 blocked for 
> more than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: kworker/1:28    D    0 89441      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Workqueue: xfs-cil/bcache0 
> xlog_cil_push_work [xfs]
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel: xlog_state_get_iclog_space+0xf8/0x2d0 [xfs]
> Jul 07 19:59:59 home kernel:  ? wake_up_q+0x80/0x80
> Jul 07 19:59:59 home kernel:  xlog_write+0x166/0x640 [xfs]
> Jul 07 19:59:59 home kernel:  xlog_cil_push+0x2a1/0x490 [xfs]
> Jul 07 19:59:59 home kernel:  ? put_prev_entity+0x40/0xc10
> Jul 07 19:59:59 home kernel:  ? __switch_to+0x270/0x490
> Jul 07 19:59:59 home kernel:  xlog_cil_push_work+0x15/0x20 [xfs]
> Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 19:59:59 home kernel:  worker_thread+0x225/0x4e0
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 19:59:59 home kernel: INFO: task kworker/0:0:92924 blocked for 
> more than 120 seconds.
> Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 19:59:59 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 19:59:59 home kernel: kworker/0:0     D    0 92924      2 
> 0x00000000
> Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys 
> [bcache]
> Jul 07 19:59:59 home kernel: Call Trace:
> Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 20:02:02 home kernel: INFO: task bcache:86 blocked for more 
> than 120 seconds.
> Jul 07 20:02:02 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 20:02:02 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 20:02:02 home kernel: bcache          D    0    86      2 
> 0x00000000
> Jul 07 20:02:02 home kernel: Workqueue: bcache bch_data_insert_keys 
> [bcache]
> Jul 07 20:02:02 home kernel: Call Trace:
> Jul 07 20:02:02 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 20:02:02 home kernel:  schedule+0x3d/0x90
> Jul 07 20:02:02 home kernel:  closure_sync+0x23/0x90 [bcache]
> Jul 07 20:02:02 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> Jul 07 20:02:02 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> Jul 07 20:02:02 home kernel:  process_one_work+0x1e0/0x490
> Jul 07 20:02:02 home kernel:  rescuer_thread+0x20c/0x390
> Jul 07 20:02:02 home kernel:  kthread+0x125/0x140
> Jul 07 20:02:02 home kernel:  ? cancel_delayed_work_sync+0x20/0x20
> Jul 07 20:02:02 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 20:02:02 home kernel:  ret_from_fork+0x2c/0x40
> Jul 07 20:02:02 home kernel: INFO: task systemd-logind:412 blocked for 
> more than 120 seconds.
> Jul 07 20:02:02 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 20:02:02 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 20:02:02 home kernel: systemd-logind  D    0   412      1 
> 0x00000100
> Jul 07 20:02:02 home kernel: Call Trace:
> Jul 07 20:02:02 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 20:02:02 home kernel:  schedule+0x3d/0x90
> Jul 07 20:02:02 home kernel:  rwsem_down_write_failed+0x175/0x250
> Jul 07 20:02:02 home kernel:  ? ida_get_new_above+0x1a7/0x320
> Jul 07 20:02:02 home kernel: call_rwsem_down_write_failed+0x17/0x30
> Jul 07 20:02:02 home kernel:  ? call_rwsem_down_write_failed+0x17/0x30
> Jul 07 20:02:02 home kernel:  down_write+0x24/0x40
> Jul 07 20:02:02 home kernel:  register_shrinker+0x3e/0x90
> Jul 07 20:02:02 home kernel:  sget_userns+0x409/0x470
> Jul 07 20:02:02 home kernel:  ? get_anon_bdev+0x110/0x110
> Jul 07 20:02:02 home kernel:  ? shmem_remount_fs+0x1c0/0x1c0
> Jul 07 20:02:02 home kernel:  sget+0x5e/0x80
> Jul 07 20:02:02 home kernel:  ? get_anon_bdev+0x110/0x110
> Jul 07 20:02:02 home kernel:  mount_nodev+0x30/0xa0
> Jul 07 20:02:02 home kernel:  shmem_mount+0x18/0x20
> Jul 07 20:02:02 home kernel:  mount_fs+0x32/0x160
> Jul 07 20:02:02 home kernel:  vfs_kern_mount.part.7+0x5d/0x110
> Jul 07 20:02:02 home kernel:  do_mount+0x528/0xc60
> Jul 07 20:02:02 home kernel:  ? _copy_from_user+0x4b/0x80
> Jul 07 20:02:02 home kernel:  SyS_mount+0x5a/0xe0
> Jul 07 20:02:02 home kernel:  do_syscall_64+0x54/0xc0
> Jul 07 20:02:02 home kernel:  entry_SYSCALL64_slow_path+0x25/0x25
> Jul 07 20:02:02 home kernel: RIP: 0033:0x7f35e30b98da
> Jul 07 20:02:02 home kernel: RSP: 002b:00007ffc4ed06fb8 EFLAGS: 
> 00000202 ORIG_RAX: 00000000000000a5
> Jul 07 20:02:02 home kernel: RAX: ffffffffffffffda RBX: 
> 0000557b6f476480 RCX: 00007f35e30b98da
> Jul 07 20:02:02 home kernel: RDX: 0000557b6d4f1402 RSI: 
> 0000557b6f478b80 RDI: 0000557b6d4f1402
> Jul 07 20:02:02 home kernel: RBP: 0000000000000000 R08: 
> 0000557b6f478d80 R09: 0000000000000040
> Jul 07 20:02:02 home kernel: R10: 0000000000000006 R11: 
> 0000000000000202 R12: 00007ffc4ed07060
> Jul 07 20:02:02 home kernel: R13: 0000557b6d4f17ff R14: 
> 0000000000000000 R15: 0000000000000000
> Jul 07 20:02:02 home kernel: INFO: task xfsaild/bcache0:5174 blocked 
> for more than 120 seconds.
> Jul 07 20:02:02 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> Jul 07 20:02:02 home kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 07 20:02:02 home kernel: xfsaild/bcache0 D    0  5174      2 
> 0x00000000
> Jul 07 20:02:02 home kernel: Call Trace:
> Jul 07 20:02:02 home kernel:  __schedule+0x22e/0x8e0
> Jul 07 20:02:02 home kernel:  schedule+0x3d/0x90
> Jul 07 20:02:02 home kernel:  schedule_timeout+0x224/0x3c0
> Jul 07 20:02:02 home kernel:  ? check_preempt_curr+0x27/0x80
> Jul 07 20:02:02 home kernel:  ? ttwu_do_wakeup.isra.16+0x1e/0x170
> Jul 07 20:02:02 home kernel:  wait_for_common+0xb9/0x180
> Jul 07 20:02:02 home kernel:  ? wait_for_common+0xb9/0x180
> Jul 07 20:02:02 home kernel:  ? wake_up_q+0x80/0x80
> Jul 07 20:02:02 home kernel:  wait_for_completion+0x1d/0x20
> Jul 07 20:02:02 home kernel:  flush_work+0x13b/0x1c0
> Jul 07 20:02:02 home kernel:  ? flush_workqueue_prep_pwqs+0x190/0x190
> Jul 07 20:02:02 home kernel:  xlog_cil_force_lsn+0x7c/0x200 [xfs]
> Jul 07 20:02:02 home kernel:  ? try_to_del_timer_sync+0x53/0x80
> Jul 07 20:02:02 home kernel:  _xfs_log_force+0x85/0x290 [xfs]
> Jul 07 20:02:02 home kernel:  ? del_timer_sync+0x50/0x50
> Jul 07 20:02:02 home kernel:  ? xfsaild+0x16a/0x7c0 [xfs]
> Jul 07 20:02:02 home kernel:  xfs_log_force+0x2c/0x90 [xfs]
> Jul 07 20:02:02 home kernel:  xfsaild+0x16a/0x7c0 [xfs]
> Jul 07 20:02:02 home kernel:  kthread+0x125/0x140
> Jul 07 20:02:02 home kernel:  ? kthread+0x125/0x140
> Jul 07 20:02:02 home kernel:  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
> Jul 07 20:02:02 home kernel:  ? kthread_create_on_node+0x70/0x70
> Jul 07 20:02:02 home kernel:  ret_from_fork+0x2c/0x40
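
To see at a glance where each blocked task is sleeping, the traces above can be condensed with a short script. This is only a sketch under stated assumptions: the names `summarize`, `TASK_RE`, `FRAME_RE` and `SCHED_FRAMES` are made up for illustration, and the input is assumed to be journal-style lines in the format pasted above (with or without the "> " quoting).

```python
import re
from collections import Counter

# Header of a hung-task report, e.g.
#   "INFO: task kworker/0:1:84671 blocked for more than 120 seconds."
# Only "blocked" is required, because mail wrapping may split the rest
# of the sentence onto the next line.
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked")

# A reliable stack frame, e.g. "kernel:  closure_sync+0x23/0x90 [bcache]".
# Frames printed with a leading "? " are guesses by the unwinder and are
# deliberately not matched.
FRAME_RE = re.compile(
    r"kernel:\s+(?!\?)([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Generic scheduler frames that say nothing about why the task sleeps.
SCHED_FRAMES = {"__schedule", "schedule", "schedule_timeout"}


def summarize(lines):
    """Count hung-task reports by their first non-scheduler stack frame."""
    waits = Counter()
    in_report = False
    for line in lines:
        line = line.lstrip("> ")  # drop mail quoting, if present
        if TASK_RE.search(line):
            in_report = True      # a new report starts here
            continue
        if not in_report:
            continue
        match = FRAME_RE.search(line)
        if match and match.group(1) not in SCHED_FRAMES:
            waits[match.group(1)] += 1
            in_report = False     # ignore the remaining frames of this report
    return waits
```

With the log saved to a file (the name `dmesg.txt` is hypothetical), `summarize(open("dmesg.txt")).most_common()` lists the wait sites by frequency. Grouping on the first non-scheduler frame is a deliberate simplification: it shows where tasks pile up, not which task holds the resource they are waiting for.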

Is this a bcache issue?


* Re: lock, Workqueue: bcache bch_data_insert_keys [bcache]
From: Eric Wheeler @ 2017-07-07 17:16 UTC
  To: Konstantin Shalygin; +Cc: linux-bcache


On Fri, 7 Jul 2017, Konstantin Shalygin wrote:

> Hello. I hit a lockup (only a hard reset got the machine to reboot) when I
> started the weekly rsync backup of my laptop's '/home'. After the reboot I
> started the same task again and it ran without any issues.

Hi Konstantin,

Try this patch:

https://www.spinics.net/lists/stable/msg178933.html



--
Eric Wheeler



> 
> I run bcache over md raid5.
> 
> [k0ste@home ~]$ lsblk
> NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sda           8:0    0   5.5T  0 disk
> └─md0         9:0    0    15T  0 raid5
>   └─bcache0 254:0    0    15T  0 disk  /srv/raid
> sdb           8:16   0   5.5T  0 disk
> └─md0         9:0    0    15T  0 raid5
>   └─bcache0 254:0    0    15T  0 disk  /srv/raid
> sdc           8:32   0   5.5T  0 disk
> └─md0         9:0    0    15T  0 raid5
>   └─bcache0 254:0    0    15T  0 disk  /srv/raid
> sdd           8:48   1  14.3G  0 disk
> └─sdd1        8:49   1  14.3G  0 part  /boot
> sde           8:64   0   5.5T  0 disk
> └─md0         9:0    0    15T  0 raid5
>   └─bcache0 254:0    0    15T  0 disk  /srv/raid
> sdf           8:80   0 119.2G  0 disk
> └─sdf1        8:81   0 119.2G  0 part  /
> nvme0n1     259:0    0 119.2G  0 disk
> └─bcache0   254:0    0    15T  0 disk  /srv/raid
> 
> dmesg:
> 
> 
> > Jul 07 19:59:59 home kernel: INFO: task bcache:86 blocked for more than 120
> > seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: bcache          D    0    86      2 0x00000000
> > Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys [bcache]
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> > Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 19:59:59 home kernel:  rescuer_thread+0x20c/0x390
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? cancel_delayed_work_sync+0x20/0x20
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 19:59:59 home kernel: INFO: task xfsaild/bcache0:5174 blocked for
> > more than 120 seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: xfsaild/bcache0 D    0  5174      2 0x00000000
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel:  schedule_timeout+0x224/0x3c0
> > Jul 07 19:59:59 home kernel:  ? check_preempt_curr+0x27/0x80
> > Jul 07 19:59:59 home kernel:  ? ttwu_do_wakeup.isra.16+0x1e/0x170
> > Jul 07 19:59:59 home kernel:  wait_for_common+0xb9/0x180
> > Jul 07 19:59:59 home kernel:  ? wait_for_common+0xb9/0x180
> > Jul 07 19:59:59 home kernel:  ? wake_up_q+0x80/0x80
> > Jul 07 19:59:59 home kernel:  wait_for_completion+0x1d/0x20
> > Jul 07 19:59:59 home kernel:  flush_work+0x13b/0x1c0
> > Jul 07 19:59:59 home kernel:  ? flush_workqueue_prep_pwqs+0x190/0x190
> > Jul 07 19:59:59 home kernel:  xlog_cil_force_lsn+0x7c/0x200 [xfs]
> > Jul 07 19:59:59 home kernel:  ? try_to_del_timer_sync+0x53/0x80
> > Jul 07 19:59:59 home kernel:  _xfs_log_force+0x85/0x290 [xfs]
> > Jul 07 19:59:59 home kernel:  ? del_timer_sync+0x50/0x50
> > Jul 07 19:59:59 home kernel:  ? xfsaild+0x16a/0x7c0 [xfs]
> > Jul 07 19:59:59 home kernel:  xfs_log_force+0x2c/0x90 [xfs]
> > Jul 07 19:59:59 home kernel:  xfsaild+0x16a/0x7c0 [xfs]
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 19:59:59 home kernel: INFO: task kworker/0:1:84671 blocked for more
> > than 120 seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: kworker/0:1     D    0 84671      2 0x00000000
> > Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys [bcache]
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> > Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 19:59:59 home kernel: INFO: task kworker/0:3:84673 blocked for more
> > than 120 seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: kworker/0:3     D    0 84673      2 0x00000000
> > Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys [bcache]
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> > Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 19:59:59 home kernel: INFO: task kworker/2:2:87164 blocked for more
> > than 120 seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: kworker/2:2     D    0 87164      2 0x00000000
> > Jul 07 19:59:59 home kernel: Workqueue: xfs-sync/bcache0 xfs_log_worker
> > [xfs]
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  ? find_next_bit+0xb/0x10
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel:  schedule_timeout+0x224/0x3c0
> > Jul 07 19:59:59 home kernel:  wait_for_common+0xb9/0x180
> > Jul 07 19:59:59 home kernel:  ? wait_for_common+0xb9/0x180
> > Jul 07 19:59:59 home kernel:  ? wake_up_q+0x80/0x80
> > Jul 07 19:59:59 home kernel:  wait_for_completion+0x1d/0x20
> > Jul 07 19:59:59 home kernel:  flush_work+0x13b/0x1c0
> > Jul 07 19:59:59 home kernel:  ? flush_workqueue_prep_pwqs+0x190/0x190
> > Jul 07 19:59:59 home kernel:  xlog_cil_force_lsn+0x7c/0x200 [xfs]
> > Jul 07 19:59:59 home kernel:  ? find_next_bit+0xb/0x10
> > Jul 07 19:59:59 home kernel:  _xfs_log_force+0x85/0x290 [xfs]
> > Jul 07 19:59:59 home kernel:  ? __switch_to+0x270/0x490
> > Jul 07 19:59:59 home kernel:  ? xfs_log_worker+0x34/0x100 [xfs]
> > Jul 07 19:59:59 home kernel:  xfs_log_force+0x2c/0x90 [xfs]
> > Jul 07 19:59:59 home kernel:  xfs_log_worker+0x34/0x100 [xfs]
> > Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 19:59:59 home kernel: INFO: task kworker/1:28:89441 blocked for more
> > than 120 seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: kworker/1:28    D    0 89441      2 0x00000000
> > Jul 07 19:59:59 home kernel: Workqueue: xfs-cil/bcache0 xlog_cil_push_work
> > [xfs]
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel: xlog_state_get_iclog_space+0xf8/0x2d0 [xfs]
> > Jul 07 19:59:59 home kernel:  ? wake_up_q+0x80/0x80
> > Jul 07 19:59:59 home kernel:  xlog_write+0x166/0x640 [xfs]
> > Jul 07 19:59:59 home kernel:  xlog_cil_push+0x2a1/0x490 [xfs]
> > Jul 07 19:59:59 home kernel:  ? put_prev_entity+0x40/0xc10
> > Jul 07 19:59:59 home kernel:  ? __switch_to+0x270/0x490
> > Jul 07 19:59:59 home kernel:  xlog_cil_push_work+0x15/0x20 [xfs]
> > Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 19:59:59 home kernel:  worker_thread+0x225/0x4e0
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 19:59:59 home kernel: INFO: task kworker/0:0:92924 blocked for more
> > than 120 seconds.
> > Jul 07 19:59:59 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 19:59:59 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 19:59:59 home kernel: kworker/0:0     D    0 92924      2 0x00000000
> > Jul 07 19:59:59 home kernel: Workqueue: bcache bch_data_insert_keys [bcache]
> > Jul 07 19:59:59 home kernel: Call Trace:
> > Jul 07 19:59:59 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 19:59:59 home kernel:  schedule+0x3d/0x90
> > Jul 07 19:59:59 home kernel:  closure_sync+0x23/0x90 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> > Jul 07 19:59:59 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> > Jul 07 19:59:59 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 19:59:59 home kernel:  worker_thread+0x48/0x4e0
> > Jul 07 19:59:59 home kernel:  kthread+0x125/0x140
> > Jul 07 19:59:59 home kernel:  ? process_one_work+0x490/0x490
> > Jul 07 19:59:59 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 19:59:59 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 20:02:02 home kernel: INFO: task bcache:86 blocked for more than 120
> > seconds.
> > Jul 07 20:02:02 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 20:02:02 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 20:02:02 home kernel: bcache          D    0    86      2 0x00000000
> > Jul 07 20:02:02 home kernel: Workqueue: bcache bch_data_insert_keys [bcache]
> > Jul 07 20:02:02 home kernel: Call Trace:
> > Jul 07 20:02:02 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 20:02:02 home kernel:  schedule+0x3d/0x90
> > Jul 07 20:02:02 home kernel:  closure_sync+0x23/0x90 [bcache]
> > Jul 07 20:02:02 home kernel:  bch_journal+0x115/0x3a0 [bcache]
> > Jul 07 20:02:02 home kernel:  bch_data_insert_keys+0x95/0x130 [bcache]
> > Jul 07 20:02:02 home kernel:  process_one_work+0x1e0/0x490
> > Jul 07 20:02:02 home kernel:  rescuer_thread+0x20c/0x390
> > Jul 07 20:02:02 home kernel:  kthread+0x125/0x140
> > Jul 07 20:02:02 home kernel:  ? cancel_delayed_work_sync+0x20/0x20
> > Jul 07 20:02:02 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 20:02:02 home kernel:  ret_from_fork+0x2c/0x40
> > Jul 07 20:02:02 home kernel: INFO: task systemd-logind:412 blocked for more
> > than 120 seconds.
> > Jul 07 20:02:02 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 20:02:02 home kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 20:02:02 home kernel: systemd-logind  D    0   412      1 0x00000100
> > Jul 07 20:02:02 home kernel: Call Trace:
> > Jul 07 20:02:02 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 20:02:02 home kernel:  schedule+0x3d/0x90
> > Jul 07 20:02:02 home kernel:  rwsem_down_write_failed+0x175/0x250
> > Jul 07 20:02:02 home kernel:  ? ida_get_new_above+0x1a7/0x320
> > Jul 07 20:02:02 home kernel: call_rwsem_down_write_failed+0x17/0x30
> > Jul 07 20:02:02 home kernel:  ? call_rwsem_down_write_failed+0x17/0x30
> > Jul 07 20:02:02 home kernel:  down_write+0x24/0x40
> > Jul 07 20:02:02 home kernel:  register_shrinker+0x3e/0x90
> > Jul 07 20:02:02 home kernel:  sget_userns+0x409/0x470
> > Jul 07 20:02:02 home kernel:  ? get_anon_bdev+0x110/0x110
> > Jul 07 20:02:02 home kernel:  ? shmem_remount_fs+0x1c0/0x1c0
> > Jul 07 20:02:02 home kernel:  sget+0x5e/0x80
> > Jul 07 20:02:02 home kernel:  ? get_anon_bdev+0x110/0x110
> > Jul 07 20:02:02 home kernel:  mount_nodev+0x30/0xa0
> > Jul 07 20:02:02 home kernel:  shmem_mount+0x18/0x20
> > Jul 07 20:02:02 home kernel:  mount_fs+0x32/0x160
> > Jul 07 20:02:02 home kernel:  vfs_kern_mount.part.7+0x5d/0x110
> > Jul 07 20:02:02 home kernel:  do_mount+0x528/0xc60
> > Jul 07 20:02:02 home kernel:  ? _copy_from_user+0x4b/0x80
> > Jul 07 20:02:02 home kernel:  SyS_mount+0x5a/0xe0
> > Jul 07 20:02:02 home kernel:  do_syscall_64+0x54/0xc0
> > Jul 07 20:02:02 home kernel:  entry_SYSCALL64_slow_path+0x25/0x25
> > Jul 07 20:02:02 home kernel: RIP: 0033:0x7f35e30b98da
> > Jul 07 20:02:02 home kernel: RSP: 002b:00007ffc4ed06fb8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
> > Jul 07 20:02:02 home kernel: RAX: ffffffffffffffda RBX: 0000557b6f476480 RCX: 00007f35e30b98da
> > Jul 07 20:02:02 home kernel: RDX: 0000557b6d4f1402 RSI: 0000557b6f478b80 RDI: 0000557b6d4f1402
> > Jul 07 20:02:02 home kernel: RBP: 0000000000000000 R08: 0000557b6f478d80 R09: 0000000000000040
> > Jul 07 20:02:02 home kernel: R10: 0000000000000006 R11: 0000000000000202 R12: 00007ffc4ed07060
> > Jul 07 20:02:02 home kernel: R13: 0000557b6d4f17ff R14: 0000000000000000 R15: 0000000000000000
> > Jul 07 20:02:02 home kernel: INFO: task xfsaild/bcache0:5174 blocked for more than 120 seconds.
> > Jul 07 20:02:02 home kernel:       Tainted: G           O 4.11.7-1-ARCH #1
> > Jul 07 20:02:02 home kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jul 07 20:02:02 home kernel: xfsaild/bcache0 D    0  5174      2 0x00000000
> > Jul 07 20:02:02 home kernel: Call Trace:
> > Jul 07 20:02:02 home kernel:  __schedule+0x22e/0x8e0
> > Jul 07 20:02:02 home kernel:  schedule+0x3d/0x90
> > Jul 07 20:02:02 home kernel:  schedule_timeout+0x224/0x3c0
> > Jul 07 20:02:02 home kernel:  ? check_preempt_curr+0x27/0x80
> > Jul 07 20:02:02 home kernel:  ? ttwu_do_wakeup.isra.16+0x1e/0x170
> > Jul 07 20:02:02 home kernel:  wait_for_common+0xb9/0x180
> > Jul 07 20:02:02 home kernel:  ? wait_for_common+0xb9/0x180
> > Jul 07 20:02:02 home kernel:  ? wake_up_q+0x80/0x80
> > Jul 07 20:02:02 home kernel:  wait_for_completion+0x1d/0x20
> > Jul 07 20:02:02 home kernel:  flush_work+0x13b/0x1c0
> > Jul 07 20:02:02 home kernel:  ? flush_workqueue_prep_pwqs+0x190/0x190
> > Jul 07 20:02:02 home kernel:  xlog_cil_force_lsn+0x7c/0x200 [xfs]
> > Jul 07 20:02:02 home kernel:  ? try_to_del_timer_sync+0x53/0x80
> > Jul 07 20:02:02 home kernel:  _xfs_log_force+0x85/0x290 [xfs]
> > Jul 07 20:02:02 home kernel:  ? del_timer_sync+0x50/0x50
> > Jul 07 20:02:02 home kernel:  ? xfsaild+0x16a/0x7c0 [xfs]
> > Jul 07 20:02:02 home kernel:  xfs_log_force+0x2c/0x90 [xfs]
> > Jul 07 20:02:02 home kernel:  xfsaild+0x16a/0x7c0 [xfs]
> > Jul 07 20:02:02 home kernel:  kthread+0x125/0x140
> > Jul 07 20:02:02 home kernel:  ? kthread+0x125/0x140
> > Jul 07 20:02:02 home kernel:  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
> > Jul 07 20:02:02 home kernel:  ? kthread_create_on_node+0x70/0x70
> > Jul 07 20:02:02 home kernel:  ret_from_fork+0x2c/0x40
> 
> Is this a bcache issue?
> 


* Re: lock, Workqueue: bcache bch_data_insert_keys [bcache]
  2017-07-07 17:16 ` Eric Wheeler
@ 2017-07-08 11:43   ` Konstantin Shalygin
  0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Shalygin @ 2017-07-08 11:43 UTC (permalink / raw)
  To: Eric Wheeler; +Cc: linux-bcache

Eric, thanks. I applied this patch on 4.11.9.

I hope that I will not stumble upon this lock again.


On 07/08/2017 12:16 AM, Eric Wheeler wrote:
> Hi Konstantin,
>
> Try this patch:
>
> https://www.spinics.net/lists/stable/msg178933.html

-- 
Best regards,
Konstantin Shalygin


* Re: lock, Workqueue: bcache bch_data_insert_keys [bcache]
@ 2017-08-06  3:40 Konstantin Shalygin
  0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Shalygin @ 2017-08-06  3:40 UTC (permalink / raw)
  To: Eric Wheeler; +Cc: linux-bcache

[-- Attachment #1: Type: text/plain, Size: 111 bytes --]

After a month of usage I can confirm that this patch resolves this issue.

-- 
Best regards,
Konstantin Shalygin


[-- Attachment #2: linux-4.11-bcache_fix_for_gc_and_write-back_race.diff --]
[-- Type: text/x-patch, Size: 3576 bytes --]

From: Tang Junhui <tang.junhui@xxxxxxxxxx>

gc and write-back race with each other (see the email "bcache get stucked" I sent
before):
gc thread						write-back thread
|							|bch_writeback_thread()
|bch_gc_thread()					|
|							|==>read_dirty()
|==>bch_btree_gc()					|
|==>btree_root() //get btree root			|
|			node write lock			|
|==>bch_btree_gc_root()					|
|							|==>read_dirty_submit()
|							|==>write_dirty()
|							|==>continue_at(cl, write_dirty_finish, system_wq);
|							|==>write_dirty_finish() //executes in system_wq
|							|==>bch_btree_insert()
|							|==>bch_btree_map_leaf_nodes()
|							|==>__bch_btree_map_nodes()
|							|==>btree_root //try to get btree root node read lock
|							|-----stuck here
|==>bch_btree_set_root()				|
|==>bch_journal_meta()					|
|==>bch_journal()					|
|==>journal_try_write()					|
|==>journal_write_unlocked() //journal_full(&c->journal) condition satisfied
|==>continue_at(cl, journal_write, system_wq); //try to execute journal_write in system_wq
|					//but the work queue is executing write_dirty_finish()
|==>closure_sync(); //wait for journal_write to finish and wake up gc,
|			--stuck here
|==>release root node write lock

This patch allocates a separate workqueue for the write-back thread to avoid
this race.

Signed-off-by: Tang Junhui <tang.junhui@xxxxxxxxxx>
---
diff -Naupr linux-4.11_orig/drivers/md/bcache/bcache.h linux-4.11/drivers/md/bcache/bcache.h
--- linux-4.11_orig/drivers/md/bcache/bcache.h	2017-05-01 09:47:48.000000000 +0700
+++ linux-4.11/drivers/md/bcache/bcache.h	2017-07-08 11:49:25.044671097 +0700
@@ -333,6 +333,7 @@ struct cached_dev {
 	/* Limit number of writeback bios in flight */
 	struct semaphore	in_flight;
 	struct task_struct	*writeback_thread;
+	struct workqueue_struct	*writeback_write_wq;
 
 	struct keybuf		writeback_keys;
 
diff -Naupr linux-4.11_orig/drivers/md/bcache/super.c linux-4.11/drivers/md/bcache/super.c
--- linux-4.11_orig/drivers/md/bcache/super.c	2017-05-01 09:47:48.000000000 +0700
+++ linux-4.11/drivers/md/bcache/super.c	2017-07-08 11:50:16.574952148 +0700
@@ -1061,6 +1061,8 @@ static void cached_dev_free(struct closu
 	cancel_delayed_work_sync(&dc->writeback_rate_update);
 	if (!IS_ERR_OR_NULL(dc->writeback_thread))
 		kthread_stop(dc->writeback_thread);
+	if (dc->writeback_write_wq)
+		destroy_workqueue(dc->writeback_write_wq);
 
 	mutex_lock(&bch_register_lock);
 
diff -Naupr linux-4.11_orig/drivers/md/bcache/writeback.c linux-4.11/drivers/md/bcache/writeback.c
--- linux-4.11_orig/drivers/md/bcache/writeback.c	2017-05-01 09:47:48.000000000 +0700
+++ linux-4.11/drivers/md/bcache/writeback.c	2017-07-08 11:52:21.799010810 +0700
@@ -186,7 +186,7 @@ static void write_dirty(struct closure *
 
 	closure_bio_submit(&io->bio, cl);
 
-	continue_at(cl, write_dirty_finish, system_wq);
+	continue_at(cl, write_dirty_finish, io->dc->writeback_write_wq);
 }
 
 static void read_dirty_endio(struct bio *bio)
@@ -206,7 +206,7 @@ static void read_dirty_submit(struct clo
 
 	closure_bio_submit(&io->bio, cl);
 
-	continue_at(cl, write_dirty, system_wq);
+	continue_at(cl, write_dirty, io->dc->writeback_write_wq);
 }
 
 static void read_dirty(struct cached_dev *dc)
@@ -516,6 +516,10 @@ void bch_cached_dev_writeback_init(struc
 
 int bch_cached_dev_writeback_start(struct cached_dev *dc)
 {
+	dc->writeback_write_wq = alloc_workqueue("bcache_writeback_wq", WQ_MEM_RECLAIM, 0);
+	if (!dc->writeback_write_wq)
+		return -ENOMEM;
+
 	dc->writeback_thread = kthread_create(bch_writeback_thread, dc,
 					      "bcache_writeback");
 	if (IS_ERR(dc->writeback_thread))
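
The circular wait described in the commit message above can be sketched as a
small wait-for graph. This is a userspace toy model, not bcache code: the
actor names only mirror the functions named in the message, and it assumes, as
the message states, that journal_write cannot start on a shared queue while a
blocked write_dirty_finish occupies it. The helper names has_wait_cycle() and
deadlocks() are hypothetical.

```c
#include <stdbool.h>

/* Actors in the toy model; names mirror functions from the commit message. */
enum { GC_THREAD, WRITE_DIRTY_FINISH, JOURNAL_WRITE, NUM_ACTORS };
#define NOBODY (-1)

/* True if following the "waits on" edges from `start` leads back to it. */
static bool has_wait_cycle(const int waits_on[NUM_ACTORS], int start)
{
	int cur = waits_on[start];

	for (int steps = 0; steps < NUM_ACTORS && cur != NOBODY; steps++) {
		if (cur == start)
			return true;
		cur = waits_on[cur];
	}
	return false;
}

/* Build the wait-for graph for a shared or a separate work queue. */
static bool deadlocks(bool shared_queue)
{
	int waits_on[NUM_ACTORS];

	/* write_dirty_finish needs the root read lock; gc holds it for write. */
	waits_on[WRITE_DIRTY_FINISH] = GC_THREAD;
	/* gc sits in closure_sync() until journal_write has run. */
	waits_on[GC_THREAD] = JOURNAL_WRITE;
	/*
	 * Model assumption: on a shared queue, journal_write is stuck behind
	 * the blocked write_dirty_finish; on its own queue it waits on nobody.
	 */
	waits_on[JOURNAL_WRITE] = shared_queue ? WRITE_DIRTY_FINISH : NOBODY;

	return has_wait_cycle(waits_on, GC_THREAD);
}
```

In this model deadlocks(true) finds the cycle gc -> journal_write ->
write_dirty_finish -> gc, while deadlocks(false) finds none, which is the
effect the patch aims for by moving write_dirty and write_dirty_finish onto
writeback_write_wq.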

