All of lore.kernel.org
 help / color / mirror / Atom feed
* btrfs hang on nfs?
@ 2019-01-09 10:54 Scott E. Blomquist
  2019-01-09 12:14 ` Jojo
  0 siblings, 1 reply; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-09 10:54 UTC (permalink / raw)
  To: linux-btrfs


Hi All,

I have a system that has been hanging/wedging frequently.  It seems to
be load dependent but have not been able to isolate the problem.  The
only option once the hang happens is to reboot via
/proc/sysreq_trigger

In dmesg I see this...

    [Tue Jan  8 16:03:40 2019] perf: interrupt took too long (3193 > 3176), lowering kernel.perf_event_max_sample_rate to 62500
    [Tue Jan  8 16:25:19 2019] perf: interrupt took too long (4013 > 3991), lowering kernel.perf_event_max_sample_rate to 49750
    [Tue Jan  8 17:01:20 2019] perf: interrupt took too long (5043 > 5016), lowering kernel.perf_event_max_sample_rate to 39500
    [Tue Jan  8 17:16:47 2019] INFO: task btrfs-transacti:2098 blocked for more than 120 seconds.
    [Tue Jan  8 17:16:47 2019]       Not tainted 4.17.14-custom #1
    [Tue Jan  8 17:16:47 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [Tue Jan  8 17:16:47 2019] btrfs-transacti D    0  2098      2 0x80000000
    [Tue Jan  8 17:16:47 2019] Call Trace:
    [Tue Jan  8 17:16:47 2019]  ? __schedule+0x2cf/0x850
    [Tue Jan  8 17:16:47 2019]  schedule+0x32/0x80
    [Tue Jan  8 17:16:47 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
    [Tue Jan  8 17:16:47 2019]  ? wait_woken+0x80/0x80
    [Tue Jan  8 17:16:47 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
    [Tue Jan  8 17:16:47 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
    [Tue Jan  8 17:16:47 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
    [Tue Jan  8 17:16:47 2019]  ? btrfs_run_delayed_refs+0x93/0x1e0 [btrfs]
    [Tue Jan  8 17:16:47 2019]  commit_cowonly_roots+0x1f0/0x280 [btrfs]
    [Tue Jan  8 17:16:47 2019]  btrfs_commit_transaction+0x3a2/0x910 [btrfs]
    [Tue Jan  8 17:16:47 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
    [Tue Jan  8 17:16:47 2019]  transaction_kthread+0x14d/0x180 [btrfs]
    [Tue Jan  8 17:16:47 2019]  kthread+0xf8/0x130
    [Tue Jan  8 17:16:47 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
    [Tue Jan  8 17:16:47 2019]  ? kthread_bind+0x10/0x10
    [Tue Jan  8 17:16:47 2019]  ret_from_fork+0x35/0x40
    [Tue Jan  8 17:16:47 2019] INFO: task nfsd:4154 blocked for more than 120 seconds.
    [Tue Jan  8 17:16:47 2019]       Not tainted 4.17.14-custom #1
    [Tue Jan  8 17:16:47 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [Tue Jan  8 17:16:47 2019] nfsd            D    0  4154      2 0x80000000
    [Tue Jan  8 17:16:47 2019] Call Trace:
    [Tue Jan  8 17:16:47 2019]  ? __schedule+0x2cf/0x850
    [Tue Jan  8 17:16:47 2019]  ? iput+0x6f/0x1b0
    [Tue Jan  8 17:16:47 2019]  schedule+0x32/0x80
    [Tue Jan  8 17:16:47 2019]  rwsem_down_write_failed+0x1e0/0x350
    [Tue Jan  8 17:16:47 2019]  call_rwsem_down_write_failed+0x13/0x20
    [Tue Jan  8 17:16:47 2019]  down_write+0x29/0x40
    [Tue Jan  8 17:16:47 2019]  btrfs_file_write_iter+0xac/0x570 [btrfs]
    [Tue Jan  8 17:16:47 2019]  ? nfsd_setuser+0x103/0x270 [nfsd]
    [Tue Jan  8 17:16:47 2019]  do_iter_readv_writev+0xff/0x150
    [Tue Jan  8 17:16:47 2019]  do_iter_write+0x78/0x180
    [Tue Jan  8 17:16:47 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
    [Tue Jan  8 17:16:47 2019]  nfsd_write+0x84/0x150 [nfsd]
    [Tue Jan  8 17:16:47 2019]  nfsd3_proc_write+0xcc/0x150 [nfsd]
    [Tue Jan  8 17:16:47 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
    [Tue Jan  8 17:16:47 2019]  svc_process_common+0x382/0x730 [sunrpc]
    [Tue Jan  8 17:16:47 2019]  svc_process+0xeb/0x100 [sunrpc]
    [Tue Jan  8 17:16:47 2019]  nfsd+0xe3/0x150 [nfsd]
    [Tue Jan  8 17:16:47 2019]  kthread+0xf8/0x130
    [Tue Jan  8 17:16:47 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
    [Tue Jan  8 17:16:47 2019]  ? kthread_bind+0x10/0x10
    [Tue Jan  8 17:16:47 2019]  ret_from_fork+0x35/0x40
    ....

here is the other relevant info...

Linux kanlabfs 4.17.14-custom #1 SMP Sun Aug 12 11:54:00 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

btrfs-progs v4.17.1

Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
        Total devices 3 FS bytes used 75.72TiB
        devid    1 size 47.30TiB used 42.22TiB path /dev/sda1
        devid    2 size 21.83TiB used 16.76TiB path /dev/sdb1
        devid    3 size 21.83TiB used 16.76TiB path /dev/sdc1

btrfs fi df /export/
Data, single: total=75.55TiB, used=75.54TiB
System, single: total=36.00MiB, used=7.89MiB
Metadata, single: total=187.01GiB, used=185.65GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Thanks for any help,

Cheers,

sb.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-09 10:54 btrfs hang on nfs? Scott E. Blomquist
@ 2019-01-09 12:14 ` Jojo
  2019-01-09 13:31   ` Scott E. Blomquist
  0 siblings, 1 reply; 12+ messages in thread
From: Jojo @ 2019-01-09 12:14 UTC (permalink / raw)
  To: Scott E. Blomquist, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 1861 bytes --]

Am 09.01.19 um 11:54 schrieb Scott E. Blomquist:
> 
> Hi All,
> 
> I have a system that has been hanging/wedging frequently.  It seems to
> be load dependent but have not been able to isolate the problem.  The
> only option once the hang happens is to reboot via
> /proc/sysreq_trigger
> 
> btrfs fi df /export/
> Data, single: total=75.55TiB, used=75.54TiB
> System, single: total=36.00MiB, used=7.89MiB
> Metadata, single: total=187.01GiB, used=185.65GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B


Within scheduler there were some changes according to device queueing.
Kernels after 4.16 and before 4.19.8 were problematic, as we recognized
here.
This seemed to be a regression on the "kyber" scheduler, as recognized here:
[root@pc7 sysctl.d]# cat /sys/block/sdc/queue/scheduler
mq-deadline [kyber] bfq none

As work around you may change the disc scheduler to an "old" one, like

Test it like this for all of your devices:
root@pc7:~/ # echo bfq > /sys/block/sdX/queue/scheduler

For permanent add a file config to your /etc/sysctl.d.
EXAMPLE, edit it to your requirements.

[root@pc7 sysctl.d]# cat /etc/sysctl.d/90-sysscheduler.conf
#disk scheduler
block.sda.queue.scheduler = md-deadline
block.sdb.queue.scheduler = md-deadline
block.sdg.queue.scheduler = bfq
block.sdh.queue.scheduler = bfq
block.sdi.queue.scheduler = bfq


Hope this may help you.
It did work for me here.

Better Idea is to upgrade the kernel to 4.19.13 (which is a LTS Kernel)

mit freundlichen Grüßen
Jürgen Sauer
-- 
Jürgen Sauer - automatiX GmbH,
+49-4209-4699, juergen.sauer@automatix.de
Geschäftsführer: Jürgen Sauer,
Gerichtstand: Amtsgericht Walsrode • HRB 120986
Ust-Id: DE191468481 • St.Nr.: 36/211/08000
GPG Public Key zur Signaturprüfung:
http://www.automatix.de/juergen_sauer_publickey.gpg


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-09 12:14 ` Jojo
@ 2019-01-09 13:31   ` Scott E. Blomquist
  2019-01-10 11:46     ` Scott E. Blomquist
  0 siblings, 1 reply; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-09 13:31 UTC (permalink / raw)
  To: Jojo; +Cc: Scott E. Blomquist, linux-btrfs


Jojo writes:
 > 
 > Within scheduler there were some changes according to device queueing.
 > Kernels after 4.16 and before 4.19.8 were problematic, as we recognized
 > here.
 > This seemed to be a regression on the "kyber" scheduler, as recognized here:
 > [root@pc7 sysctl.d]# cat /sys/block/sdc/queue/scheduler
 > mq-deadline [kyber] bfq none
 > 
 > As work around you may change the disc scheduler to an "old" one, like
 > 
 > Test it like this for all of your devices:
 > root@pc7:~/ # echo bfq > /sys/block/sdX/queue/scheduler
 > 
 > For permanent add a file config to your /etc/sysctl.d.
 > EXAMPLE, edit it to your requirements.
 > 
 > [root@pc7 sysctl.d]# cat /etc/sysctl.d/90-sysscheduler.conf
 > #disk scheduler
 > block.sda.queue.scheduler = md-deadline
 > block.sdb.queue.scheduler = md-deadline
 > block.sdg.queue.scheduler = bfq
 > block.sdh.queue.scheduler = bfq
 > block.sdi.queue.scheduler = bfq
 > 
 > 
 > Hope this may help you.
 > It did work for me here.
 > 
 > Better Idea is to upgrade the kernel to 4.19.13 (which is a LTS Kernel)
 > 
 > mit freundlichen Gr  en
 > J rgen Sauer

Thank you so much, I would have never guessed that.

In my case the scheduler was set to deadline

    cat /sys/block/sd[a-c]/queue/scheduler
    noop [deadline] cfq
    noop [deadline] cfq
    noop [deadline] cfq

I have set it to cfq, I did not see an option for bfq.

I am also building 4.19.13 and will swap that in ASAP.

Thanks again,

Cheers,

sb. Scott Blomquist

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-09 13:31   ` Scott E. Blomquist
@ 2019-01-10 11:46     ` Scott E. Blomquist
  2019-01-10 11:51       ` Nikolay Borisov
  0 siblings, 1 reply; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-10 11:46 UTC (permalink / raw)
  To: Jojo, linux-btrfs; +Cc: sb


Scott E. Blomquist writes:
 > 
 > Thank you so much, I would have never guessed that.
 > 
 > In my case the scheduler was set to deadline
 > 
 >     cat /sys/block/sd[a-c]/queue/scheduler
 >     noop [deadline] cfq
 >     noop [deadline] cfq
 >     noop [deadline] cfq
 > 
 > I have set it to cfq, I did not see an option for bfq.
 > 
 > I am also building 4.19.13 and will swap that in ASAP.
 > 

Unfortunately the cfq scheduler did not help.  The system wedged.

I did notice this for the first time...

    [Wed Jan  9 06:03:41 2019] BTRFS info (device sda1): the free space cache file (83320273633280) is invalid, skip it

anything I should do about that?

The messages were similar...

    [Wed Jan  9 23:52:04 2019] INFO: task nfsd:2997 blocked for more than 120 seconds.
    [Wed Jan  9 23:52:04 2019]       Not tainted 4.17.14-custom #1
    [Wed Jan  9 23:52:04 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [Wed Jan  9 23:52:04 2019] nfsd            D    0  2997      2 0x80000000
    [Wed Jan  9 23:52:04 2019] Call Trace:
    [Wed Jan  9 23:52:04 2019]  ? __schedule+0x2cf/0x850
    [Wed Jan  9 23:52:04 2019]  ? iput+0x6f/0x1b0
    [Wed Jan  9 23:52:04 2019]  schedule+0x32/0x80
    [Wed Jan  9 23:52:04 2019]  rwsem_down_write_failed+0x1e0/0x350
    [Wed Jan  9 23:52:04 2019]  call_rwsem_down_write_failed+0x13/0x20
    [Wed Jan  9 23:52:04 2019]  down_write+0x29/0x40
    [Wed Jan  9 23:52:04 2019]  btrfs_file_write_iter+0xac/0x570 [btrfs]
    [Wed Jan  9 23:52:04 2019]  ? nfsd_setuser+0x103/0x270 [nfsd]
    [Wed Jan  9 23:52:04 2019]  do_iter_readv_writev+0xff/0x150
    [Wed Jan  9 23:52:04 2019]  do_iter_write+0x78/0x180
    [Wed Jan  9 23:52:04 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
    [Wed Jan  9 23:52:04 2019]  nfsd_write+0x84/0x150 [nfsd]
    [Wed Jan  9 23:52:04 2019]  nfsd3_proc_write+0xcc/0x150 [nfsd]
    [Wed Jan  9 23:52:04 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
    [Wed Jan  9 23:52:04 2019]  svc_process_common+0x382/0x730 [sunrpc]
    [Wed Jan  9 23:52:04 2019]  svc_process+0xeb/0x100 [sunrpc]
    [Wed Jan  9 23:52:04 2019]  nfsd+0xe3/0x150 [nfsd]
    [Wed Jan  9 23:52:04 2019]  kthread+0xf8/0x130
    [Wed Jan  9 23:52:04 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
    [Wed Jan  9 23:52:04 2019]  ? kthread_bind+0x10/0x10
    [Wed Jan  9 23:52:04 2019]  ret_from_fork+0x35/0x40
... these two were different and at the tail of the dmesg...
    [Wed Jan  9 23:52:05 2019] INFO: task kworker/u18:8:24049 blocked for more than 120 seconds.
    [Wed Jan  9 23:52:05 2019]       Not tainted 4.17.14-custom #1
    [Wed Jan  9 23:52:05 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [Wed Jan  9 23:52:05 2019] kworker/u18:8   D    0 24049      2 0x80000000
    [Wed Jan  9 23:52:05 2019] Workqueue: writeback wb_workfn (flush-btrfs-1)
    [Wed Jan  9 23:52:05 2019] Call Trace:
    [Wed Jan  9 23:52:05 2019]  ? __schedule+0x2cf/0x850
    [Wed Jan  9 23:52:05 2019]  schedule+0x32/0x80
    [Wed Jan  9 23:52:05 2019]  wait_current_trans+0x87/0xd0 [btrfs]
    [Wed Jan  9 23:52:05 2019]  ? wait_woken+0x80/0x80
    [Wed Jan  9 23:52:05 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
    [Wed Jan  9 23:52:05 2019]  find_free_extent+0x104b/0x1060 [btrfs]
    [Wed Jan  9 23:52:05 2019]  ? alloc_debug_processing+0x7a/0x12c
    [Wed Jan  9 23:52:05 2019]  btrfs_reserve_extent+0xf6/0x1f0 [btrfs]
    [Wed Jan  9 23:52:05 2019]  cow_file_range.isra.67+0x119/0x440 [btrfs]
    [Wed Jan  9 23:52:05 2019]  run_delalloc_range+0xff/0x380 [btrfs]
    [Wed Jan  9 23:52:05 2019]  writepage_delalloc+0xfb/0x150 [btrfs]
    [Wed Jan  9 23:52:05 2019]  __extent_writepage+0xdf/0x2b0 [btrfs]
    [Wed Jan  9 23:52:05 2019]  ? invalid_page_referenced_vma+0x90/0x90
    [Wed Jan  9 23:52:05 2019]  extent_write_cache_pages+0x22f/0x3d0 [btrfs]
    [Wed Jan  9 23:52:05 2019]  extent_writepages+0x3f/0x60 [btrfs]
    [Wed Jan  9 23:52:05 2019]  do_writepages+0x1a/0x70
    [Wed Jan  9 23:52:05 2019]  __writeback_single_inode+0x3d/0x340
    [Wed Jan  9 23:52:05 2019]  writeback_sb_inodes+0x224/0x480
    [Wed Jan  9 23:52:05 2019]  __writeback_inodes_wb+0x87/0xb0
    [Wed Jan  9 23:52:05 2019]  wb_writeback+0x258/0x2f0
    [Wed Jan  9 23:52:05 2019]  ? cpumask_next+0x16/0x20
    [Wed Jan  9 23:52:05 2019]  wb_workfn+0x2df/0x430
    [Wed Jan  9 23:52:05 2019]  process_one_work+0x14c/0x350
    [Wed Jan  9 23:52:05 2019]  worker_thread+0x47/0x3e0
    [Wed Jan  9 23:52:05 2019]  kthread+0xf8/0x130
    [Wed Jan  9 23:52:05 2019]  ? rescuer_thread+0x2f0/0x2f0
    [Wed Jan  9 23:52:05 2019]  ? kthread_bind+0x10/0x10
    [Wed Jan  9 23:52:05 2019]  ret_from_fork+0x35/0x40
    [Wed Jan  9 23:54:07 2019] INFO: task dsmc:17747 blocked for more than 120 seconds.
    [Wed Jan  9 23:54:07 2019]       Not tainted 4.17.14-custom #1
    [Wed Jan  9 23:54:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [Wed Jan  9 23:54:07 2019] dsmc            D    0 17747      1 0x00000000
    [Wed Jan  9 23:54:07 2019] Call Trace:
    [Wed Jan  9 23:54:07 2019]  ? __schedule+0x2cf/0x850
    [Wed Jan  9 23:54:07 2019]  ? btrfs_set_path_blocking+0x31/0x60 [btrfs]
    [Wed Jan  9 23:54:07 2019]  schedule+0x32/0x80
    [Wed Jan  9 23:54:07 2019]  wait_current_trans+0x87/0xd0 [btrfs]
    [Wed Jan  9 23:54:07 2019]  ? wait_woken+0x80/0x80
    [Wed Jan  9 23:54:07 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
    [Wed Jan  9 23:54:07 2019]  btrfs_dirty_inode+0x49/0xe0 [btrfs]
    [Wed Jan  9 23:54:07 2019]  touch_atime+0x8c/0xb0
    [Wed Jan  9 23:54:07 2019]  iterate_dir+0x10a/0x180
    [Wed Jan  9 23:54:07 2019]  __x64_sys_getdents+0x8e/0x110
    [Wed Jan  9 23:54:07 2019]  ? iterate_dir+0x180/0x180
    [Wed Jan  9 23:54:07 2019]  do_syscall_64+0x55/0x110
    [Wed Jan  9 23:54:07 2019]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [Wed Jan  9 23:54:07 2019] RIP: 0033:0x7f0c982f9215
    [Wed Jan  9 23:54:07 2019] RSP: 002b:00007f0c918a95c0 EFLAGS: 00000246 ORIG_RAX: 000000000000004e
    [Wed Jan  9 23:54:07 2019] RAX: ffffffffffffffda RBX: 00007f0c3f348c60 RCX: 00007f0c982f9215
    [Wed Jan  9 23:54:07 2019] RDX: 0000000000008000 RSI: 00007f0c3f348c60 RDI: 0000000000000007
    [Wed Jan  9 23:54:07 2019] RBP: 00007f0c3f348c60 R08: 00000000000000ff R09: 0000000000001000
    [Wed Jan  9 23:54:07 2019] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a6672f
    [Wed Jan  9 23:54:07 2019] R13: 00007f0c918affe0 R14: 0000000000000000 R15: 00007f0c918affb0

I did just upgrade to ...

    Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux

and 

    root@kanlabfs:~/bin# btrfs --version
    btrfs-progs v4.19.1 

We'll see how things go with the new kernel and btrfs progs.

Cheers,

sb. Scott Blomquist

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-10 11:46     ` Scott E. Blomquist
@ 2019-01-10 11:51       ` Nikolay Borisov
  2019-01-10 12:00         ` Scott E. Blomquist
  0 siblings, 1 reply; 12+ messages in thread
From: Nikolay Borisov @ 2019-01-10 11:51 UTC (permalink / raw)
  To: Scott E. Blomquist, Jojo, linux-btrfs



On 10.01.19 г. 13:46 ч., Scott E. Blomquist wrote:
> 
> Scott E. Blomquist writes:
>  > 
>  > Thank you so much, I would have never guessed that.
>  > 
>  > In my case the scheduler was set to deadline
>  > 
>  >     cat /sys/block/sd[a-c]/queue/scheduler
>  >     noop [deadline] cfq
>  >     noop [deadline] cfq
>  >     noop [deadline] cfq
>  > 
>  > I have set it to cfq, I did not see an option for bfq.
>  > 
>  > I am also building 4.19.13 and will swap that in ASAP.
>  > 
> 
> Unfortunately the cfq scheduler did not help.  The system wedged.
> 
> I did notice this for the first time...
> 
>     [Wed Jan  9 06:03:41 2019] BTRFS info (device sda1): the free space cache file (83320273633280) is invalid, skip it

What you could do is mount btrfs with -o clear_cache to make btrfs
rebuild the freespace cache.

> 
> anything I should do about that?
> 
> The messages were similar...
> 
>     [Wed Jan  9 23:52:04 2019] INFO: task nfsd:2997 blocked for more than 120 seconds.
>     [Wed Jan  9 23:52:04 2019]       Not tainted 4.17.14-custom #1
>     [Wed Jan  9 23:52:04 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     [Wed Jan  9 23:52:04 2019] nfsd            D    0  2997      2 0x80000000
>     [Wed Jan  9 23:52:04 2019] Call Trace:
>     [Wed Jan  9 23:52:04 2019]  ? __schedule+0x2cf/0x850
>     [Wed Jan  9 23:52:04 2019]  ? iput+0x6f/0x1b0
>     [Wed Jan  9 23:52:04 2019]  schedule+0x32/0x80
>     [Wed Jan  9 23:52:04 2019]  rwsem_down_write_failed+0x1e0/0x350
>     [Wed Jan  9 23:52:04 2019]  call_rwsem_down_write_failed+0x13/0x20
>     [Wed Jan  9 23:52:04 2019]  down_write+0x29/0x40
>     [Wed Jan  9 23:52:04 2019]  btrfs_file_write_iter+0xac/0x570 [btrfs]
>     [Wed Jan  9 23:52:04 2019]  ? nfsd_setuser+0x103/0x270 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  do_iter_readv_writev+0xff/0x150
>     [Wed Jan  9 23:52:04 2019]  do_iter_write+0x78/0x180
>     [Wed Jan  9 23:52:04 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  nfsd_write+0x84/0x150 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  nfsd3_proc_write+0xcc/0x150 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  svc_process_common+0x382/0x730 [sunrpc]
>     [Wed Jan  9 23:52:04 2019]  svc_process+0xeb/0x100 [sunrpc]
>     [Wed Jan  9 23:52:04 2019]  nfsd+0xe3/0x150 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  kthread+0xf8/0x130
>     [Wed Jan  9 23:52:04 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
>     [Wed Jan  9 23:52:04 2019]  ? kthread_bind+0x10/0x10
>     [Wed Jan  9 23:52:04 2019]  ret_from_fork+0x35/0x40
> ... these two were different and at the tail of the dmesg...
>     [Wed Jan  9 23:52:05 2019] INFO: task kworker/u18:8:24049 blocked for more than 120 seconds.
>     [Wed Jan  9 23:52:05 2019]       Not tainted 4.17.14-custom #1
>     [Wed Jan  9 23:52:05 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     [Wed Jan  9 23:52:05 2019] kworker/u18:8   D    0 24049      2 0x80000000
>     [Wed Jan  9 23:52:05 2019] Workqueue: writeback wb_workfn (flush-btrfs-1)
>     [Wed Jan  9 23:52:05 2019] Call Trace:
>     [Wed Jan  9 23:52:05 2019]  ? __schedule+0x2cf/0x850
>     [Wed Jan  9 23:52:05 2019]  schedule+0x32/0x80
>     [Wed Jan  9 23:52:05 2019]  wait_current_trans+0x87/0xd0 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  ? wait_woken+0x80/0x80
>     [Wed Jan  9 23:52:05 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  find_free_extent+0x104b/0x1060 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  ? alloc_debug_processing+0x7a/0x12c
>     [Wed Jan  9 23:52:05 2019]  btrfs_reserve_extent+0xf6/0x1f0 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  cow_file_range.isra.67+0x119/0x440 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  run_delalloc_range+0xff/0x380 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  writepage_delalloc+0xfb/0x150 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  __extent_writepage+0xdf/0x2b0 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  ? invalid_page_referenced_vma+0x90/0x90
>     [Wed Jan  9 23:52:05 2019]  extent_write_cache_pages+0x22f/0x3d0 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  extent_writepages+0x3f/0x60 [btrfs]
>     [Wed Jan  9 23:52:05 2019]  do_writepages+0x1a/0x70
>     [Wed Jan  9 23:52:05 2019]  __writeback_single_inode+0x3d/0x340
>     [Wed Jan  9 23:52:05 2019]  writeback_sb_inodes+0x224/0x480
>     [Wed Jan  9 23:52:05 2019]  __writeback_inodes_wb+0x87/0xb0
>     [Wed Jan  9 23:52:05 2019]  wb_writeback+0x258/0x2f0
>     [Wed Jan  9 23:52:05 2019]  ? cpumask_next+0x16/0x20
>     [Wed Jan  9 23:52:05 2019]  wb_workfn+0x2df/0x430
>     [Wed Jan  9 23:52:05 2019]  process_one_work+0x14c/0x350
>     [Wed Jan  9 23:52:05 2019]  worker_thread+0x47/0x3e0
>     [Wed Jan  9 23:52:05 2019]  kthread+0xf8/0x130
>     [Wed Jan  9 23:52:05 2019]  ? rescuer_thread+0x2f0/0x2f0
>     [Wed Jan  9 23:52:05 2019]  ? kthread_bind+0x10/0x10
>     [Wed Jan  9 23:52:05 2019]  ret_from_fork+0x35/0x40
>     [Wed Jan  9 23:54:07 2019] INFO: task dsmc:17747 blocked for more than 120 seconds.
>     [Wed Jan  9 23:54:07 2019]       Not tainted 4.17.14-custom #1
>     [Wed Jan  9 23:54:07 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     [Wed Jan  9 23:54:07 2019] dsmc            D    0 17747      1 0x00000000
>     [Wed Jan  9 23:54:07 2019] Call Trace:
>     [Wed Jan  9 23:54:07 2019]  ? __schedule+0x2cf/0x850
>     [Wed Jan  9 23:54:07 2019]  ? btrfs_set_path_blocking+0x31/0x60 [btrfs]
>     [Wed Jan  9 23:54:07 2019]  schedule+0x32/0x80
>     [Wed Jan  9 23:54:07 2019]  wait_current_trans+0x87/0xd0 [btrfs]
>     [Wed Jan  9 23:54:07 2019]  ? wait_woken+0x80/0x80
>     [Wed Jan  9 23:54:07 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
>     [Wed Jan  9 23:54:07 2019]  btrfs_dirty_inode+0x49/0xe0 [btrfs]
>     [Wed Jan  9 23:54:07 2019]  touch_atime+0x8c/0xb0
>     [Wed Jan  9 23:54:07 2019]  iterate_dir+0x10a/0x180
>     [Wed Jan  9 23:54:07 2019]  __x64_sys_getdents+0x8e/0x110
>     [Wed Jan  9 23:54:07 2019]  ? iterate_dir+0x180/0x180
>     [Wed Jan  9 23:54:07 2019]  do_syscall_64+0x55/0x110
>     [Wed Jan  9 23:54:07 2019]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>     [Wed Jan  9 23:54:07 2019] RIP: 0033:0x7f0c982f9215
>     [Wed Jan  9 23:54:07 2019] RSP: 002b:00007f0c918a95c0 EFLAGS: 00000246 ORIG_RAX: 000000000000004e
>     [Wed Jan  9 23:54:07 2019] RAX: ffffffffffffffda RBX: 00007f0c3f348c60 RCX: 00007f0c982f9215
>     [Wed Jan  9 23:54:07 2019] RDX: 0000000000008000 RSI: 00007f0c3f348c60 RDI: 0000000000000007
>     [Wed Jan  9 23:54:07 2019] RBP: 00007f0c3f348c60 R08: 00000000000000ff R09: 0000000000001000
>     [Wed Jan  9 23:54:07 2019] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a6672f
>     [Wed Jan  9 23:54:07 2019] R13: 00007f0c918affe0 R14: 0000000000000000 R15: 00007f0c918affb0
> 

These don't tell the full story, what seems to be happening is that
stuff is waiting for transaction to finish but it's not evident which
thread is holding the transaction. Please, paste the output of
"echo w > /proc/sysrq-trigger" so we have full picture of what's blocked
where.


> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-10 11:51       ` Nikolay Borisov
@ 2019-01-10 12:00         ` Scott E. Blomquist
  2019-01-14 11:42           ` Scott E. Blomquist
  0 siblings, 1 reply; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-10 12:00 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Scott E. Blomquist, Jojo, linux-btrfs


Nikolay Borisov writes:
 > > 
 > > Unfortunately the cfq scheduler did not help.  The system wedged.
 > > 
 > > I did notice this for the first time...
 > > 
 > >     [Wed Jan  9 06:03:41 2019] BTRFS info (device sda1): the free space cache file (83320273633280) is invalid, skip it
 > 
 > What you could do is mount btrfs with -o clear_cache to make btrfs
 > rebuild the freespace cache.
 > 
 > > 
 > > anything I should do about that?
 > > 
 > > The messages were similar...
 > > 
 > >     [Wed Jan  9 23:52:04 2019] INFO: task nfsd:2997 blocked for more than 120 seconds.
 > >     [Wed Jan  9 23:52:04 2019]       Not tainted 4.17.14-custom #1
 > >     [Wed Jan  9 23:52:04 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 > >     [stuff deleted]
 > >     [Wed Jan  9 23:54:07 2019] RBP: 00007f0c3f348c60 R08: 00000000000000ff R09: 0000000000001000
 > >     [Wed Jan  9 23:54:07 2019] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a6672f
 > >     [Wed Jan  9 23:54:07 2019] R13: 00007f0c918affe0 R14: 0000000000000000 R15: 00007f0c918affb0
 > > 
 > 
 > These don't tell the full story, what seems to be happening is that
 > stuff is waiting for transaction to finish but it's not evident which
 > thread is holding the transaction. Please, paste the output of
 > "echo w > /proc/sysrq-trigger" so we have full picture of what's blocked
 > where.
 > 

Thanks.  I have not seen the free space cache message with the new
kernel.

Next time the hang pops up I'll echo w > /proc/sysrq-trigger and send
it along.

I am hoping that the new kernel will magically fix the problem.

Thanks again,

sb. Scott Blomquist

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-10 12:00         ` Scott E. Blomquist
@ 2019-01-14 11:42           ` Scott E. Blomquist
  2019-01-14 12:11             ` Nikolay Borisov
  0 siblings, 1 reply; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-14 11:42 UTC (permalink / raw)
  To: Scott E. Blomquist; +Cc: Nikolay Borisov, Jojo, linux-btrfs


Scott E. Blomquist writes:
 > Nikolay Borisov writes:
 > 
 > Thanks.  I have not seen the free space cache message with the new
 > kernel.
 > 
 > Next time the hang pops up I'll echo w > /proc/sysrq-trigger and send
 > it along.
 > 
 > I am hoping that the new kernel will magically fix the problem.
 > 
 > Thanks again,
 > 
 > sb. Scott Blomquist

The file system hung again below is the sysrq output

Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux

btrfs-progs v4.19.1 

# btrfs fi df /export/
Data, single: total=79.61TiB, used=79.61TiB
System, single: total=36.00MiB, used=8.31MiB
Metadata, single: total=192.01GiB, used=190.19GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

#  btrfs fi show
Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
        Total devices 3 FS bytes used 79.79TiB
        devid    1 size 47.30TiB used 43.58TiB path /dev/sda1
        devid    2 size 21.83TiB used 18.11TiB path /dev/sdb1
        devid    3 size 21.83TiB used 18.11TiB path /dev/sdc1

[Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State
[Mon Jan 14 06:24:26 2019]   task                        PC stack   pid father
[Mon Jan 14 06:24:26 2019] nmbd            D    0   944      1 0x00000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  schedule_timeout+0x14d/0x2e0
[Mon Jan 14 06:24:26 2019]  ? __next_timer_interrupt+0xd0/0xd0
[Mon Jan 14 06:24:26 2019]  io_schedule_timeout+0x19/0x40
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages+0x726/0x90e
[Mon Jan 14 06:24:26 2019]  ? __wake_up_common_lock+0x77/0x90
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages_ratelimited+0x254/0x3b0
[Mon Jan 14 06:24:26 2019]  fault_dirty_shared_page.isra.83+0x4a/0x90
[Mon Jan 14 06:24:26 2019]  do_wp_page+0x3e1/0x4d0
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  __handle_mm_fault+0x6ff/0x13a0
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  handle_mm_fault+0xfa/0x210
[Mon Jan 14 06:24:26 2019]  __do_page_fault+0x1fd/0x4f0
[Mon Jan 14 06:24:26 2019]  ? page_fault+0x8/0x30
[Mon Jan 14 06:24:26 2019]  page_fault+0x1e/0x30
[Mon Jan 14 06:24:26 2019] RIP: 0033:0x7f7032a41aba
[Mon Jan 14 06:24:26 2019] Code: Bad RIP value.
[Mon Jan 14 06:24:26 2019] RSP: 002b:00007fff0d205978 EFLAGS: 00010206
[Mon Jan 14 06:24:26 2019] RAX: ffffd6dd41976d84 RBX: 0000564d782587e0 RCX: 0000000000004058
[Mon Jan 14 06:24:26 2019] RDX: 000000000000202c RSI: 0000564d7825cdb0 RDI: 00007f70368e4000
[Mon Jan 14 06:24:26 2019] RBP: 0000000000000000 R08: 00002922be684248 R09: 00002922be684238
[Mon Jan 14 06:24:26 2019] R10: 00002922be684228 R11: 0000000000000201 R12: 000000000000202c
[Mon Jan 14 06:24:26 2019] R13: 0000000000003000 R14: 0000564d782587e0 R15: 0000564d7825cdb0
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1448      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1449      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1450      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1451      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1452      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1453      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1454      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? iput+0x6f/0x1b0
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  rwsem_down_write_failed+0x1ca/0x330
[Mon Jan 14 06:24:26 2019]  call_rwsem_down_write_failed+0x13/0x20
[Mon Jan 14 06:24:26 2019]  down_write+0x29/0x40
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0xac/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nfsd            D    0  1455      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x263/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_cont_expand+0x3e6/0x580 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_file_write_iter+0x3e8/0x560 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_iter_readv_writev+0x110/0x170
[Mon Jan 14 06:24:26 2019]  do_iter_write+0x78/0x180
[Mon Jan 14 06:24:26 2019]  nfsd_vfs_write+0xf0/0x440 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_write+0x84/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd3_proc_write+0xd9/0x160 [nfsd]
[Mon Jan 14 06:24:26 2019]  nfsd_dispatch+0xb7/0x250 [nfsd]
[Mon Jan 14 06:24:26 2019]  svc_process_common+0x382/0x730 [sunrpc]
[Mon Jan 14 06:24:26 2019]  svc_process+0xeb/0x100 [sunrpc]
[Mon Jan 14 06:24:26 2019]  nfsd+0xe3/0x150 [nfsd]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? nfsd_destroy+0x60/0x60 [nfsd]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] nscd            D    0  1508      1 0x00000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  schedule_timeout+0x14d/0x2e0
[Mon Jan 14 06:24:26 2019]  ? __next_timer_interrupt+0xd0/0xd0
[Mon Jan 14 06:24:26 2019]  io_schedule_timeout+0x19/0x40
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages+0x726/0x90e
[Mon Jan 14 06:24:26 2019]  ? __wake_up_common_lock+0x77/0x90
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages_ratelimited+0x254/0x3b0
[Mon Jan 14 06:24:26 2019]  fault_dirty_shared_page.isra.83+0x4a/0x90
[Mon Jan 14 06:24:26 2019]  do_wp_page+0x3e1/0x4d0
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  __handle_mm_fault+0x6ff/0x13a0
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  handle_mm_fault+0xfa/0x210
[Mon Jan 14 06:24:26 2019]  __do_page_fault+0x1fd/0x4f0
[Mon Jan 14 06:24:26 2019]  ? page_fault+0x8/0x30
[Mon Jan 14 06:24:26 2019]  page_fault+0x1e/0x30
[Mon Jan 14 06:24:26 2019] RIP: 0033:0x556ee8b26a90
[Mon Jan 14 06:24:26 2019] Code: Bad RIP value.
[Mon Jan 14 06:24:26 2019] RSP: 002b:00007fb600dd2af0 EFLAGS: 00010246
[Mon Jan 14 06:24:26 2019] RAX: 00007fb606fa06b8 RBX: 00000000ffffffff RCX: 00007fb606fa0848
[Mon Jan 14 06:24:26 2019] RDX: 00000000000002c0 RSI: 0000556ee8d3a228 RDI: 00007fb600dd2af0
[Mon Jan 14 06:24:26 2019] RBP: 00007fb600dd2d90 R08: 0000000000000079 R09: 0000000000000079
[Mon Jan 14 06:24:26 2019] R10: 000000005c3c70bd R11: 0000556ee8d3ade8 R12: 0000000000000000
[Mon Jan 14 06:24:26 2019] R13: 0000000000000000 R14: 00007fb606fa03c8 R15: 000000005c3c7103
[Mon Jan 14 06:24:26 2019] nscd            D    0  1509      1 0x00000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  schedule_timeout+0x14d/0x2e0
[Mon Jan 14 06:24:26 2019]  ? __next_timer_interrupt+0xd0/0xd0
[Mon Jan 14 06:24:26 2019]  io_schedule_timeout+0x19/0x40
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages+0x726/0x90e
[Mon Jan 14 06:24:26 2019]  ? __wake_up_common_lock+0x77/0x90
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages_ratelimited+0x254/0x3b0
[Mon Jan 14 06:24:26 2019]  fault_dirty_shared_page.isra.83+0x4a/0x90
[Mon Jan 14 06:24:26 2019]  do_wp_page+0x3e1/0x4d0
[Mon Jan 14 06:24:26 2019]  __handle_mm_fault+0x6ff/0x13a0
[Mon Jan 14 06:24:26 2019]  handle_mm_fault+0xfa/0x210
[Mon Jan 14 06:24:26 2019]  __do_page_fault+0x1fd/0x4f0
[Mon Jan 14 06:24:26 2019]  ? page_fault+0x8/0x30
[Mon Jan 14 06:24:26 2019]  page_fault+0x1e/0x30
[Mon Jan 14 06:24:26 2019] RIP: 0033:0x556ee8b21cf3
[Mon Jan 14 06:24:26 2019] Code: Bad RIP value.
[Mon Jan 14 06:24:26 2019] RSP: 002b:00007fb600bd1340 EFLAGS: 00010246
[Mon Jan 14 06:24:26 2019] RAX: 0000000000000019 RBX: 00007fb600bd1350 RCX: 00007fb602fd5e10
[Mon Jan 14 06:24:26 2019] RDX: 00000000000000ea RSI: 00007fb602fd5e6e RDI: 00007fb600bd13ae
[Mon Jan 14 06:24:26 2019] RBP: 00007fb600bd1570 R08: 0000000000000008 R09: 00007fb600bd139d
[Mon Jan 14 06:24:26 2019] R10: 00007fb600bd139d R11: 00007fb60dd70150 R12: 00007fb600bd1a20
[Mon Jan 14 06:24:26 2019] R13: 0000000000000001 R14: 00007fb600bd139d R15: 0000000000000001
[Mon Jan 14 06:24:26 2019] btrfs-transacti D    0  6808      2 0x80000000
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
[Mon Jan 14 06:24:26 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
[Mon Jan 14 06:24:26 2019]  commit_cowonly_roots+0x1ed/0x280 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  transaction_kthread+0x14d/0x180 [btrfs]
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] rsync           D    0 17161  16189 0x00000004
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_dirty_inode+0x49/0xe0 [btrfs]
[Mon Jan 14 06:24:26 2019]  touch_atime+0x8a/0xb0
[Mon Jan 14 06:24:26 2019]  generic_file_read_iter+0x835/0xa30
[Mon Jan 14 06:24:26 2019]  ? page_cache_tree_insert+0xa0/0xa0
[Mon Jan 14 06:24:26 2019]  __vfs_read+0xea/0x150
[Mon Jan 14 06:24:26 2019]  vfs_read+0x89/0x130
[Mon Jan 14 06:24:26 2019]  ksys_read+0x42/0x90
[Mon Jan 14 06:24:26 2019]  do_syscall_64+0x55/0x110
[Mon Jan 14 06:24:26 2019]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Mon Jan 14 06:24:26 2019] RIP: 0033:0x7fa5c0c8c360
[Mon Jan 14 06:24:26 2019] Code: Bad RIP value.
[Mon Jan 14 06:24:26 2019] RSP: 002b:00007ffca2f09c18 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Mon Jan 14 06:24:26 2019] RAX: ffffffffffffffda RBX: 0000000008060840 RCX: 00007fa5c0c8c360
[Mon Jan 14 06:24:26 2019] RDX: 0000000000040000 RSI: 000000001008d420 RDI: 0000000000000003
[Mon Jan 14 06:24:26 2019] RBP: 0000000000040000 R08: 00000000000c0031 R09: 00007fa5c0f5f7b8
[Mon Jan 14 06:24:26 2019] R10: 00007fa5c0f5f7b8 R11: 0000000000000246 R12: 0000000000000000
[Mon Jan 14 06:24:26 2019] R13: 0000000000040000 R14: 0000000000000000 R15: 0000000000000000
[Mon Jan 14 06:24:26 2019] kworker/u18:4   D    0  5653      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:3   D    0  6259      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:1   D    0  7082      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:5   D    0  7413      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u16:1   D    0  7415      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-freespace-write btrfs_freespace_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  schedule_timeout+0x14d/0x2e0
[Mon Jan 14 06:24:26 2019]  ? __next_timer_interrupt+0xd0/0xd0
[Mon Jan 14 06:24:26 2019]  io_schedule_timeout+0x19/0x40
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages+0x726/0x90e
[Mon Jan 14 06:24:26 2019]  balance_dirty_pages_ratelimited+0x254/0x3b0
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x3d3/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x14/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:0   D    0  7644      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:6   D    0  7695      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:7   D    0  7696      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:9   D    0  7698      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_finish_ordered_io+0x453/0x7c0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  ? syscall_return_via_sysret+0x13/0x83
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jan 14 06:24:26 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jan 14 06:24:26 2019]  normal_work_helper+0xcc/0x310 [btrfs]
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
[Mon Jan 14 06:24:26 2019] kworker/u18:10  D    0  7700      2 0x80000000
[Mon Jan 14 06:24:26 2019] Workqueue: writeback wb_workfn (flush-btrfs-1)
[Mon Jan 14 06:24:26 2019] Call Trace:
[Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
[Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
[Mon Jan 14 06:24:26 2019]  wait_current_trans+0x87/0xd0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
[Mon Jan 14 06:24:26 2019]  start_transaction+0x1ef/0x3f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  find_free_extent+0xf96/0x1080 [btrfs]
[Mon Jan 14 06:24:26 2019]  btrfs_reserve_extent+0xf6/0x1f0 [btrfs]
[Mon Jan 14 06:24:26 2019]  cow_file_range.isra.68+0x119/0x440 [btrfs]
[Mon Jan 14 06:24:26 2019]  run_delalloc_range+0xff/0x380 [btrfs]
[Mon Jan 14 06:24:26 2019]  writepage_delalloc+0xfb/0x150 [btrfs]
[Mon Jan 14 06:24:26 2019]  __extent_writepage+0xdf/0x2b0 [btrfs]
[Mon Jan 14 06:24:26 2019]  ? invalid_page_referenced_vma+0x90/0x90
[Mon Jan 14 06:24:26 2019]  extent_write_cache_pages+0x22f/0x3d0 [btrfs]
[Mon Jan 14 06:24:26 2019]  extent_writepages+0x3f/0x60 [btrfs]
[Mon Jan 14 06:24:26 2019]  do_writepages+0x1a/0x70
[Mon Jan 14 06:24:26 2019]  __writeback_single_inode+0x3d/0x340
[Mon Jan 14 06:24:26 2019]  ? update_load_avg+0x94/0x5e0
[Mon Jan 14 06:24:26 2019]  writeback_sb_inodes+0x224/0x480
[Mon Jan 14 06:24:26 2019]  __writeback_inodes_wb+0x87/0xb0
[Mon Jan 14 06:24:26 2019]  wb_writeback+0x258/0x2f0
[Mon Jan 14 06:24:26 2019]  wb_workfn+0x190/0x430
[Mon Jan 14 06:24:26 2019]  process_one_work+0x165/0x370
[Mon Jan 14 06:24:26 2019]  worker_thread+0x49/0x3e0
[Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
[Mon Jan 14 06:24:26 2019]  ? rescuer_thread+0x2f0/0x2f0
[Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
[Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40

Thanks for any help.

sb. Scott Blomquist

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-14 11:42           ` Scott E. Blomquist
@ 2019-01-14 12:11             ` Nikolay Borisov
  2019-01-14 13:13               ` Scott E. Blomquist
  0 siblings, 1 reply; 12+ messages in thread
From: Nikolay Borisov @ 2019-01-14 12:11 UTC (permalink / raw)
  To: Scott E. Blomquist; +Cc: Jojo, linux-btrfs



On 14.01.19 г. 13:42 ч., Scott E. Blomquist wrote:
> 
> Scott E. Blomquist writes:
>  > Nikolay Borisov writes:
>  > 
>  > Thanks.  I have not seen the free space cache message with the new
>  > kernel.
>  > 
>  > Next time the hang pops up I'll echo w > /proc/sysrq-trigger and send
>  > it along.
>  > 
>  > I am hoping that the new kernel will magically fix the problem.
>  > 
>  > Thanks again,
>  > 
>  > sb. Scott Blomquist
> 
> The file system hung again below is the sysrq output
> 
> Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> 
> btrfs-progs v4.19.1 
> 
> # btrfs fi df /export/
> Data, single: total=79.61TiB, used=79.61TiB
> System, single: total=36.00MiB, used=8.31MiB
> Metadata, single: total=192.01GiB, used=190.19GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B

So this btrfs is hosted on your local machine but it is exported via
NFS, correct?

> 
> #  btrfs fi show
> Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
>         Total devices 3 FS bytes used 79.79TiB
>         devid    1 size 47.30TiB used 43.58TiB path /dev/sda1
>         devid    2 size 21.83TiB used 18.11TiB path /dev/sdb1
>         devid    3 size 21.83TiB used 18.11TiB path /dev/sdc1

What kind of disks are those, presumably spinning rust due to their size
but what model/make?

> 
> [Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State

<snip>

> [Mon Jan 14 06:24:26 2019] btrfs-transacti D    0  6808      2 0x80000000
> [Mon Jan 14 06:24:26 2019] Call Trace:
> [Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
> [Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
> [Mon Jan 14 06:24:26 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
> [Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
> [Mon Jan 14 06:24:26 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
> [Mon Jan 14 06:24:26 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
> [Mon Jan 14 06:24:26 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
> [Mon Jan 14 06:24:26 2019]  ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
> [Mon Jan 14 06:24:26 2019]  commit_cowonly_roots+0x1ed/0x280 [btrfs]
> [Mon Jan 14 06:24:26 2019]  btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
> [Mon Jan 14 06:24:26 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
> [Mon Jan 14 06:24:26 2019]  transaction_kthread+0x14d/0x180 [btrfs]
> [Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
> [Mon Jan 14 06:24:26 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
> [Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
> [Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40

So the transaction is being committed as a result of that
btrfs_start_ordered_extent, which flushes data to disk. Since you've
compiled your kernel can you run the following command from the kernel's
source:

./scripts/faddr2line  vmlinux  btrfs_start_ordered_extent+0xca/0x100

'vmlinux' should be the kernel executable with debug info that results
from compiling the kernel. I want to figure out which line exactly
btrfs_start_ordered_extent+0xca/0x100 resolves to.


<snip>

All other processes are waiting for the transaction to commit. So we
have to figure out why btrfs_commit_transaction doesn't finish.

> 
> Thanks for any help.
> 
> sb. Scott Blomquist
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-14 12:11             ` Nikolay Borisov
@ 2019-01-14 13:13               ` Scott E. Blomquist
  2019-01-14 13:28                 ` Nikolay Borisov
  2019-01-15  8:16                 ` Nikolay Borisov
  0 siblings, 2 replies; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-14 13:13 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Scott E. Blomquist, Jojo, linux-btrfs


Nikolay Borisov writes:
 > 
 > On 14.01.19 г. 13:42 ч., Scott E. Blomquist wrote:
 > > 
 <snip>
 > > 
 > > The file system hung again below is the sysrq output
 > > 
 > > Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
 > > 
 > > btrfs-progs v4.19.1 
 > > 
 > > # btrfs fi df /export/
 > > Data, single: total=79.61TiB, used=79.61TiB
 > > System, single: total=36.00MiB, used=8.31MiB
 > > Metadata, single: total=192.01GiB, used=190.19GiB
 > > GlobalReserve, single: total=512.00MiB, used=0.00B
 > 
 > So this btrfs is hosted on your local machine but it is exported via
 > NFS, correct?

Correct and via samba also

 > > 
 > > #  btrfs fi show
 > > Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
 > >         Total devices 3 FS bytes used 79.79TiB
 > >         devid    1 size 47.30TiB used 43.58TiB path /dev/sda1
 > >         devid    2 size 21.83TiB used 18.11TiB path /dev/sdb1
 > >         devid    3 size 21.83TiB used 18.11TiB path /dev/sdc1
 > 
 > What kind of disks are those, presumably spinning rust due to their size
 > but what model/make?
 > 

3 x raid 6 on a LSI MegaRAID SAS 9271-8i

 > > [Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State
 > 
 > <snip>
 > 
 > > [Mon Jan 14 06:24:26 2019] btrfs-transacti D    0  6808      2 0x80000000
 > > [Mon Jan 14 06:24:26 2019] Call Trace:
 > > [Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
 > > [Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
 > > [Mon Jan 14 06:24:26 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
 > > [Mon Jan 14 06:24:26 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  commit_cowonly_roots+0x1ed/0x280 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  transaction_kthread+0x14d/0x180 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
 > > [Mon Jan 14 06:24:26 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
 > > [Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
 > > [Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
 > 
 > So the transaction is being committed as a result of that
 > btrfs_start_ordered_extent, which flushes data to disk. Since you've
 > compiled your kernel can you run the following command from the kernel's
 > source:
 > 
 > ./scripts/faddr2line  vmlinux  btrfs_start_ordered_extent+0xca/0x100
 > 
 > 'vmlinux' should be the kernel executable with debug info that results
 > from compiling the kernel. I want to figure out which line exactly
 > btrfs_start_ordered_extent+0xca/0x100 resolves to.

 <snip>

I'll have to rebuild the kernel with debug symbols.  Do I have to be
booted into the kernel for that command to be useful?

Cheers and Thanks,

sb. Scott Blomquist



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-14 13:13               ` Scott E. Blomquist
@ 2019-01-14 13:28                 ` Nikolay Borisov
  2019-01-15  8:16                 ` Nikolay Borisov
  1 sibling, 0 replies; 12+ messages in thread
From: Nikolay Borisov @ 2019-01-14 13:28 UTC (permalink / raw)
  To: Scott E. Blomquist; +Cc: Jojo, linux-btrfs



On 14.01.19 г. 15:13 ч., Scott E. Blomquist wrote:
> 
> Nikolay Borisov writes:
>  > 
>  > On 14.01.19 г. 13:42 ч., Scott E. Blomquist wrote:
>  > > 
>  <snip>
>  > > 
>  > > The file system hung again below is the sysrq output
>  > > 
>  > > Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
>  > > 
>  > > btrfs-progs v4.19.1 
>  > > 
>  > > # btrfs fi df /export/
>  > > Data, single: total=79.61TiB, used=79.61TiB
>  > > System, single: total=36.00MiB, used=8.31MiB
>  > > Metadata, single: total=192.01GiB, used=190.19GiB
>  > > GlobalReserve, single: total=512.00MiB, used=0.00B
>  > 
>  > So this btrfs is hosted on your local machine but it is exported via
>  > NFS, correct?
> 
> Correct and via samba also
> 
>  > > 
>  > > #  btrfs fi show
>  > > Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
>  > >         Total devices 3 FS bytes used 79.79TiB
>  > >         devid    1 size 47.30TiB used 43.58TiB path /dev/sda1
>  > >         devid    2 size 21.83TiB used 18.11TiB path /dev/sdb1
>  > >         devid    3 size 21.83TiB used 18.11TiB path /dev/sdc1
>  > 
>  > What kind of disks are those, presumably spinning rust due to their size
>  > but what model/make?
>  > 
> 
> 3 x raid 6 on a LSI MegaRAID SAS 9271-8i

Has your controller been updated to the latest firmware? In my
experience LSI Megaraid are rubbish controllers and in the past, in a
datacenter environment, we've had a batch of bad controllers which
resulted in controllers resets, causing all IO to die on 10s of machines.

There was a way to query the controller's built-in log for firmware
errors. I can't remember the exact command but googling suggests using:

MegaCli -AdpEventLog -GetEvents -f events.log -aALL && cat events.log

Can you run that and also attach it when a hang occurs?

> 
>  > > [Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State
>  > 
>  > <snip>
>  > 
>  > > [Mon Jan 14 06:24:26 2019] btrfs-transacti D    0  6808      2 0x80000000
>  > > [Mon Jan 14 06:24:26 2019] Call Trace:
>  > > [Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
>  > > [Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  commit_cowonly_roots+0x1ed/0x280 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  transaction_kthread+0x14d/0x180 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
>  > > [Mon Jan 14 06:24:26 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
>  > > [Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
>  > 
>  > So the transaction is being committed as a result of that
>  > btrfs_start_ordered_extent, which flushes data to disk. Since you've
>  > compiled your kernel can you run the following command from the kernel's
>  > source:
>  > 
>  > ./scripts/faddr2line  vmlinux  btrfs_start_ordered_extent+0xca/0x100
>  > 
>  > 'vmlinux' should be the kernel executable with debug info that results
>  > from compiling the kernel. I want to figure out which line exactly
>  > btrfs_start_ordered_extent+0xca/0x100 resolves to.
> 
>  <snip>
> 
> I'll have to rebuild the kernel with debug symbols.  Do I have to be
> booted into the kernel for that command to be useful?

Well the running kernel needs to correspond to the vmlinux since
otherwise the offsets might not match. In any case try rebuilding the
kernel and running it to see if it's going to result in a sane output.

> 
> Cheers and Thanks,
> 
> sb. Scott Blomquist
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-14 13:13               ` Scott E. Blomquist
  2019-01-14 13:28                 ` Nikolay Borisov
@ 2019-01-15  8:16                 ` Nikolay Borisov
  2019-01-15 14:36                   ` Scott E. Blomquist
  1 sibling, 1 reply; 12+ messages in thread
From: Nikolay Borisov @ 2019-01-15  8:16 UTC (permalink / raw)
  To: Scott E. Blomquist; +Cc: Jojo, linux-btrfs



On 14.01.19 г. 15:13 ч., Scott E. Blomquist wrote:
> 
> Nikolay Borisov writes:
>  > 
>  > On 14.01.19 г. 13:42 ч., Scott E. Blomquist wrote:
>  > > 
>  <snip>
>  > > 
>  > > The file system hung again below is the sysrq output
>  > > 
>  > > Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
>  > > 
>  > > btrfs-progs v4.19.1 
>  > > 
>  > > # btrfs fi df /export/
>  > > Data, single: total=79.61TiB, used=79.61TiB
>  > > System, single: total=36.00MiB, used=8.31MiB
>  > > Metadata, single: total=192.01GiB, used=190.19GiB
>  > > GlobalReserve, single: total=512.00MiB, used=0.00B
>  > 
>  > So this btrfs is hosted on your local machine but it is exported via
>  > NFS, correct?
> 
> Correct and via samba also
> 
>  > > 
>  > > #  btrfs fi show
>  > > Label: '/export'  uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
>  > >         Total devices 3 FS bytes used 79.79TiB
>  > >         devid    1 size 47.30TiB used 43.58TiB path /dev/sda1
>  > >         devid    2 size 21.83TiB used 18.11TiB path /dev/sdb1
>  > >         devid    3 size 21.83TiB used 18.11TiB path /dev/sdc1
>  > 
>  > What kind of disks are those, presumably spinning rust due to their size
>  > but what model/make?
>  > 
> 
> 3 x raid 6 on a LSI MegaRAID SAS 9271-8i
> 
>  > > [Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State
>  > 
>  > <snip>
>  > 
>  > > [Mon Jan 14 06:24:26 2019] btrfs-transacti D    0  6808      2 0x80000000
>  > > [Mon Jan 14 06:24:26 2019] Call Trace:
>  > > [Mon Jan 14 06:24:26 2019]  ? __schedule+0x2ea/0x870
>  > > [Mon Jan 14 06:24:26 2019]  schedule+0x32/0x80
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_start_ordered_extent+0xca/0x100 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? wait_woken+0x80/0x80
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  commit_cowonly_roots+0x1ed/0x280 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? start_transaction+0x9b/0x3f0 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  transaction_kthread+0x14d/0x180 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  kthread+0xf8/0x130
>  > > [Mon Jan 14 06:24:26 2019]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
>  > > [Mon Jan 14 06:24:26 2019]  ? kthread_bind+0x10/0x10
>  > > [Mon Jan 14 06:24:26 2019]  ret_from_fork+0x35/0x40
>  > 
>  > So the transaction is being committed as a result of that
>  > btrfs_start_ordered_extent, which flushes data to disk. Since you've
>  > compiled your kernel can you run the following command from the kernel's
>  > source:
>  > 
>  > ./scripts/faddr2line  vmlinux  btrfs_start_ordered_extent+0xca/0x100
>  > 
>  > 'vmlinux' should be the kernel executable with debug info that results
>  > from compiling the kernel. I want to figure out which line exactly
>  > btrfs_start_ordered_extent+0xca/0x100 resolves to.
> 
>  <snip>
> 
> I'll have to rebuild the kernel with debug symbols.  Do I have to be
> booted into the kernel for that command to be useful?

Actually I think you are hitting the issue fixed by the following patch: 
https://github.com/kdave/btrfs-devel/commit/db0d10b02620b83ee592f6fc023ae146d72c5f65


The patch went into 4.18, yet your initial report said the hang occurs on 4.17. Could you try running 4.19 with e73e81b6d011 ("btrfs: balance dirty metadata pages in btrfs_finish_ordered_io")
reverted. 



> 
> Cheers and Thanks,
> 
> sb. Scott Blomquist
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: btrfs hang on nfs?
  2019-01-15  8:16                 ` Nikolay Borisov
@ 2019-01-15 14:36                   ` Scott E. Blomquist
  0 siblings, 0 replies; 12+ messages in thread
From: Scott E. Blomquist @ 2019-01-15 14:36 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Scott E. Blomquist, Jojo, linux-btrfs


Nikolay Borisov writes:
 > 
 <snip>
 > > 
 > > I'll have to rebuild the kernel with debug symbols.  Do I have to be
 > > booted into the kernel for that command to be useful?
 > 
 > Actually I think you are hitting the issue fixed by the following patch: 
 > https://github.com/kdave/btrfs-devel/commit/db0d10b02620b83ee592f6fc023ae146d72c5f65
 > 
 > 
 > The patch went into 4.18, yet your initial report said the hang occurs on 4.17. Could you try running 4.19 with e73e81b6d011 ("btrfs: balance dirty metadata pages in btrfs_finish_ordered_io")
 > reverted. 
 > 
 <snip>

As an update....

Yesterday after a hang up, I upgraded the firmware on the raid card.
There were items in the changelog that seemed like they might apply.

Unfortunately, later in the day the system hung up again.  It was also
unfortunate that I didn't have the debug kernel installed at the time.

I am now rebuilding a debug kernel with the patch you mention above.

FWIW i do see from fs/inode.c

        /* once for the tree */
        btrfs_put_ordered_extent(ordered_extent);

        /* Try to release some metadata so we don't get an OOM but don't wait */
        btrfs_btree_balance_dirty_nodelay(fs_info);

around the same place in the un-patched linux-4.17.14 source tree that
I built from, so I might have been hitting this in the previous kernel
also.

Cheers and thank you for all the help.

sb.  Scott Blomquist




^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-01-15 14:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-09 10:54 btrfs hang on nfs? Scott E. Blomquist
2019-01-09 12:14 ` Jojo
2019-01-09 13:31   ` Scott E. Blomquist
2019-01-10 11:46     ` Scott E. Blomquist
2019-01-10 11:51       ` Nikolay Borisov
2019-01-10 12:00         ` Scott E. Blomquist
2019-01-14 11:42           ` Scott E. Blomquist
2019-01-14 12:11             ` Nikolay Borisov
2019-01-14 13:13               ` Scott E. Blomquist
2019-01-14 13:28                 ` Nikolay Borisov
2019-01-15  8:16                 ` Nikolay Borisov
2019-01-15 14:36                   ` Scott E. Blomquist

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.