Linux-XFS Archive on lore.kernel.org
* [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS
@ 2019-01-19 16:50 bugzilla-daemon
  2019-01-19 17:15 ` [Bug 202349] " bugzilla-daemon
                   ` (13 more replies)
  0 siblings, 14 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-19 16:50 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

            Bug ID: 202349
           Summary: Extreme desktop freezes during sustained write
                    operations with XFS
           Product: File System
           Version: 2.5
    Kernel Version: 4.19.16
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@kernel-bugs.kernel.org
          Reporter: nfxjfg@googlemail.com
        Regression: No

If you keep appending data to a file on an XFS filesystem, the entire system
will occasionally freeze for up to 4 seconds or so. Then it recovers and
continues for a few minutes, until it happens again. Although the freezes
happen at seemingly random times, the problem is generally highly reproducible.

This is extremely bothersome for desktop use, as _everything_ stops during the
freeze. Even typing text into a terminal. Even processes that don't appear to
do any disk I/O are frozen. For example, a python script that does nothing but
print a counter to stdout (running in an X11 terminal emulator) will stop.
Everything it touches is completely in memory; the system disk is on an SSD
anyway.

In general, the desktop becomes completely unusable, which in turn makes XFS
unsuitable for desktop operation until this is fixed.

Reproduction is simple enough: if I copy multiple large files (dozens of
files, roughly 500-5000 MB each) with rsync from one disk to another (both
hard disks using XFS filesystems), the freezes happen at least every few
minutes.
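
For reference, the copy command itself is nothing special; an invocation along
these lines (the source and target paths are placeholders) is enough to
trigger it:

rsync -av /mnt/source-disk/files/ /mnt/target-disk/files/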

Someone else observed something similar independently just now, and suspects
this is happening because XFS blocks the entire kernel when freeing certain
caches: https://twitter.com/marcan42/status/1086652425061056515
I don't know whether my case is the same, but it sure looks very similar.

This is with kernel 4.19.16, the deadline scheduler for both source and target
disks (as recommended by the XFS FAQ), and full kernel preemption enabled. The
system has 16GB RAM, of which 10GB are usually available for caches. I don't
know which details are relevant, so please request whatever information might
help in analyzing this.

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
@ 2019-01-19 17:15 ` " bugzilla-daemon
  2019-01-19 22:20 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-19 17:15 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #1 from nfxjfg@googlemail.com ---
PS: I've observed absolutely no problems when reformatting exactly the same
target disk with ext4 and performing the same test.

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
  2019-01-19 17:15 ` [Bug 202349] " bugzilla-daemon
@ 2019-01-19 22:20 ` bugzilla-daemon
  2019-01-20 21:59 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-19 22:20 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

Arkadiusz Miskiewicz (arekm@maven.pl) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arekm@maven.pl

--- Comment #2 from Arkadiusz Miskiewicz (arekm@maven.pl) ---
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
  2019-01-19 17:15 ` [Bug 202349] " bugzilla-daemon
  2019-01-19 22:20 ` bugzilla-daemon
@ 2019-01-20 21:59 ` bugzilla-daemon
  2019-01-21 16:16 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-20 21:59 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

Dave Chinner (david@fromorbit.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david@fromorbit.com

--- Comment #3 from Dave Chinner (david@fromorbit.com) ---
Output of 'echo w > /proc/sysrq-trigger' when the system is having freeze
problems, please, as well as `iostat -dxm 5` over a period of minutes leading
up to the freeze and after the freeze ends.
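
Something like the following wrapper can capture both at once (a sketch; it
assumes sysstat is installed, sysrq is enabled, and it is run as root from a
second shell or over ssh):

-----

#!/bin/bash
iostat -dxm 5 > iostat.log &       # sample all block devices every 5 seconds
IOSTAT_PID=$!
read -r -p "Press Enter during a freeze to dump blocked tasks... "
echo w > /proc/sysrq-trigger       # write blocked-task backtraces to dmesg
sleep 60                           # keep sampling until well after recovery
kill "$IOSTAT_PID"
dmesg > blocked-tasks.log

-----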

-Dave.

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (2 preceding siblings ...)
  2019-01-20 21:59 ` bugzilla-daemon
@ 2019-01-21 16:16 ` bugzilla-daemon
  2019-01-21 19:04 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-21 16:16 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #4 from nfxjfg@googlemail.com ---
I'll try to provide all this information in a few days.

I doubt I'll be able to trigger the sysrq with a shell command while the system
is frozen (because, well, it's frozen). If that's the case, would triggering
the sysrq via keyboard directly work better?

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-01-21 16:16 ` bugzilla-daemon
@ 2019-01-21 19:04 ` bugzilla-daemon
  2019-01-24 11:59 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-21 19:04 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

Chris Murphy (bugzilla@colorremedies.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla@colorremedies.com

--- Comment #5 from Chris Murphy (bugzilla@colorremedies.com) ---
My suggestion is to enable sysrq, type out the trigger line in a shell in
advance without pressing return, and then hit return once the hang happens. It
will almost certainly get through to the kernel even if the command doesn't
return until after the system recovers. It might be more reliable to do this
over ssh than through the desktop environment's terminal.

If you're running systemd, you can extract the full dmesg using 'journalctl -k
> kernel.log'. Without systemd, you may need to reboot with the boot param
log_buf_len=1M and issue 'dmesg -n 8' before reproducing.
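
Concretely, the preparation could look like this (run as root; sysrq value 1
enables all sysrq functions):

sysctl -w kernel.sysrq=1
# pre-type the trigger over ssh, press Enter only once the hang starts:
echo w > /proc/sysrq-trigger
# afterwards, with systemd:
journalctl -k > kernel.log
# without systemd: boot with log_buf_len=1M, run 'dmesg -n 8' before
# reproducing, then save the buffer with 'dmesg > kernel.log'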

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (4 preceding siblings ...)
  2019-01-21 19:04 ` bugzilla-daemon
@ 2019-01-24 11:59 ` bugzilla-daemon
  2019-01-24 23:31   ` Dave Chinner
  2019-01-24 23:31 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-24 11:59 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #6 from nfxjfg@googlemail.com ---
In all the information below, the test disk was /dev/sdd, and was mounted on
/mnt/tmp1/.

Script which triggers the problem somewhat reliably:

-----

#!/bin/bash

# Endlessly write 1 MiB files (bs=1024 * count=1024) to the test filesystem.
TESTDIR=/mnt/tmp1/tests/
mkdir -p "$TESTDIR"

COUNTER=0
while true ; do
    dd if=/dev/zero of="$TESTDIR/f$COUNTER" bs=1024 count=$((1*1024))
    COUNTER=$((COUNTER+1))
done

-----

Whether it "freezes" the system after a few seconds (under 1 minute) of running
seems to depend on the size of the files written. The system freeze is of
course not permanent; it's just a performance problem.

When the system freezes, even the mouse pointer can stop moving. As you can see
in the dmesg excerpt below, Xorg got blocked. The script was run from a
terminal emulator running on the X session. I'm fairly sure nothing other than
the test script accessed the test disk/filesystem. Sometimes the script blocks
for a while without freezing the system.

dmesg excerpt from when I triggered the blocked-task sysrq via ssh; this
includes the messages from when the disk was hotplugged and mounted:

-----

[585767.299464] ata3: softreset failed (device not ready)
[585772.375433] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[585772.378964] ata3.00: ATA-9: HGST HUH728060ALE600, A4GNT7J0, max UDMA/133
[585772.378968] ata3.00: 11721045168 sectors, multi 0: LBA48 NCQ (depth 32), AA
[585772.386994] ata3.00: configured for UDMA/133
[585772.387134] scsi 2:0:0:0: Direct-Access     ATA      HGST HUH728060AL T7J0 PQ: 0 ANSI: 5
[585772.387453] sd 2:0:0:0: Attached scsi generic sg3 type 0
[585772.387599] sd 2:0:0:0: [sdd] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
[585772.387604] sd 2:0:0:0: [sdd] 4096-byte physical blocks
[585772.387636] sd 2:0:0:0: [sdd] Write Protect is off
[585772.387640] sd 2:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[585772.387680] sd 2:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[585772.432755]  sdd: sdd1
[585772.433234] sd 2:0:0:0: [sdd] Attached SCSI disk
[585790.697722] XFS (sdd1): Mounting V5 Filesystem
[585790.842858] XFS (sdd1): Ending clean mount
[587294.464409] sysrq: SysRq : Show Blocked State
[587294.464419]   task                        PC stack   pid father
[588653.596479] sysrq: SysRq : Show Blocked State
[588653.596488]   task                        PC stack   pid father
[588653.596534] kswapd0         D    0    93      2 0x80000000
[588653.596538] Call Trace:
[588653.596549]  ? __schedule+0x240/0x860
[588653.596553]  ? schedule_timeout+0x274/0x390
[588653.596557]  schedule+0x78/0x110
[588653.596560]  schedule_timeout+0x274/0x390
[588653.596565]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[588653.596568]  wait_for_completion+0xe2/0x140
[588653.596572]  ? wake_up_q+0x70/0x70
[588653.596576]  __flush_work+0xfb/0x1a0
[588653.596580]  ? flush_workqueue_prep_pwqs+0x130/0x130
[588653.596585]  xlog_cil_force_lsn+0x67/0x1d0
[588653.596591]  ? __xfs_iunpin_wait+0x96/0x150
[588653.596594]  xfs_log_force_lsn+0x73/0x130
[588653.596597]  ? xfs_reclaim_inode+0xb5/0x300
[588653.596600]  __xfs_iunpin_wait+0x96/0x150
[588653.596605]  ? init_wait_var_entry+0x40/0x40
[588653.596607]  xfs_reclaim_inode+0xb5/0x300
[588653.596611]  xfs_reclaim_inodes_ag+0x1a6/0x2f0
[588653.596616]  ? preempt_count_sub+0x43/0x50
[588653.596619]  ? _raw_spin_unlock+0x12/0x30
[588653.596621]  ? xfs_inode_set_reclaim_tag+0x9f/0x170
[588653.596624]  ? preempt_count_sub+0x43/0x50
[588653.596627]  xfs_reclaim_inodes_nr+0x31/0x40
[588653.596631]  super_cache_scan+0x14c/0x1a0
[588653.596635]  do_shrink_slab+0x125/0x2a0
[588653.596639]  shrink_slab+0x144/0x260
[588653.596643]  shrink_node+0xd6/0x420
[588653.596647]  kswapd+0x3ce/0x6e0
[588653.596651]  ? mem_cgroup_shrink_node+0x170/0x170
[588653.596654]  kthread+0x110/0x130
[588653.596658]  ? kthread_create_worker_on_cpu+0x40/0x40
[588653.596660]  ret_from_fork+0x24/0x30
[588653.596730] Xorg            D    0   894    880 0x00400004
[588653.596733] Call Trace:
[588653.596737]  ? __schedule+0x240/0x860
[588653.596741]  ? schedule_preempt_disabled+0x23/0xa0
[588653.596744]  schedule+0x78/0x110
[588653.596747]  ? __mutex_lock.isra.5+0x292/0x4b0
[588653.596750]  schedule_preempt_disabled+0x23/0xa0
[588653.596752]  __mutex_lock.isra.5+0x292/0x4b0
[588653.596757]  ? xfs_perag_get_tag+0x52/0xf0
[588653.596760]  xfs_reclaim_inodes_ag+0x287/0x2f0
[588653.596766]  ? radix_tree_gang_lookup_tag+0xc2/0x140
[588653.596770]  ? iput+0x210/0x210
[588653.596772]  ? preempt_count_sub+0x43/0x50
[588653.596775]  xfs_reclaim_inodes_nr+0x31/0x40
[588653.596778]  super_cache_scan+0x14c/0x1a0
[588653.596781]  do_shrink_slab+0x125/0x2a0
[588653.596784]  shrink_slab+0x204/0x260
[588653.596787]  ? __schedule+0x248/0x860
[588653.596791]  shrink_node+0xd6/0x420
[588653.596794]  do_try_to_free_pages+0xb6/0x350
[588653.596798]  try_to_free_pages+0xce/0x1b0
[588653.596802]  __alloc_pages_slowpath+0x33d/0xc80
[588653.596808]  __alloc_pages_nodemask+0x23f/0x260
[588653.596820]  ttm_pool_populate+0x25e/0x480 [ttm]
[588653.596825]  ? kmalloc_large_node+0x37/0x60
[588653.596828]  ? __kmalloc_node+0x20e/0x2b0
[588653.596836]  ttm_populate_and_map_pages+0x24/0x250 [ttm]
[588653.596845]  ttm_tt_populate.part.9+0x1b/0x60 [ttm]
[588653.596853]  ttm_tt_bind+0x42/0x60 [ttm]
[588653.596861]  ttm_bo_handle_move_mem+0x258/0x4e0 [ttm]
[588653.596939]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
[588653.596947]  ttm_bo_validate+0xe7/0x110 [ttm]
[588653.596951]  ? preempt_count_sub+0x43/0x50
[588653.596954]  ? _raw_write_unlock+0x12/0x30
[588653.596974]  ? drm_pci_agp_destroy+0x4d/0x50 [drm]
[588653.596983]  ttm_bo_init_reserved+0x347/0x390 [ttm]
[588653.597059]  amdgpu_bo_do_create+0x19c/0x420 [amdgpu]
[588653.597136]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
[588653.597213]  amdgpu_bo_create+0x30/0x200 [amdgpu]
[588653.597291]  amdgpu_gem_object_create+0x8b/0x110 [amdgpu]
[588653.597404]  amdgpu_gem_create_ioctl+0x1d0/0x290 [amdgpu]
[588653.597417]  ? preempt_count_sub+0x43/0x50
[588653.597421]  ? _raw_spin_unlock+0x12/0x30
[588653.597499]  ? amdgpu_gem_object_close+0x1c0/0x1c0 [amdgpu]
[588653.597521]  drm_ioctl_kernel+0x7f/0xd0 [drm]
[588653.597545]  drm_ioctl+0x1e4/0x380 [drm]
[588653.597625]  ? amdgpu_gem_object_close+0x1c0/0x1c0 [amdgpu]
[588653.597631]  ? tlb_finish_mmu+0x1f/0x30
[588653.597637]  ? preempt_count_sub+0x43/0x50
[588653.597712]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[588653.597720]  do_vfs_ioctl+0x8d/0x5d0
[588653.597726]  ? do_munmap+0x33c/0x430
[588653.597731]  ? __fget+0x6e/0xa0
[588653.597736]  ksys_ioctl+0x60/0x90
[588653.597742]  __x64_sys_ioctl+0x16/0x20
[588653.597748]  do_syscall_64+0x4a/0xd0
[588653.597754]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[588653.597759] RIP: 0033:0x7f929cf6d747
[588653.597768] Code: Bad RIP value.
[588653.597771] RSP: 002b:00007ffca6b5df18 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[588653.597780] RAX: ffffffffffffffda RBX: 00007ffca6b5e000 RCX: 00007f929cf6d747
[588653.597783] RDX: 00007ffca6b5df70 RSI: 00000000c0206440 RDI: 000000000000000e
[588653.597787] RBP: 00007ffca6b5df70 R08: 000055f3e86a3290 R09: 000000000000000b
[588653.597791] R10: 000055f3e653c010 R11: 0000000000003246 R12: 00000000c0206440
[588653.597795] R13: 000000000000000e R14: 0000000000000000 R15: 000055f3e86a3290
[588653.598255] kworker/u32:0   D    0  6490      2 0x80000000
[588653.598263] Workqueue: writeback wb_workfn (flush-8:48)
[588653.598265] Call Trace:
[588653.598270]  ? __schedule+0x240/0x860
[588653.598274]  ? blk_flush_plug_list+0x1d9/0x220
[588653.598277]  ? io_schedule+0x12/0x40
[588653.598280]  schedule+0x78/0x110
[588653.598283]  io_schedule+0x12/0x40
[588653.598286]  get_request+0x26b/0x770
[588653.598291]  ? finish_wait+0x80/0x80
[588653.598294]  blk_queue_bio+0x15f/0x4e0
[588653.598297]  generic_make_request+0x1c0/0x460
[588653.598301]  submit_bio+0x45/0x140
[588653.598304]  xfs_submit_ioend+0x9c/0x1e0
[588653.598308]  xfs_vm_writepages+0x68/0x80
[588653.598312]  do_writepages+0x2e/0xb0
[588653.598315]  ? _raw_spin_unlock+0x12/0x30
[588653.598317]  ? list_lru_add+0xf5/0x1a0
[588653.598320]  __writeback_single_inode+0x3d/0x3d0
[588653.598323]  writeback_sb_inodes+0x1c4/0x430
[588653.598328]  __writeback_inodes_wb+0x5d/0xb0
[588653.598331]  wb_writeback+0x265/0x310
[588653.598335]  wb_workfn+0x314/0x410
[588653.598340]  process_one_work+0x199/0x3b0
[588653.598343]  worker_thread+0x30/0x380
[588653.598345]  ? rescuer_thread+0x310/0x310
[588653.598348]  kthread+0x110/0x130
[588653.598351]  ? kthread_create_worker_on_cpu+0x40/0x40
[588653.598354]  ret_from_fork+0x24/0x30
[588653.598380] xfsaild/sdd1    D    0 19755      2 0x80000000
[588653.598383] Call Trace:
[588653.598387]  ? __schedule+0x240/0x860
[588653.598389]  ? enqueue_entity+0xf6/0x6c0
[588653.598392]  ? schedule_timeout+0x274/0x390
[588653.598395]  schedule+0x78/0x110
[588653.598398]  schedule_timeout+0x274/0x390
[588653.598403]  ? tracing_record_taskinfo_skip+0x40/0x50
[588653.598405]  wait_for_completion+0xe2/0x140
[588653.598408]  ? wake_up_q+0x70/0x70
[588653.598411]  __flush_work+0xfb/0x1a0
[588653.598415]  ? flush_workqueue_prep_pwqs+0x130/0x130
[588653.598419]  xlog_cil_force_lsn+0x67/0x1d0
[588653.598422]  ? _raw_spin_unlock_irqrestore+0x22/0x40
[588653.598426]  ? try_to_del_timer_sync+0x3d/0x50
[588653.598429]  xfs_log_force+0x83/0x2d0
[588653.598432]  ? preempt_count_sub+0x43/0x50
[588653.598436]  xfsaild+0x19b/0x7f0
[588653.598440]  ? _raw_spin_unlock_irqrestore+0x22/0x40
[588653.598443]  ? xfs_trans_ail_cursor_first+0x80/0x80
[588653.598446]  kthread+0x110/0x130
[588653.598449]  ? kthread_create_worker_on_cpu+0x40/0x40
[588653.598452]  ret_from_fork+0x24/0x30
[588653.598462] kworker/6:2     D    0 19911      2 0x80000000
[588653.598468] Workqueue: xfs-cil/sdd1 xlog_cil_push_work
[588653.598469] Call Trace:
[588653.598473]  ? __schedule+0x240/0x860
[588653.598476]  ? _raw_spin_lock_irqsave+0x1c/0x40
[588653.598479]  ? xlog_state_get_iclog_space+0xfc/0x2c0
[588653.598481]  ? wake_up_q+0x70/0x70
[588653.598484]  schedule+0x78/0x110
[588653.598487]  xlog_state_get_iclog_space+0xfc/0x2c0
[588653.598490]  ? wake_up_q+0x70/0x70
[588653.598494]  xlog_write+0x153/0x680
[588653.598498]  xlog_cil_push+0x259/0x3e0
[588653.598503]  process_one_work+0x199/0x3b0
[588653.598506]  worker_thread+0x1c6/0x380
[588653.598509]  ? rescuer_thread+0x310/0x310
[588653.598512]  kthread+0x110/0x130
[588653.598515]  ? kthread_create_worker_on_cpu+0x40/0x40
[588653.598518]  ret_from_fork+0x24/0x30
[588653.598542] kworker/8:0     D    0 21308      2 0x80000000
[588653.598547] Workqueue: xfs-sync/sdd1 xfs_log_worker
[588653.598549] Call Trace:
[588653.598553]  ? __schedule+0x240/0x860
[588653.598556]  ? schedule_timeout+0x274/0x390
[588653.598559]  schedule+0x78/0x110
[588653.598561]  schedule_timeout+0x274/0x390
[588653.598565]  wait_for_completion+0xe2/0x140
[588653.598567]  ? wake_up_q+0x70/0x70
[588653.598570]  __flush_work+0xfb/0x1a0
[588653.598574]  ? flush_workqueue_prep_pwqs+0x130/0x130
[588653.598577]  xlog_cil_force_lsn+0x67/0x1d0
[588653.598581]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[588653.598584]  xfs_log_force+0x83/0x2d0
[588653.598587]  ? preempt_count_sub+0x43/0x50
[588653.598590]  xfs_log_worker+0x2f/0xf0
[588653.598593]  process_one_work+0x199/0x3b0
[588653.598595]  worker_thread+0x30/0x380
[588653.598598]  ? rescuer_thread+0x310/0x310
[588653.598601]  kthread+0x110/0x130
[588653.598604]  ? kthread_create_worker_on_cpu+0x40/0x40
[588653.598607]  ret_from_fork+0x24/0x30

-----

"iostat -x -d -m 5 " while running the test:

-----

Linux 4.19.16 (debian)  24.01.2019      _x86_64_        (12 CPU)

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb             49.94    0.13      8.22      0.09     0.01     0.01   0.02   6.16    1.09  195.85   0.08   168.59   689.71   0.73   3.63
sda              2.09    0.35      0.03      0.02     0.01     0.02   0.24   6.36    0.25    6.46   0.00    15.34    51.02   0.23   0.06
sdc              4.14    1.01      0.37      0.09     0.02     3.05   0.58  75.03    0.33   13.48   0.01    90.82    90.05   0.25   0.13
dm-0             0.97    3.56      0.02      0.02     0.00     0.00   0.00   0.00    0.29    7.14   0.03    19.75     4.56   0.08   0.03
loop0            0.01    0.00      0.00      0.00     0.00     0.00   0.00   0.00    9.87    1.08   0.00    11.52     8.80   1.50   0.00
loop1            0.06    0.00      0.01      0.00     0.00     0.00   0.00   0.00    2.66    3.65   0.00    85.79    88.40   0.88   0.01
sdd              0.04    0.48      0.00      0.34     0.00     0.00   2.38   0.48   13.82  625.41   0.30    10.27   712.62   4.63   0.24

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sda              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdc              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
dm-0             0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop1            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdd              2.80  114.60      0.01     57.53     0.00     0.00   0.00   0.00   31.57  315.09  44.88     4.00   514.07   2.82  33.12

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sda              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdc              0.00    0.20      0.00      0.00     0.00     0.00   0.00   0.00    0.00    2.00   0.00     0.00     4.00   0.00   0.00
dm-0             0.00    0.20      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.00   0.00   0.00
loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop1            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdd              0.00  309.20      0.00    158.60     0.00     0.00   0.00   0.00    0.00  379.40 143.99     0.00   525.25   3.24 100.08

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sda              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdc              0.00    0.60      0.00      0.00     0.00     0.20   0.00  25.00    0.00    3.00   0.00     0.00     4.00   2.67   0.16
dm-0             0.00    0.80      0.00      0.00     0.00     0.00   0.00   0.00    0.00    2.00   0.00     0.00     4.00   2.00   0.16
loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop1            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdd              0.80  162.20      0.00    159.67     0.00     0.00   0.00   0.00 1161.50  933.72 130.70     4.00  1008.02   6.13 100.00

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sda              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdc              0.00    0.40      0.00      0.00     0.00     0.00   0.00   0.00    0.00    2.00   0.00     0.00     4.00   0.00   0.00
dm-0             0.00    0.40      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.00   0.00   0.00
loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop1            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdd              0.00  154.00      0.00    153.80     0.00     0.00   0.00   0.00    0.00  885.47 143.06     0.00  1022.67   6.49 100.00

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sda              0.40    0.00      0.00      0.00     0.00     0.00   0.00   0.00    1.50    0.00   0.00     4.00     0.00   0.00   0.00
sdc              0.00    1.00      0.00      0.00     0.00     0.20   0.00  16.67    0.00    1.60   0.00     0.00     4.00   0.80   0.08
dm-0             0.00    1.20      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.67   0.00     0.00     4.00   0.67   0.08
loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop1            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdd             10.60  174.40      0.17    138.13     0.00     7.40   0.00   4.07    6.43  861.84 130.00    16.00   811.06   5.39  99.76

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sdb              0.00    0.40      0.00      0.00     0.00     0.00   0.00   0.00    0.00    6.00   0.00     0.00     4.00   6.00   0.24
sda              5.80    0.60      0.05      0.00     0.00     0.00   0.00   0.00    0.55    2.33   0.00     8.55     7.00   0.62   0.40
sdc              0.00    0.40      0.00      0.00     0.00     0.00   0.00   0.00    0.00    1.50   0.00     0.00     2.00   0.00   0.00
dm-0             0.00    0.40      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.00   0.00   0.00
loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
loop1            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdd             15.60   29.20      0.23      7.46     0.00    36.20   0.00  55.35    8.55   47.97   1.53    15.08   261.59   3.91  17.52

-----

"vmstat 5" while running the test:

-----

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 4379320  64296 6133624    0    0   738   760    4    0  1  0 97  1  0
 0  1      0 3123424  64296 7369344    0    0    11 58163 4661 8568  1  5 92  2  0
 1  1      0 2301400  64296 8193912    0    0     1 176743 5350 10644  2  4 86  8  0
 1  1      0 934552  64304 9550800    0    0     2 170806 5922 12653  2  5 85  8  0
 0  3      0 132944  64304 10366496    0    0     0 163897 5791 9442  1  4 82 13  0
 2  1      0 135192  64312 10366564    0    0     2 114094 2093 2144  0  0 91  8  0
 0  0      0 185980  64316 10327628    0    0   454  7708 2964 8098  1  1 97  1  0

-----

Information requested by the FAQ:

- kernel version (uname -a) 

Linux debian 4.19.16 #1.0.my.2 SMP PREEMPT Thu Jan 17 15:18:14 CET 2019 x86_64 GNU/Linux

(That's just a vanilla kernel. It does _not_ use Debian's default kernel
config.)

- xfsprogs version (xfs_repair -V) 

xfs_repair version 4.15.1

- number of CPUs 

6 cores, total of 12 CPUs (AMD Ryzen 5 2600).

- contents of /proc/meminfo 

MemTotal:       16419320 kB
MemFree:          442860 kB
MemAvailable:    8501360 kB
Buffers:           66816 kB
Cached:          9966896 kB
SwapCached:            0 kB
Active:         10300036 kB
Inactive:        4981936 kB
Active(anon):    6199144 kB
Inactive(anon):   930620 kB
Active(file):    4100892 kB
Inactive(file):  4051316 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               112 kB
Writeback:             0 kB
AnonPages:       5248396 kB
Mapped:          1110368 kB
Shmem:           1881512 kB
Slab:             426008 kB
SReclaimable:     242452 kB
SUnreclaim:       183556 kB
KernelStack:       16064 kB
PageTables:        44408 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8209660 kB
Committed_AS:   15149512 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
Percpu:             9664 kB
HardwareCorrupted:     0 kB
AnonHugePages:   2619392 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:     6745216 kB
DirectMap2M:     9963520 kB
DirectMap1G:     1048576 kB

- contents of /proc/mounts 

Target disk only:

/dev/sdd1 /mnt/tmp1 xfs rw,relatime,attr2,inode64,noquota 0 0

- contents of /proc/partitions 

Target disk only:

   8       48 5860522584 sdd
   8       49 5860521543 sdd1

- RAID layout (hardware and/or software) 

None.

- LVM configuration

None.

- type of disks you are using

Spinning rust, potentially interesting parts from smartctl:

Model Family:     HGST Ultrastar He8
Device Model:     HGST HUH728060ALE600
Firmware Version: A4GNT7J0

- write cache status of drives

hdparm -I /dev/sdd | grep Write.cache

           *    Write cache

(The leading '*' means the feature is enabled, i.e. the drive's write cache is
on.)

- size of BBWC (battery-backed write cache) and mode it is running in

No idea how to get this.

- xfs_info output on the filesystem in question 

xfs_info /mnt/tmp1

meta-data=/dev/sdd1              isize=512    agcount=6, agsize=268435455 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=1465130385, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

* Re: [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-24 11:59 ` bugzilla-daemon
@ 2019-01-24 23:31   ` Dave Chinner
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Chinner @ 2019-01-24 23:31 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-xfs

On Thu, Jan 24, 2019 at 11:59:44AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=202349
> 
> --- Comment #6 from nfxjfg@googlemail.com ---
> In all the information below, the test disk was /dev/sdd, and was mounted on
> /mnt/tmp1/.

Ok, so you are generating largely page cache memory pressure and
some dirty inodes.

.....

> When the system freezes, even the mouse pointer can stop moving. As you can see
> in the dmesg excerpt below, Xorg got blocked. The script was run from a
> terminal emulator running on the X session. I'm fairly sure nothing other than
> the test script accessed the test disk/filesystem. Sometimes the script blocks
> for a while without freezing the system.

OK, this explains why I'm not seeing any measurable stalls at all
when running similar page cache pressure workloads - no GPU creating
memory pressure and triggering direct reclaim. And from that
perspective, this looks more like a bug in the ttm memory pool
allocator, not an XFS problem.

Yes, XFS is doing memory reclaim and is doing IO during reclaim, but
that's because it's the only thing that has reclaimable objects in
memory (due to your workload). While this may be undesirable, it is
necessary to work around other deficiencies in the memory reclaim
infrastructure and, as such, it is not a bug. We are working to try
to avoid this problem, but we haven't found a solution yet. It won't
prevent the "desktop freeze under memory shortage" problem from
occurring, though.

i.e. the reason your desktop freezes is that this allocation
here:

> [588653.596794]  do_try_to_free_pages+0xb6/0x350
> [588653.596798]  try_to_free_pages+0xce/0x1b0
> [588653.596802]  __alloc_pages_slowpath+0x33d/0xc80
> [588653.596808]  __alloc_pages_nodemask+0x23f/0x260
> [588653.596820]  ttm_pool_populate+0x25e/0x480 [ttm]
> [588653.596825]  ? kmalloc_large_node+0x37/0x60
> [588653.596828]  ? __kmalloc_node+0x20e/0x2b0
> [588653.596836]  ttm_populate_and_map_pages+0x24/0x250 [ttm]
> [588653.596845]  ttm_tt_populate.part.9+0x1b/0x60 [ttm]
> [588653.596853]  ttm_tt_bind+0x42/0x60 [ttm]
> [588653.596861]  ttm_bo_handle_move_mem+0x258/0x4e0 [ttm]
> [588653.596939]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
> [588653.596947]  ttm_bo_validate+0xe7/0x110 [ttm]
> [588653.596951]  ? preempt_count_sub+0x43/0x50
> [588653.596954]  ? _raw_write_unlock+0x12/0x30
> [588653.596974]  ? drm_pci_agp_destroy+0x4d/0x50 [drm]
> [588653.596983]  ttm_bo_init_reserved+0x347/0x390 [ttm]
> [588653.597059]  amdgpu_bo_do_create+0x19c/0x420 [amdgpu]
> [588653.597136]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
> [588653.597213]  amdgpu_bo_create+0x30/0x200 [amdgpu]
> [588653.597291]  amdgpu_gem_object_create+0x8b/0x110 [amdgpu]
> [588653.597404]  amdgpu_gem_create_ioctl+0x1d0/0x290 [amdgpu]
> [588653.597417]  ? preempt_count_sub+0x43/0x50
> [588653.597421]  ? _raw_spin_unlock+0x12/0x30
> [588653.597499]  ? amdgpu_gem_object_close+0x1c0/0x1c0 [amdgpu]
> [588653.597521]  drm_ioctl_kernel+0x7f/0xd0 [drm]
> [588653.597545]  drm_ioctl+0x1e4/0x380 [drm]
> [588653.597625]  ? amdgpu_gem_object_close+0x1c0/0x1c0 [amdgpu]
> [588653.597631]  ? tlb_finish_mmu+0x1f/0x30
> [588653.597637]  ? preempt_count_sub+0x43/0x50
> [588653.597712]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
> [588653.597720]  do_vfs_ioctl+0x8d/0x5d0

is a GFP_USER allocation. That is:

#define GFP_USER        (__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL)

__GFP_RECLAIM means direct reclaim is allowed, as is reclaim via
kswapd.

__GFP_FS means "reclaim from filesystem caches is allowed".

__GFP_IO means that it's allowed to do IO during reclaim.

__GFP_HARDWALL means the allocation is limited to the current
cpuset memory policy.

Basically, the ttm infrastructure has said to the allocator that it
is ok to block for as long as it takes for you to do whatever you
need to do to reclaim enough memory for the required allocation.

Given that your workload is creating only filesystem memory
pressure, that means that's where reclaim is directed. And given
that the allocation says "blocking is fine" and "reclaim from
filesystems", it's no surprise that the GPU operations are getting
stuck behind filesystem reclaim.

> Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> sdd              0.80  162.20      0.00    159.67     0.00     0.00   0.00   0.00 1161.50  933.72 130.70     4.00  1008.02   6.13 100.00

Yup, there's a second-long wait for any specific read or write IO to
complete here.

GPU operations are interactive, so they really need to have bounded
response times. Using "block until required memory is available"
operations guarantees that whenever the system gets low on memory,
desktop interactivity will go to shit.....

In the meantime, it might be worth checking if you have:

CONFIG_BLK_WBT=y
CONFIG_BLK_WBT_MQ=y

active in your kernel. The block layer writeback throttle should
minimise the impact of bulk data writeback IO on metadata writeback
and read IO latency and so help minimise long blocking times for
metadata IO.
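
Either of these shows it, assuming the config is exposed (a kernel built with
CONFIG_IKCONFIG_PROC, or a distro config file in /boot):

zcat /proc/config.gz | grep BLK_WBT
grep BLK_WBT /boot/config-$(uname -r)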

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (6 preceding siblings ...)
  2019-01-24 23:31 ` bugzilla-daemon
@ 2019-01-25  9:55 ` bugzilla-daemon
  2019-01-25 12:50 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-25  9:55 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

Lucas Stach (dev@lynxeye.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dev@lynxeye.de

--- Comment #8 from Lucas Stach (dev@lynxeye.de) ---
So with my GPU developer hat on: what is it that the GPU driver devs should do
to avoid this?
GPU tasks aren't per se realtime or interactive (think GPU compute tasks with
big working sets that can take hours or even days to complete). So from the
driver perspective we _want_ FS reclaim to happen instead of failing a task,
but in general for the interactive workload we would rather trash the working
set a bit (by purging clean caches) instead of doing a high-latency writeback
of dirty caches.

Also, to put the obvious question up again: why doesn't the reporter see any of
those catastrophic latency events with ext4? Is it just that XFS can queue up
way more IO, so that direct reclaim gets stuck behind the already scheduled IO
for a much longer time?

If I understand the problem right, the basic issue is that we still don't have
IO-less dirty throttling in Linux, so we make the interactive GPU process pay
part of the bill of the background process dirtying lots of pages, instead of
throttling that process earlier.
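
For reference, the global dirty throttling this alludes to is tunable at
runtime; the values below are purely illustrative, not a recommendation:

# inspect the current thresholds
sysctl vm.dirty_background_ratio vm.dirty_ratio
# start background writeback earlier and cap dirty memory sooner
sysctl -w vm.dirty_background_ratio=5 vm.dirty_ratio=10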

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (7 preceding siblings ...)
  2019-01-25  9:55 ` bugzilla-daemon
@ 2019-01-25 12:50 ` bugzilla-daemon
  2019-01-29 22:03 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-25 12:50 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #9 from nfxjfg@googlemail.com ---
On the kernel which I ran the tests:

# zcat /proc/config.gz | grep WBT
CONFIG_BLK_WBT=y
# CONFIG_BLK_WBT_SQ is not set
CONFIG_BLK_WBT_MQ=y


I can't reproduce any of this when running the same test script on an ext4
formatted disk under the same circumstances (different disk, but comparable
speed).

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (8 preceding siblings ...)
  2019-01-25 12:50 ` bugzilla-daemon
@ 2019-01-29 22:03 ` bugzilla-daemon
  2019-01-30 17:58 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-29 22:03 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #10 from Dave Chinner (david@fromorbit.com) ---
Hi nfxjfg,

Can you try a plain 4.19.0 kernel and see if it has the same problem? I suspect
not; read https://bugzilla.kernel.org/show_bug.cgi?id=202441 for more details.
I'd like to confirm that this is the same issue.

-Dave.

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (9 preceding siblings ...)
  2019-01-29 22:03 ` bugzilla-daemon
@ 2019-01-30 17:58 ` bugzilla-daemon
  2019-01-31 14:56 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-30 17:58 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #11 from nfxjfg@googlemail.com ---
I also have to note that I found it impossible to reproduce the issue using the
test script on some other hard disks. It must depend a lot on the work load and
the speed/size of the disk, or other unknown circumstances. This also means
that my comment about ext4 might be misleading, although I definitely confirmed
that my problem went away when I reformatted yet another disk from xfs to ext4.

Dave Chinner: would a 4.18.20 kernel also do? (I still have that on my disk.)

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (10 preceding siblings ...)
  2019-01-30 17:58 ` bugzilla-daemon
@ 2019-01-31 14:56 ` bugzilla-daemon
  2019-09-29 11:52 ` bugzilla-daemon
  2019-11-22 10:46 ` bugzilla-daemon
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-01-31 14:56 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

--- Comment #12 from nfxjfg@googlemail.com ---
So I tried the following kernel versions: 4.19.19, 4.19.16, 4.19.0, 4.18.20,
and 4.16.14. It happened on all of them.

Reproduction is a bit spotty. The script I first posted doesn't work reliably
anymore. I guess it depends on the kind and amount of memory pressure.

Despite being hard to reproduce, it's not an obscure issue. I've also hit it
when compiling the kernel on an XFS filesystem on a hard disk.

My reproduction steps are now as follows (and yes, they're absurd; a
consolidated sketch follows the list):

- run memtester 12G (make sure "free" memory as shown in top goes very low
while the test is running)
- start video playback (I used mpv with some random 720p video)
- run test.sh (maybe until 100k files)
- run sync
- run rm -rf /mnt/tmp1/tests/
- switch to another virtual desktop with lots of firefox windows (yeah...), and
switch back
- video playback gets noticeably interrupted for a moment
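
A consolidated sketch of those steps (memtester and mpv must be installed; the
video path is a placeholder, and test.sh is the writer script from comment #6):

-----

#!/bin/bash
memtester 12G 1 &               # soak ~12G so free memory drops very low
mpv /path/to/some-720p-video &  # watch for playback interruptions
./test.sh                       # stop it manually at around 100k files
sync
rm -rf /mnt/tmp1/tests/

-----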

This happened even on 4.16.14.

dmesg excerpt from when I "caught" it again on 4.19.19 (there's nothing new in
it, I guess):

[  250.656494] sysrq: SysRq : Show Blocked State
[  250.656505]   task                        PC stack   pid father
[  250.656581] kswapd0         D    0    91      2 0x80000000
[  250.656585] Call Trace:
[  250.656600]  ? __schedule+0x23d/0x830
[  250.656604]  schedule+0x28/0x80
[  250.656608]  schedule_timeout+0x23e/0x360
[  250.656612]  wait_for_completion+0xeb/0x150
[  250.656617]  ? wake_up_q+0x70/0x70
[  250.656623]  ? __xfs_buf_submit+0x112/0x230
[  250.656625]  ? xfs_bwrite+0x25/0x60
[  250.656628]  xfs_buf_iowait+0x22/0xc0
[  250.656631]  __xfs_buf_submit+0x112/0x230
[  250.656633]  xfs_bwrite+0x25/0x60
[  250.656637]  xfs_reclaim_inode+0x2e5/0x310
[  250.656640]  xfs_reclaim_inodes_ag+0x19e/0x2d0
[  250.656645]  xfs_reclaim_inodes_nr+0x31/0x40
[  250.656650]  super_cache_scan+0x14c/0x1a0
[  250.656656]  do_shrink_slab+0x129/0x270
[  250.656660]  shrink_slab+0x201/0x280
[  250.656663]  shrink_node+0xd6/0x420
[  250.656666]  kswapd+0x3d3/0x6c0
[  250.656670]  ? mem_cgroup_shrink_node+0x140/0x140
[  250.656674]  kthread+0x110/0x130
[  250.656677]  ? kthread_create_worker_on_cpu+0x40/0x40
[  250.656680]  ret_from_fork+0x24/0x30
[  250.656785] Xorg            D    0   850    836 0x00400004
[  250.656789] Call Trace:
[  250.656792]  ? __schedule+0x23d/0x830
[  250.656795]  schedule+0x28/0x80
[  250.656798]  schedule_preempt_disabled+0xa/0x10
[  250.656801]  __mutex_lock.isra.5+0x28b/0x460
[  250.656806]  ? xfs_perag_get_tag+0x2d/0xc0
[  250.656808]  xfs_reclaim_inodes_ag+0x286/0x2d0
[  250.656811]  ? isolate_lru_pages.isra.55+0x34f/0x400
[  250.656817]  ? list_lru_add+0xb2/0x190
[  250.656819]  ? list_lru_isolate_move+0x40/0x60
[  250.656824]  ? iput+0x1f0/0x1f0
[  250.656827]  xfs_reclaim_inodes_nr+0x31/0x40
[  250.656829]  super_cache_scan+0x14c/0x1a0
[  250.656832]  do_shrink_slab+0x129/0x270
[  250.656836]  shrink_slab+0x144/0x280
[  250.656838]  shrink_node+0xd6/0x420
[  250.656841]  do_try_to_free_pages+0xb6/0x350
[  250.656844]  try_to_free_pages+0xce/0x180
[  250.656856]  __alloc_pages_slowpath+0x347/0xc70
[  250.656863]  __alloc_pages_nodemask+0x25c/0x280
[  250.656875]  ttm_pool_populate+0x25e/0x480 [ttm]
[  250.656880]  ? kmalloc_large_node+0x37/0x60
[  250.656883]  ? __kmalloc_node+0x204/0x2a0
[  250.656891]  ttm_populate_and_map_pages+0x24/0x250 [ttm]
[  250.656899]  ttm_tt_populate.part.9+0x1b/0x60 [ttm]
[  250.656907]  ttm_tt_bind+0x42/0x60 [ttm]
[  250.656915]  ttm_bo_handle_move_mem+0x258/0x4e0 [ttm]
[  250.656995]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
[  250.657003]  ttm_bo_validate+0xe7/0x110 [ttm]
[  250.657079]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
[  250.657105]  ? drm_vma_offset_add+0x46/0x50 [drm]
[  250.657113]  ttm_bo_init_reserved+0x342/0x380 [ttm]
[  250.657189]  amdgpu_bo_do_create+0x19c/0x400 [amdgpu]
[  250.657266]  ? amdgpu_bo_subtract_pin_size+0x50/0x50 [amdgpu]
[  250.657269]  ? try_to_wake_up+0x44/0x450
[  250.657343]  amdgpu_bo_create+0x30/0x200 [amdgpu]
[  250.657349]  ? cpumask_next_wrap+0x2c/0x70
[  250.657352]  ? sched_clock_cpu+0xc/0xb0
[  250.657355]  ? select_idle_sibling+0x293/0x3a0
[  250.657431]  amdgpu_gem_object_create+0x8b/0x110 [amdgpu]
[  250.657509]  amdgpu_gem_create_ioctl+0x1d0/0x290 [amdgpu]
[  250.657516]  ? tracing_record_taskinfo_skip+0x40/0x50
[  250.657518]  ? tracing_record_taskinfo+0xe/0xa0
[  250.657594]  ? amdgpu_gem_object_close+0x1c0/0x1c0 [amdgpu]
[  250.657614]  drm_ioctl_kernel+0x7f/0xd0 [drm]
[  250.657619]  ? sock_sendmsg+0x30/0x40
[  250.657639]  drm_ioctl+0x1e4/0x380 [drm]
[  250.657715]  ? amdgpu_gem_object_close+0x1c0/0x1c0 [amdgpu]
[  250.657720]  ? do_futex+0x2a1/0xa30
[  250.657802]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[  250.657828]  do_vfs_ioctl+0x8d/0x5d0
[  250.657832]  ? __x64_sys_futex+0x133/0x15b
[  250.657835]  ksys_ioctl+0x60/0x90
[  250.657838]  __x64_sys_ioctl+0x16/0x20
[  250.657842]  do_syscall_64+0x4a/0xd0
[  250.657845]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  250.657849] RIP: 0033:0x7f12b52dc747
[  250.657855] Code: Bad RIP value.
[  250.657856] RSP: 002b:00007ffceccab168 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[  250.657860] RAX: ffffffffffffffda RBX: 00007ffceccab250 RCX: 00007f12b52dc747
[  250.657861] RDX: 00007ffceccab1c0 RSI: 00000000c0206440 RDI: 000000000000000e
[  250.657863] RBP: 00007ffceccab1c0 R08: 0000559b8f644890 R09: 00007f12b53a7cb0
[  250.657864] R10: 0000559b8e72a010 R11: 0000000000003246 R12: 00000000c0206440
[  250.657865] R13: 000000000000000e R14: 0000559b8e7bf500 R15: 0000559b8f644890


Also, I noticed some more bad behavior. When I copied hundreds of gigabytes
from an SSD block device to an XFS filesystem on an HDD, I got _severe_
problems with tasks hanging. They got stuck in something like io_schedule (I
don't think I have the log anymore; I could probably reproduce it if needed).
This was also a case of "desktop randomly freezes on heavy background I/O".
Although the freezes were worse (waiting for up to a minute for small I/O to
finish!), it's overall not as bad as the one this bug is about, because most
hangs seemed to involve accesses to the same filesystem.

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (11 preceding siblings ...)
  2019-01-31 14:56 ` bugzilla-daemon
@ 2019-09-29 11:52 ` bugzilla-daemon
  2019-11-22 10:46 ` bugzilla-daemon
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-09-29 11:52 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

Hector Martin (hector@marcansoft.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hector@marcansoft.com

--- Comment #13 from Hector Martin (hector@marcansoft.com) ---
I'm the author of the mentioned tweet.

What I was seeing was that on a system with loads of free RAM, XFS reclaiming
inodes would randomly block on IO.

This is completely unexpected. I do expect that a system under memory and IO
pressure ("used" memory, not just "available" memory used for clean caches)
will block on IO during allocations that trigger writeback. I do *not* expect
that on a system with tons of clean data to evict, but that is what I saw with
XFS.

I had a process writing real-time data to disk (on a moderately busy system
with gigabytes of free RAM), and even though disk bandwidth was plenty to keep
up with the data, I was seeing buffer underruns due to big random latency
spikes. After I introduced a process in the pipeline doing up to several
gigabytes of RAM buffering to even out the spikes, I was *still* getting the
buffer input stuttering for several seconds, breaking the real-time capture.
That's where I realized that a kernel pipe buffer allocation was somehow
blocking on XFS doing IO.

I would echo 3 > /proc/sys/vm/drop_caches, and latencies would become normal. I
would then watch RAM used for caches slowly creep up, and when it hit 95% or
so, latency would randomly shoot through the roof again. This is obviously
broken behavior. Allocating from RAM used for caches should **never** block on
IO. The system should never slow down because extra RAM is being used for
caches. That is just insane. The whole point of using RAM for caches is to
utilize otherwise wasted RAM. If this is causing allocations to block on IO
when freeing said RAM, randomly causing huge latency spikes for everything,
then that is broken.
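
For clarity, the workaround in full (run as root; the value 3 drops both the
page cache and the dentry/inode caches):

sync
echo 3 > /proc/sys/vm/drop_caches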

I've since switched to ext4 on the same disks and haven't had a problem ever
since.

* [Bug 202349] Extreme desktop freezes during sustained write operations with XFS
  2019-01-19 16:50 [Bug 202349] New: Extreme desktop freezes during sustained write operations with XFS bugzilla-daemon
                   ` (12 preceding siblings ...)
  2019-09-29 11:52 ` bugzilla-daemon
@ 2019-11-22 10:46 ` bugzilla-daemon
  13 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2019-11-22 10:46 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202349

Ivan Kozik (ivan@ludios.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ivan@ludios.org

--- Comment #14 from Ivan Kozik (ivan@ludios.org) ---
I have also had this problem with XFS; running 4.19 with
https://github.com/bobrik/linux/pull/3 ("xfs: add a sysctl to disable memory
reclaim participation") and fs.xfs.memory_reclaim = 1 ("async inode reclaim of
clean inodes only") has improved things for me.
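
With that patch applied, the setting is a plain sysctl (note it does not exist
in mainline kernels):

sysctl -w fs.xfs.memory_reclaim=1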
