* BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
@ 2016-08-29 10:37 Eryu Guan
  2016-08-30  2:39 ` Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Eryu Guan @ 2016-08-29 10:37 UTC (permalink / raw)
  To: xfs

[-- Attachment #1: Type: text/plain, Size: 6266 bytes --]

Hi,

I've hit an XFS internal error followed by a filesystem shutdown with
the 4.8-rc3 kernel, but not with 4.8-rc2:

[ 8841.923617] XFS (sda6): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c.  Caller xfs_iomap_write_allocate+0x2d7/0x380 [xfs] 
[ 8841.938286] CPU: 3 PID: 56 Comm: kswapd0 Not tainted 4.8.0-rc3 #1 
[ 8841.945073] Hardware name: IBM IBM System x3550 M4 Server -[7914I21]-/00J6242, BIOS -[D7E120CUS-1.20]- 08/23/2012 
[ 8841.956526]  0000000000000286 00000000c8d39410 ffff88046890b7a8 ffffffff8135c53c 
[ 8841.964818]  ffff8804144e4cb0 0000000000000001 ffff88046890b7c0 ffffffffa02d99cb 
[ 8841.973116]  ffffffffa02e5537 ffff88046890b7e8 ffffffffa02f53e6 ffff8801ad37e580 
[ 8841.981402] Call Trace: 
[ 8841.984134]  [<ffffffff8135c53c>] dump_stack+0x63/0x87 
[ 8841.989900]  [<ffffffffa02d99cb>] xfs_error_report+0x3b/0x40 [xfs] 
[ 8841.996813]  [<ffffffffa02e5537>] ? xfs_iomap_write_allocate+0x2d7/0x380 [xfs] 
[ 8842.004891]  [<ffffffffa02f53e6>] xfs_trans_cancel+0xb6/0xe0 [xfs] 
[ 8842.011803]  [<ffffffffa02e5537>] xfs_iomap_write_allocate+0x2d7/0x380 [xfs] 
[ 8842.019684]  [<ffffffffa02cf949>] xfs_map_blocks+0x1a9/0x220 [xfs] 
[ 8842.026593]  [<ffffffffa02d0c5b>] xfs_do_writepage+0x16b/0x560 [xfs] 
[ 8842.033695]  [<ffffffffa02d108b>] xfs_vm_writepage+0x3b/0x70 [xfs] 
[ 8842.040584]  [<ffffffff811b00dd>] pageout.isra.41+0x18d/0x2d0 
[ 8842.046993]  [<ffffffff811b1f3a>] shrink_page_list+0x78a/0x9b0 
[ 8842.053501]  [<ffffffff811b293d>] shrink_inactive_list+0x21d/0x570 
[ 8842.060396]  [<ffffffff811b350e>] shrink_node_memcg+0x51e/0x7d0 
[ 8842.067000]  [<ffffffff810a3900>] ? workqueue_congested+0x70/0x90 
[ 8842.073799]  [<ffffffff810a4862>] ? __queue_work+0x142/0x420 
[ 8842.080112]  [<ffffffff810a4862>] ? __queue_work+0x142/0x420 
[ 8842.086425]  [<ffffffff811b38a1>] shrink_node+0xe1/0x310 
[ 8842.092351]  [<ffffffff811b48d1>] kswapd+0x301/0x6f0 
[ 8842.097889]  [<ffffffff811b45d0>] ? mem_cgroup_shrink_node+0x180/0x180 
[ 8842.105172]  [<ffffffff810aca28>] kthread+0xd8/0xf0 
[ 8842.110614]  [<ffffffff816f8dbf>] ret_from_fork+0x1f/0x40 
[ 8842.116636]  [<ffffffff810ac950>] ? kthread_park+0x60/0x60 
[ 8842.122784] XFS (sda6): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa02f53ff 
[ 8842.522306] XFS (sda6): Corruption of in-memory data detected.  Shutting down filesystem 
[ 8842.531358] XFS (sda6): Please umount the filesystem and rectify the problem(s) 
[ 8842.540470] Buffer I/O error on dev sda6, logical block 56162821, lost async page write 
[ 8842.549431] audit: netlink_unicast sending to audit_pid=1123 returned error: -111 
[ 8842.549433] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=64 
[ 8842.549434] audit: audit_pid=1123 reset 
[ 8842.552890] audit: type=1701 audit(1472234261.632:184): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:syslogd_t:s0 pid=2064 comm="in:imjournal" exe="/usr/sbin/rsyslogd" sig=7 
[ 8842.552909] XFS (sda6): xfs_do_force_shutdown(0x1) called from line 203 of file fs/xfs/libxfs/xfs_defer.c.  Return address = 0xffffffffa02b5459 
[ 8842.554230] audit: type=1701 audit(1472234261.633:185): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=1 comm="systemd" exe="/usr/lib/systemd/systemd" sig=11 
[ 8842.555324] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b 

So it's likely a regression introduced in 4.8-rc3, and my bisect test
pointed to commit 0af32fb468b4 ("xfs: fix bogus space reservation in
xfs_iomap_write_allocate").


The test I ran is "bash-shared-mapping"; it's available in autotest
(bash-shared-mapping.c from ext3-tools.tar.gz):
https://github.com/autotest/autotest-client-tests/raw/master/bash_shared_mapping/ext3-tools.tar.gz

You may have to make some modifications to get it to compile. I've
attached an updated version of bash-shared-mapping.c that you can
download and compile directly.

I've also attached a script to reproduce the problem. Please note that
the XFS partition needs about 40G of free space, and the test may take
hours to finish, depending on the amount of memory in your host.

I reproduced it on multiple hosts, e.g. a host with 64G memory & 16
CPUs and a host with 16G memory & 16 CPUs, but I haven't seen it on my
test VM, which has 8G memory & 4 vCPUs.

Detailed information about the host with 64G memory:
[root@hp-dl360g9-15 ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    1
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
Stepping:              2
CPU MHz:               2400.000
BogoMIPS:              4802.86
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-3,8-11
NUMA node1 CPU(s):     4-7,12-15
[root@hp-dl360g9-15 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:          64305       38757         224           5       25323       61944
Swap:         16379          65       16314
[root@hp-dl360g9-15 ~]# xfs_info /
meta-data=/dev/mapper/systemvg-root isize=256    agcount=16, agsize=2927744 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=46843904, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=22912, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@hp-dl360g9-15 ~]# lvs
  LV   VG       Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home systemvg -wi-ao----   2.54t
  root systemvg -wi-ao---- 178.70g
  swap systemvg -wi-ao----  16.00g

If more information is needed please let me know.

Thanks,
Eryu

[-- Attachment #2: bashmemory.sh --]
[-- Type: application/x-sh, Size: 917 bytes --]

[-- Attachment #3: console.log --]
[-- Type: text/plain, Size: 20067 bytes --]

[ 2872.431497] run test /mnt/tests/kernel/filesystems/ext4/576202-bashmemory 
[-- MARK -- Fri Aug 26 16:20:00 2016] 
[ 3073.429520] INFO: task bash-shared-map:9527 blocked for more than 120 seconds. 
[ 3073.437589]       Not tainted 4.8.0-rc3 #1 
[ 3073.442163] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3073.450903] bash-shared-map D ffff88036e207c30     0  9527   9522 0x00000080 
[ 3073.458795]  ffff88036e207c30 ffff880150f34000 ffff880457f55a00 ffff880457f55a00 
[ 3073.467090]  ffff88036e208000 ffff8801ad37e790 ffff8801ad37e790 ffffffff00000000 
[ 3073.475386]  ffff8801ad37e7a8 ffff88036e207c48 ffffffff816f49e5 ffff880457f55a00 
[ 3073.483683] Call Trace: 
[ 3073.486419]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3073.491961]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3073.499157]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3073.506654]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3073.512459]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3073.520578]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3073.527883]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3073.533829]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3073.539578]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3073.546593]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3073.552540]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3073.558671]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3073.565880] INFO: task bash-shared-map:9532 blocked for more than 120 seconds. 
[ 3073.573950]       Not tainted 4.8.0-rc3 #1 
[ 3073.578531] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3073.587282] bash-shared-map D ffff88037663fc30     0  9532   9516 0x00000080 
[ 3073.595190]  ffff88037663fc30 ffff88037663fc40 ffff880459ff0000 ffff88037663fc30 
[ 3073.603520]  ffff880376640000 ffff8802a0f514d0 ffff8802a0f514d0 ffffffff00000000 
[ 3073.611820]  ffff8802a0f514e8 ffff88037663fc48 ffffffff816f49e5 ffff880459ff0000 
[ 3073.620112] Call Trace: 
[ 3073.622848]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3073.628390]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3073.635580]  [<ffffffff816f71fe>] ? rwsem_down_read_failed+0x10e/0x160 
[ 3073.642869]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3073.650352]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3073.656122]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3073.664201]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3073.671490]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3073.677420]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3073.683149]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3073.690151]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3073.696080]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3073.702204]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3073.709414] INFO: task bash-shared-map:9535 blocked for more than 120 seconds. 
[ 3073.717492]       Not tainted 4.8.0-rc3 #1 
[ 3073.722077] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3073.730822] bash-shared-map D ffff8801ca7dfc30     0  9535   9531 0x00000080 
[ 3073.738702]  ffff8801ca7dfc30 ffff8801536ca000 ffff880468985a00 ffff880468985a00 
[ 3073.746998]  ffff8801ca7e0000 ffff8801ad37e3d0 ffff8801ad37e3d0 ffffffff00000000 
[ 3073.755295]  ffff8801ad37e3e8 ffff8801ca7dfc48 ffffffff816f49e5 ffff880468985a00 
[ 3073.763593] Call Trace: 
[ 3073.766326]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3073.771871]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3073.779062]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3073.786543]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3073.792311]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3073.800391]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3073.807680]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3073.813630]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3073.819389]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3073.826390]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3073.832321]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3073.838449]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3073.845643] INFO: task bash-shared-map:9538 blocked for more than 120 seconds. 
[ 3073.853706]       Not tainted 4.8.0-rc3 #1 
[ 3073.858277] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3073.867018] bash-shared-map D ffff88037338fc30     0  9538   9531 0x00000080 
[ 3073.874909]  ffff88037338fc30 ffff8801536ca000 ffff88045a3fc380 ffff88045a3fc380 
[ 3073.883206]  ffff880373390000 ffff8801ad37e3d0 ffff8801ad37e3d0 ffffffff00000000 
[ 3073.891502]  ffff8801ad37e3e8 ffff88037338fc48 ffffffff816f49e5 ffff88045a3fc380 
[ 3073.899798] Call Trace: 
[ 3073.902529]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3073.908073]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3073.915266]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3073.922768]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3073.928556]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3073.936668]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3073.943969]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3073.949904]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3073.955640]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3073.962634]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3073.968564]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3073.974689]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3073.981884] INFO: task bash-shared-map:9539 blocked for more than 120 seconds. 
[ 3073.989947]       Not tainted 4.8.0-rc3 #1 
[ 3073.994516] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3074.003256] bash-shared-map D ffff880386eefc30     0  9539   9532 0x00000080 
[ 3074.011160]  ffff880386eefc30 ffff880386eefc40 ffff8803baed5a00 ffff880386eefc30 
[ 3074.019457]  ffff880386ef0000 ffff8802a0f514d0 ffff8802a0f514d0 ffffffff00000000 
[ 3074.027752]  ffff8802a0f514e8 ffff880386eefc48 ffffffff816f49e5 ffff8803baed5a00 
[ 3074.036047] Call Trace: 
[ 3074.038783]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3074.044325]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3074.051516]  [<ffffffff816f71fe>] ? rwsem_down_read_failed+0x10e/0x160 
[ 3074.058806]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3074.066293]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3074.072066]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3074.080146]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3074.087437]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3074.093366]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3074.099103]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3074.106099]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3074.112028]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3074.118154]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3074.125348] INFO: task bash-shared-map:9546 blocked for more than 120 seconds. 
[ 3074.133411]       Not tainted 4.8.0-rc3 #1 
[ 3074.137985] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3074.146724] bash-shared-map D ffff8801b9e2bc30     0  9546   9531 0x00000080 
[ 3074.154615]  ffff8801b9e2bc30 ffff8801536ca000 ffff88045aa30000 ffff88045aa30000 
[ 3074.162912]  ffff8801b9e2c000 ffff8801ad37e3d0 ffff8801ad37e3d0 ffffffff00000000 
[ 3074.171207]  ffff8801ad37e3e8 ffff8801b9e2bc48 ffffffff816f49e5 ffff88045aa30000 
[ 3074.179505] Call Trace: 
[ 3074.182236]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3074.187778]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3074.194968]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3074.202441]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3074.208194]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3074.216273]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3074.223563]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3074.229494]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3074.235231]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3074.242227]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3074.248165]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3074.254292]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3074.261484] INFO: task bash-shared-map:9550 blocked for more than 120 seconds. 
[ 3074.269545]       Not tainted 4.8.0-rc3 #1 
[ 3074.274116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3074.282855] bash-shared-map D ffff8801c962bc30     0  9550   9531 0x00000080 
[ 3074.290745]  ffff8801c962bc30 ffff8801536ca000 ffff88045aa35a00 ffff88045aa35a00 
[ 3074.299040]  ffff8801c962c000 ffff8801ad37e3d0 ffff8801ad37e3d0 ffffffff00000000 
[ 3074.307335]  ffff8801ad37e3e8 ffff8801c962bc48 ffffffff816f49e5 ffff88045aa35a00 
[ 3074.315632] Call Trace: 
[ 3074.318362]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3074.323904]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3074.331095]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3074.338577]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3074.344346]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3074.352424]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3074.359713]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3074.365660]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3074.371417]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3074.378438]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3074.384374]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3074.390503]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3074.397698] INFO: task bash-shared-map:9562 blocked for more than 120 seconds. 
[ 3074.405759]       Not tainted 4.8.0-rc3 #1 
[ 3074.410331] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3074.419071] bash-shared-map D ffff880198ab3c30     0  9562   9527 0x00000080 
[ 3074.426960]  ffff880198ab3c30 ffff880150f34000 ffff88038d37ad00 ffff88038d37ad00 
[ 3074.435256]  ffff880198ab4000 ffff8801ad37e790 ffff8801ad37e790 ffffffff00000000 
[ 3074.443554]  ffff8801ad37e7a8 ffff880198ab3c48 ffffffff816f49e5 ffff88038d37ad00 
[ 3074.451849] Call Trace: 
[ 3074.454581]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3074.460141]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3074.467339]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3074.474818]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3074.480586]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3074.488665]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3074.495955]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3074.501884]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3074.507623]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3074.514618]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3074.520549]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3074.526674]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3074.533866] INFO: task bash-shared-map:9563 blocked for more than 120 seconds. 
[ 3074.541928]       Not tainted 4.8.0-rc3 #1 
[ 3074.546500] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3074.555241] bash-shared-map D ffff880198a33c30     0  9563   9527 0x00000080 
[ 3074.563133]  ffff880198a33c30 ffff880150f34000 ffff88038d37c380 ffff88038d37c380 
[ 3074.571428]  ffff880198a34000 ffff8801ad37e790 ffff8801ad37e790 ffffffff00000000 
[ 3074.579738]  ffff8801ad37e7a8 ffff880198a33c48 ffffffff816f49e5 ffff88038d37c380 
[ 3074.588039] Call Trace: 
[ 3074.590774]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3074.596317]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3074.603507]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3074.610989]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3074.616754]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3074.624833]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3074.632122]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3074.638054]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3074.643792]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3074.650789]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3074.656710]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3074.662825]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[ 3074.670016] INFO: task bash-shared-map:9576 blocked for more than 120 seconds. 
[ 3074.678079]       Not tainted 4.8.0-rc3 #1 
[ 3074.682648] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 3074.691389] bash-shared-map D ffff88037066bc30     0  9576   9534 0x00000080 
[ 3074.699279]  ffff88037066bc30 ffff8801a0df7000 ffff88032eaf8000 ffff88032eaf8000 
[ 3074.707577]  ffff88037066c000 ffff8802a0f51890 ffff8802a0f51890 ffffffff00000000 
[ 3074.715874]  ffff8802a0f518a8 ffff88037066bc48 ffffffff816f49e5 ffff88032eaf8000 
[ 3074.724165] Call Trace: 
[ 3074.726895]  [<ffffffff816f49e5>] schedule+0x35/0x80 
[ 3074.732437]  [<ffffffff816f7468>] rwsem_down_write_failed+0x218/0x390 
[ 3074.739628]  [<ffffffff8126298e>] ? block_commit_write+0xe/0x20 
[ 3074.746237]  [<ffffffff8136b4d7>] call_rwsem_down_write_failed+0x17/0x30 
[ 3074.753719]  [<ffffffff816f6add>] down_write+0x2d/0x40 
[ 3074.759472]  [<ffffffffa02dc974>] xfs_file_buffered_aio_write+0x64/0x260 [xfs] 
[ 3074.767552]  [<ffffffffa02dcc5d>] xfs_file_write_iter+0xed/0x130 [xfs] 
[ 3074.774833]  [<ffffffff8122a433>] __vfs_write+0xe3/0x160 
[ 3074.780762]  [<ffffffff8122b682>] vfs_write+0xb2/0x1b0 
[ 3074.786499]  [<ffffffff8100365d>] ? syscall_trace_enter+0x1dd/0x2c0 
[ 3074.793514]  [<ffffffff8122cc77>] SyS_pwrite64+0x87/0xb0 
[ 3074.799460]  [<ffffffff81003a47>] do_syscall_64+0x67/0x160 
[ 3074.805589]  [<ffffffff816f8c61>] entry_SYSCALL64_slow_path+0x25/0x25 
[-- MARK -- Fri Aug 26 16:25:00 2016] 
[-- MARK -- Fri Aug 26 16:30:00 2016] 
[-- MARK -- Fri Aug 26 16:35:00 2016] 
[ 3909.818209] perf: interrupt took too long (2511 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 
[-- MARK -- Fri Aug 26 16:40:00 2016] 
[-- MARK -- Fri Aug 26 16:45:00 2016] 
[-- MARK -- Fri Aug 26 16:50:00 2016] 
[-- MARK -- Fri Aug 26 16:55:00 2016] 
[-- MARK -- Fri Aug 26 17:00:00 2016] 
[-- MARK -- Fri Aug 26 17:05:00 2016] 
[-- MARK -- Fri Aug 26 17:10:00 2016] 
[-- MARK -- Fri Aug 26 17:15:00 2016] 
[-- MARK -- Fri Aug 26 17:20:00 2016] 
[-- MARK -- Fri Aug 26 17:25:00 2016] 
[-- MARK -- Fri Aug 26 17:30:00 2016] 
[-- MARK -- Fri Aug 26 17:35:00 2016] 
[-- MARK -- Fri Aug 26 17:40:00 2016] 
[-- MARK -- Fri Aug 26 17:45:00 2016] 
[ 8179.104300] perf: interrupt took too long (3151 > 3138), lowering kernel.perf_event_max_sample_rate to 63000 
[-- MARK -- Fri Aug 26 17:50:00 2016] 
[-- MARK -- Fri Aug 26 17:55:00 2016] 
[ 8841.923617] XFS (sda6): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c.  Caller xfs_iomap_write_allocate+0x2d7/0x380 [xfs] 
[ 8841.938286] CPU: 3 PID: 56 Comm: kswapd0 Not tainted 4.8.0-rc3 #1 
[ 8841.945073] Hardware name: IBM IBM System x3550 M4 Server -[7914I21]-/00J6242, BIOS -[D7E120CUS-1.20]- 08/23/2012 
[ 8841.956526]  0000000000000286 00000000c8d39410 ffff88046890b7a8 ffffffff8135c53c 
[ 8841.964818]  ffff8804144e4cb0 0000000000000001 ffff88046890b7c0 ffffffffa02d99cb 
[ 8841.973116]  ffffffffa02e5537 ffff88046890b7e8 ffffffffa02f53e6 ffff8801ad37e580 
[ 8841.981402] Call Trace: 
[ 8841.984134]  [<ffffffff8135c53c>] dump_stack+0x63/0x87 
[ 8841.989900]  [<ffffffffa02d99cb>] xfs_error_report+0x3b/0x40 [xfs] 
[ 8841.996813]  [<ffffffffa02e5537>] ? xfs_iomap_write_allocate+0x2d7/0x380 [xfs] 
[ 8842.004891]  [<ffffffffa02f53e6>] xfs_trans_cancel+0xb6/0xe0 [xfs] 
[ 8842.011803]  [<ffffffffa02e5537>] xfs_iomap_write_allocate+0x2d7/0x380 [xfs] 
[ 8842.019684]  [<ffffffffa02cf949>] xfs_map_blocks+0x1a9/0x220 [xfs] 
[ 8842.026593]  [<ffffffffa02d0c5b>] xfs_do_writepage+0x16b/0x560 [xfs] 
[ 8842.033695]  [<ffffffffa02d108b>] xfs_vm_writepage+0x3b/0x70 [xfs] 
[ 8842.040584]  [<ffffffff811b00dd>] pageout.isra.41+0x18d/0x2d0 
[ 8842.046993]  [<ffffffff811b1f3a>] shrink_page_list+0x78a/0x9b0 
[ 8842.053501]  [<ffffffff811b293d>] shrink_inactive_list+0x21d/0x570 
[ 8842.060396]  [<ffffffff811b350e>] shrink_node_memcg+0x51e/0x7d0 
[ 8842.067000]  [<ffffffff810a3900>] ? workqueue_congested+0x70/0x90 
[ 8842.073799]  [<ffffffff810a4862>] ? __queue_work+0x142/0x420 
[ 8842.080112]  [<ffffffff810a4862>] ? __queue_work+0x142/0x420 
[ 8842.086425]  [<ffffffff811b38a1>] shrink_node+0xe1/0x310 
[ 8842.092351]  [<ffffffff811b48d1>] kswapd+0x301/0x6f0 
[ 8842.097889]  [<ffffffff811b45d0>] ? mem_cgroup_shrink_node+0x180/0x180 
[ 8842.105172]  [<ffffffff810aca28>] kthread+0xd8/0xf0 
[ 8842.110614]  [<ffffffff816f8dbf>] ret_from_fork+0x1f/0x40 
[ 8842.116636]  [<ffffffff810ac950>] ? kthread_park+0x60/0x60 
[ 8842.122784] XFS (sda6): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa02f53ff 
[ 8842.522306] XFS (sda6): Corruption of in-memory data detected.  Shutting down filesystem 
[ 8842.531358] XFS (sda6): Please umount the filesystem and rectify the problem(s) 
[ 8842.540470] Buffer I/O error on dev sda6, logical block 56162821, lost async page write 
[ 8842.549431] audit: netlink_unicast sending to audit_pid=1123 returned error: -111 
[ 8842.549433] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=64 
[ 8842.549434] audit: audit_pid=1123 reset 
[ 8842.552890] audit: type=1701 audit(1472234261.632:184): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:syslogd_t:s0 pid=2064 comm="in:imjournal" exe="/usr/sbin/rsyslogd" sig=7 
[ 8842.552909] XFS (sda6): xfs_do_force_shutdown(0x1) called from line 203 of file fs/xfs/libxfs/xfs_defer.c.  Return address = 0xffffffffa02b5459 
[ 8842.554230] audit: type=1701 audit(1472234261.633:185): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=1 comm="systemd" exe="/usr/lib/systemd/systemd" sig=11 
[ 8842.555324] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b 
[ 8842.555324]  
[ 8842.555327] CPU: 2 PID: 1 Comm: systemd Not tainted 4.8.0-rc3 #1 
[ 8842.555327] Hardware name: IBM IBM System x3550 M4 Server -[7914I21]-/00J6242, BIOS -[D7E120CUS-1.20]- 08/23/2012 
[ 8842.555331]  0000000000000086 0000000061573c1d ffff88017e223c28 ffffffff8135c53c 
[ 8842.555332]  ffff88017e1b8b00 ffffffff81a28e80 ffff88017e223ca8 ffffffff81198014 
[ 8842.555334]  ffffffff00000010 ffff88017e223cb8 ffff88017e223c58 0000000061573c1d 
[ 8842.555334] Call Trace: 
[ 8842.555342]  [<ffffffff8135c53c>] dump_stack+0x63/0x87 
[ 8842.555346]  [<ffffffff81198014>] panic+0xeb/0x232 
[ 8842.555349]  [<ffffffff81091b0b>] do_exit+0xa1b/0xb30 
[ 8842.555350]  [<ffffffff81091c9f>] do_group_exit+0x3f/0xb0 
[ 8842.555352]  [<ffffffff8109cb8c>] get_signal+0x1cc/0x600 
[ 8842.555356]  [<ffffffff8102d1b7>] do_signal+0x37/0x6d0 
[ 8842.555359]  [<ffffffff811d3a24>] ? handle_mm_fault+0xee4/0x1300 
[ 8842.555361]  [<ffffffff81089236>] ? mm_fault_error+0x11a/0x157 
[ 8842.555365]  [<ffffffff8106b2b0>] ? __do_page_fault+0x430/0x4a0 
[ 8842.555367]  [<ffffffff810878e3>] exit_to_usermode_loop+0x59/0xa2 
[ 8842.555370]  [<ffffffff81003938>] prepare_exit_to_usermode+0x38/0x40 
[ 8842.555374]  [<ffffffff816f952f>] retint_user+0x8/0x13 
[ 8842.557801] Kernel Offset: disabled 
[ 8842.759432] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b 
[ 8842.759432]  
[-- MARK -- Fri Aug 26 18:00:00 2016] 
[-- MARK -- Fri Aug 26 18:05:00 2016] 

[-- Attachment #4: bash-shared-mapping.c --]
[-- Type: text/plain, Size: 5428 bytes --]

/*
 * bash-shared-mapping.c - Andrew Morton <andrewm@uow.edu.au>
 *
 * Create a huge and holey shared mapping, then conduct multithreaded
 * write() I/O on some of it, while truncating and expanding it.  General
 * idea is to try to force pageout activity into the file while the kernel
 * is writing to and truncating the file.  We also perform pageout of areas
 * which are subject to write() and vice versa.  All sorts of stuff.
 *
 * It is good to run a concurrent task which uses heaps of memory, to force
 * pageouts.
 *
 * A good combination on a 1gigabyte machine is:
 *
 *	bash-shared-mapping -t5 foo 1000000000 &
 *	while true
 *	do
 *		usemem 1000
 *	done
 */

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/signal.h>
#include <sys/stat.h>

long ftruncate64(unsigned int fd, loff_t length);


#ifndef O_LARGEFILE
#define O_LARGEFILE	0100000
#endif

int verbose;
char *progname;
loff_t size;
int fd;
char *filename;
void *mapped_mem;
int got_sigbus;
loff_t sigbus_offset;
int ntasks = 1;
int niters = -1;

void open_file()
{
	fd = open(filename, O_RDWR|O_LARGEFILE|O_TRUNC|O_CREAT, 0666);
	if (fd < 0) {
		fprintf(stderr, "%s: Cannot open `%s': %s\n",
			progname, filename, strerror(errno));
		exit(1);
	}
}

ssize_t my_pwrite(unsigned int fd, const char * buf, size_t count, loff_t pos)
{
	if (pos > 2000000000)
		printf("DRAT\n");
	return pwrite(fd, buf, count, pos);
}

void mmap_file(void)
{
	mapped_mem = mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (mapped_mem == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}
}

void stretch_file(loff_t size)
{
	long c = 1;
	int ret;

	if (verbose)
		printf("stretch file to %Ld\n", size);
	if ((ret = my_pwrite(fd, (const char *)&c,
				sizeof(c), size - sizeof(c))) != sizeof(c)) {
		fprintf(stderr, "%s: my_pwrite returned %d\n", __FUNCTION__, ret);
		perror("my_pwrite");
		exit(1);
	}
}	

/*
 * If another thread truncates the file, we get SIGBUS.
 * Who cares :)
 */
void sigbus(int sig)
{
	long c = 1;
	struct stat statbuf;
	int ret;
	loff_t new_len = sigbus_offset + sizeof(c);

	if (verbose)
		printf("sigbus - stretch to %Ld\n", new_len);
	got_sigbus = 1;
	/* Instantiate the file up to the sigbus address */
	if ((ret = my_pwrite(fd, (const char *)&c, sizeof(c), sigbus_offset)) != sizeof(c)) {
		fprintf(stderr, "%s: my_pwrite returned %d\n",__FUNCTION__,  ret);
		perror("sigbus my_pwrite");
	}
	if (fstat(fd, &statbuf)) {
		perror("fstat");
	}
	if (verbose)
		printf("length is now %ld\n", statbuf.st_size);
}

void set_sigbus_offset(loff_t offset)
{
	sigbus_offset = offset;
}

void install_signal_handler()
{
	signal(SIGBUS, sigbus);
}

void dirty_pages(loff_t offset, loff_t amount)
{
	long *p, val;
	loff_t idx;

	if (offset + amount > size)
		amount = size - offset;

	if (verbose)
		printf("dirty %Ld bytes at %Ld\n", amount, offset);

	val = 0;
	p = mapped_mem;
	amount /= sizeof(*p);
	offset /= sizeof(*p);
	got_sigbus = 0;
	for (idx = 0; idx < amount; idx++) {
		set_sigbus_offset((idx + offset) * sizeof(*p));
		p[idx + offset] = val++;
		if (got_sigbus) {
			if (verbose)
				printf("dirty_pages: sigbus\n");
			break;
		}
	}
}	

void write_stuff(loff_t from, loff_t to, loff_t amount)
{
	int ret;

	if (from + amount > size)
		amount = size - from;
	if (to + amount > size)
		amount = size - to;

	if (verbose)
		printf("my_pwrite %Ld bytes from %Ld to %Ld\n", amount, from, to);

	if ((ret = my_pwrite(fd, (char *)mapped_mem + from, amount, to)) != amount) {
		if (verbose)
			printf("%s: my_pwrite returned %d, not %Ld\n",__FUNCTION__, 
					ret, amount);
		if (errno == EFAULT) {
			/*
			 * It was unmapped under us
			 */
			if (verbose)
				printf("my_pwrite: EFAULT\n");
		} else if (ret < 0) {
			perror("my_pwrite");
			exit(1);
		}
	}
}

#if 1
loff_t rand_of(loff_t arg)
{
	double dret = arg;
	loff_t ret;

	dret *= drand48();
	ret = dret;
#if 0
	if (ret < 0 || ret > 0x40000000)
		printf("I goofed: %Ld\n", ret);
#endif
	return ret;
}
#else
loff_t rand_of(loff_t arg)
{
	return rand() % arg;
}
#endif

void usage(void)
{
	fprintf(stderr, "Usage: %s [-v] [-nN] [-tN] filename size-in-bytes\n", progname);
	fprintf(stderr, "      -v:         Verbose\n"); 
	fprintf(stderr, "     -nN:         Run N iterations\n"); 
	fprintf(stderr, "     -tN:         Run N tasks\n"); 
	exit(1);
}

int main(int argc, char *argv[])
{
	int c;
	int i;
	int niters = -1;

	progname = argv[0];
	while ((c = getopt(argc, argv, "vn:t:")) != -1) {
		switch (c) {
		case 'n':
			niters = strtol(optarg, NULL, 10);
			break;
		case 't':
			ntasks = strtol(optarg, NULL, 10);
			break;
		case 'v':
			verbose++;
			break;
		}
	}

	if (optind == argc)
		usage();
	filename = argv[optind++];
	if (optind == argc)
		usage();
	sscanf(argv[optind++], "%Ld", &size);
	if (optind != argc)
		usage();
	if (size < 10)
		size = 10;
	open_file();

	for (i = 1; i < ntasks; i++) {
		if (fork() == 0)
			break;
	}

	stretch_file(size);
	mmap_file();
	install_signal_handler();
	srand48(time(0) * getpid());
	srand(10 * getpid());

	while (niters--) {
		dirty_pages(rand_of(size), rand_of(size));
		write_stuff(rand_of(size), rand_of(size), rand_of(size));
		ftruncate64(fd, rand_of(size));
		stretch_file(rand_of(size) + 10);
	}
	exit(0);
}

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-08-29 10:37 BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c Eryu Guan
@ 2016-08-30  2:39 ` Dave Chinner
  2016-08-30 14:48   ` Eryu Guan
  2016-09-29  7:54   ` Eryu Guan
  2016-08-31  8:56 ` Eryu Guan
  2016-09-01 10:39 ` Eryu Guan
  2 siblings, 2 replies; 8+ messages in thread
From: Dave Chinner @ 2016-08-30  2:39 UTC (permalink / raw)
  To: Eryu Guan; +Cc: xfs

On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> Hi,
> 
> I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> kernel but not with 4.8-rc2
.....
> I attached a script too to reproduce it. Please note that the XFS
> partition needs about 40G frees space, and it may take hours to finish
> based on your memory setup on your host.

Ugh. Can you try to narrow down the cause so it takes less time to
reproduce? This is almost certainly one of two things:

	1) a ENOSPC issue where an AG is almost-but-not-quite full,
	but fixing up the freelist results in there being not enough
	blocks left to allocate the data extent; or

	2) we've split a delalloc extent so many times that we've
	run out of indirect block reservation and we hit ENOSPC as a
	result.

For the latter, I suspect a test case where we take a large delalloc
range and use sync_file_range to do single page writeback to "binary
split" the delalloc range. i.e. start with a 128MB delalloc, then
sync a 4k block at offset 64MB, then 4k at 32MB, then 16MB, then
8MB, ... all the way down to writing the first block in the file,
and also all the way up to the final block in the file.

Then write every second 4k block to cause worst-case growth of the
bmbt and hopefully exhaust the indirect block reservation for that
delalloc region...
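The offset schedule that this "binary split" produces can be sketched as a quick shell computation (sizes taken from the 128MB example above; the `binary_split_schedule` name and this script are illustrative, not part of the actual reproducer):

```shell
#!/bin/sh
# Print the page-sized sync offsets for the "binary split" pattern:
# one page at 64M first, then pairs stepping down toward the start
# and up toward the end of a 128MB delalloc range.
binary_split_schedule()
{
	file_size=$((128 * 1024 * 1024))
	block=4096

	off=$((file_size / 2))
	echo "$off"			# first sync: one 4k page at 64M
	step=$((off / 2))		# 32M
	down=$off
	up=$off
	while [ "$down" -gt 0 ]; do
		[ "$step" -lt "$block" ] && step=$block
		down=$((down - step))
		up=$((up + step))
		echo "$down"		# walk toward the first block
		echo "$up"		# walk toward the final block
		step=$((step / 2))
	done
}

binary_split_schedule
```

Feeding each printed offset to sync_file_range() (or xfs_io's sync_range) forces single-page writeback at that point, splitting the delalloc extent once per offset.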

> [root@hp-dl360g9-15 ~]# xfs_info /
> meta-data=/dev/mapper/systemvg-root isize=256    agcount=16, agsize=2927744 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0        finobt=0 spinodes=0
> data     =                       bsize=4096   blocks=46843904, imaxpct=25
>          =                       sunit=64     swidth=192 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> log      =internal               bsize=4096   blocks=22912, version=2
>          =                       sectsz=512   sunit=64 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Does it reproduce on a CRC enabled filesystem?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-08-30  2:39 ` Dave Chinner
@ 2016-08-30 14:48   ` Eryu Guan
  2016-09-29  7:54   ` Eryu Guan
  1 sibling, 0 replies; 8+ messages in thread
From: Eryu Guan @ 2016-08-30 14:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Tue, Aug 30, 2016 at 12:39:05PM +1000, Dave Chinner wrote:
> > [root@hp-dl360g9-15 ~]# xfs_info /
> > meta-data=/dev/mapper/systemvg-root isize=256    agcount=16, agsize=2927744 blks
> >          =                       sectsz=512   attr=2, projid32bit=1
> >          =                       crc=0        finobt=0 spinodes=0
> > data     =                       bsize=4096   blocks=46843904, imaxpct=25
> >          =                       sunit=64     swidth=192 blks
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> > log      =internal               bsize=4096   blocks=22912, version=2
> >          =                       sectsz=512   sunit=64 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> Does it reproduce on a CRC enabled filesystem?

Yes, it does. And I tried to reduce the test time by reducing the
workload (fewer processes forked, 30->10; fewer iterations in each
process, 10->5) but failed. I'll continue working on that.

[root@hp-dl360g9-15 ~]# xfs_info /mnt/xfs/
meta-data=/dev/mapper/systemvg-lv50g isize=512    agcount=16, agsize=819200 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
data     =                       bsize=4096   blocks=13107200, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=6400, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[39200.052565] XFS (dm-3): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c.  Caller xfs_iomap_write_allocate+0x2d7/0x380 [xfs]
[39200.117990] CPU: 2 PID: 13175 Comm: kworker/u33:0 Tainted: G        W       4.8.0-rc3 #1
[39200.155312] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[39200.184549] Workqueue: writeback wb_workfn (flush-253:3)
[39200.208417]  0000000000000286 00000000ee09c3c6 ffff88005208b7a0 ffffffff8135c53c
[39200.241625]  ffff880316eb8bc8 0000000000000001 ffff88005208b7b8 ffffffffa02e99cb
[39200.274892]  ffffffffa02f5537 ffff88005208b7e0 ffffffffa03053e6 ffff880fe8d28f00
[39200.308115] Call Trace:
[39200.319060]  [<ffffffff8135c53c>] dump_stack+0x63/0x87
[39200.342241]  [<ffffffffa02e99cb>] xfs_error_report+0x3b/0x40 [xfs]
[39200.370008]  [<ffffffffa02f5537>] ? xfs_iomap_write_allocate+0x2d7/0x380 [xfs]
[39200.402497]  [<ffffffffa03053e6>] xfs_trans_cancel+0xb6/0xe0 [xfs]
[39200.430260]  [<ffffffffa02f5537>] xfs_iomap_write_allocate+0x2d7/0x380 [xfs]
[39200.461871]  [<ffffffffa02df949>] xfs_map_blocks+0x1a9/0x220 [xfs]
[39200.489701]  [<ffffffffa02e0c5b>] xfs_do_writepage+0x16b/0x560 [xfs]
[39200.518171]  [<ffffffff811a7d8f>] write_cache_pages+0x26f/0x510
[39200.544290]  [<ffffffff81330cbb>] ? blk_queue_bio+0x1ab/0x3a0
[39200.570043]  [<ffffffffa02e0af0>] ? xfs_vm_set_page_dirty+0x1e0/0x1e0 [xfs]
[39200.606240]  [<ffffffffa02e08e6>] xfs_vm_writepages+0xb6/0xe0 [xfs]
[39200.636151]  [<ffffffff811a8c5e>] do_writepages+0x1e/0x30
[39200.660355]  [<ffffffff8125a1f5>] __writeback_single_inode+0x45/0x330
[39200.689176]  [<ffffffff8125aa32>] writeback_sb_inodes+0x282/0x570
[39200.716875]  [<ffffffff8125adac>] __writeback_inodes_wb+0x8c/0xc0
[39200.744181]  [<ffffffff8125b066>] wb_writeback+0x286/0x320
[39200.768725]  [<ffffffff8125b889>] wb_workfn+0x109/0x3f0
[39200.792110]  [<ffffffff810a6642>] process_one_work+0x152/0x400
[39200.818480]  [<ffffffff810a6f35>] worker_thread+0x125/0x4b0
[39200.843463]  [<ffffffff810a6e10>] ? rescuer_thread+0x380/0x380
[39200.869279]  [<ffffffff810aca28>] kthread+0xd8/0xf0
[39200.891301]  [<ffffffff816f8dbf>] ret_from_fork+0x1f/0x40
[39200.916321]  [<ffffffff810ac950>] ? kthread_park+0x60/0x60
[39200.941595] XFS (dm-3): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa03053ff
[39201.097512] XFS (dm-3): Corruption of in-memory data detected.  Shutting down filesystem
[39201.137777] XFS (dm-3): Please umount the filesystem and rectify the problem(s)
[39201.170534] Buffer I/O error on dev dm-3, logical block 2653783, lost async page write
[39201.206111] Buffer I/O error on dev dm-3, logical block 2653784, lost async page write
[39201.241644] Buffer I/O error on dev dm-3, logical block 2653785, lost async page write
[39201.276902] Buffer I/O error on dev dm-3, logical block 2653786, lost async page write
[39201.312405] Buffer I/O error on dev dm-3, logical block 2653787, lost async page write
[39201.347927] Buffer I/O error on dev dm-3, logical block 2653788, lost async page write
[39201.383415] Buffer I/O error on dev dm-3, logical block 2653789, lost async page write
[39201.419254] Buffer I/O error on dev dm-3, logical block 2653790, lost async page write
[39201.454723] Buffer I/O error on dev dm-3, logical block 2653791, lost async page write
[39222.364615] XFS (dm-3): xfs_log_force: error -5 returned.
[39252.572790] XFS (dm-3): xfs_log_force: error -5 returned.
[39282.780966] XFS (dm-3): xfs_log_force: error -5 returned.

Thanks,
Eryu


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-08-29 10:37 BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c Eryu Guan
  2016-08-30  2:39 ` Dave Chinner
@ 2016-08-31  8:56 ` Eryu Guan
  2016-09-01 10:39 ` Eryu Guan
  2 siblings, 0 replies; 8+ messages in thread
From: Eryu Guan @ 2016-08-31  8:56 UTC (permalink / raw)
  To: xfs

On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> Hi,
> 
> I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> kernel but not with 4.8-rc2

Sometimes, when I lowered the stress load, I hit the following warning
instead of the fs shutdown.

[15276.032482] ------------[ cut here ]------------
[15276.055649] WARNING: CPU: 1 PID: 5535 at fs/xfs/xfs_aops.c:1069 xfs_vm_releasepage+0x106/0x130 [xfs]
[15276.101221] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ipt_REJECT nf_reject_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul iTCO_wdt glue_helper ipmi_ssif ablk_helper iTCO_vendor_support cryptd i2c_i801 hpwdt ipmi_si hpilo sg pcspkr wmi i2c_smbus ioatdma ipmi_msghandler pcc_cpufreq lpc_ich dca shpchp acpi_cpufreq acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit 
 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm tg3 uas ptp serio_raw usb_storage crc32c_intel hpsa i2c_core pps_core scsi_transport_sas fjes dm_mirror dm_region_hash dm_log dm_mod
[15276.593111] CPU: 1 PID: 5535 Comm: bash-shared-map Not tainted 4.8.0-rc3 #1
[15276.627509] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[15276.658663]  0000000000000286 00000000b9ab484d ffff88085269f500 ffffffff8135c53c
[15276.693463]  0000000000000000 0000000000000000 ffff88085269f540 ffffffff8108d661
[15276.728306]  0000042d18524440 ffffea0018524460 ffffea0018524440 ffff88085e615028
[15276.762986] Call Trace:
[15276.774250]  [<ffffffff8135c53c>] dump_stack+0x63/0x87
[15276.798320]  [<ffffffff8108d661>] __warn+0xd1/0xf0
[15276.820742]  [<ffffffff8108d79d>] warn_slowpath_null+0x1d/0x20
[15276.848141]  [<ffffffffa02c3226>] xfs_vm_releasepage+0x106/0x130 [xfs]
[15276.878802]  [<ffffffff8119a9fd>] try_to_release_page+0x3d/0x60
[15276.906568]  [<ffffffff811b1fec>] shrink_page_list+0x83c/0x9b0
[15276.933952]  [<ffffffff811b293d>] shrink_inactive_list+0x21d/0x570
[15276.962881]  [<ffffffff811b350e>] shrink_node_memcg+0x51e/0x7d0
[15276.990564]  [<ffffffff812176d7>] ? mem_cgroup_iter+0x127/0x2c0
[15277.017923]  [<ffffffff811b38a1>] shrink_node+0xe1/0x310
[15277.042940]  [<ffffffff811b3dcb>] do_try_to_free_pages+0xeb/0x370
[15277.071624]  [<ffffffff811b413f>] try_to_free_pages+0xef/0x1b0
[15277.100457]  [<ffffffff81225b96>] __alloc_pages_slowpath+0x33d/0x865
[15277.132333]  [<ffffffff811a4874>] __alloc_pages_nodemask+0x2d4/0x320
[15277.162990]  [<ffffffff811f5de8>] alloc_pages_current+0x88/0x120
[15277.191163]  [<ffffffff8119a9ae>] __page_cache_alloc+0xae/0xc0
[15277.218596]  [<ffffffff811a93c8>] __do_page_cache_readahead+0xf8/0x250
[15277.249416]  [<ffffffff81262841>] ? mark_buffer_dirty+0x91/0x120
[15277.277823]  [<ffffffff813628dd>] ? radix_tree_lookup+0xd/0x10
[15277.305062]  [<ffffffff811a9655>] ondemand_readahead+0x135/0x260
[15277.332764]  [<ffffffff811a97ec>] page_cache_async_readahead+0x6c/0x70
[15277.363440]  [<ffffffff8119e1a3>] filemap_fault+0x393/0x550
[15277.389663]  [<ffffffffa02cce3f>] xfs_filemap_fault+0x5f/0xf0 [xfs]
[15277.418997]  [<ffffffff811cda3f>] __do_fault+0x7f/0x100
[15277.443617]  [<ffffffffa02c29d4>] ? xfs_vm_set_page_dirty+0xc4/0x1e0 [xfs]
[15277.475880]  [<ffffffff811d319d>] handle_mm_fault+0x65d/0x1300
[15277.503198]  [<ffffffff8106b04b>] __do_page_fault+0x1cb/0x4a0
[15277.530218]  [<ffffffff8106b350>] do_page_fault+0x30/0x80
[15277.555708]  [<ffffffff816fa048>] page_fault+0x28/0x30
[15277.579871] ---[ end trace 5211814c2a051103 ]---

And I'm still trying to find a more reliable & efficient reproducer.

Thanks,
Eryu


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-08-29 10:37 BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c Eryu Guan
  2016-08-30  2:39 ` Dave Chinner
  2016-08-31  8:56 ` Eryu Guan
@ 2016-09-01 10:39 ` Eryu Guan
  2016-09-01 21:46   ` Dave Chinner
  2 siblings, 1 reply; 8+ messages in thread
From: Eryu Guan @ 2016-09-01 10:39 UTC (permalink / raw)
  To: xfs

On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> Hi,
> 
> I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> kernel but not with 4.8-rc2
> 
[snip]
>
> So it's likely a regression introduced in 4.8-rc3, and my bisect test
> pointed to commit 0af32fb468b4 ("xfs: fix bogus space reservation in
> xfs_iomap_write_allocate").

This might be buried in the report: I've bisected this down to

commit 0af32fb468b4a4434dd759d68611763658650b59
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Aug 17 08:30:28 2016 +1000

    xfs: fix bogus space reservation in xfs_iomap_write_allocate
    
    The space reservations was without an explaination in commit
    
        "Add error reporting calls in error paths that return EFSCORRUPTED"
    
    back in 2003.  There is no reason to reserve disk blocks in the
    transaction when allocating blocks for delalloc space as we already
    reserved the space when creating the delalloc extent.
    
    With this fix we stop running out of the reserved pool in
    generic/229, which has happened for long time with small blocksize
    file systems, and has increased in severity with the new buffered
    write path.
    
    [ dchinner: we still need to pass the block reservation into
      xfs_bmapi_write() to ensure we don't deadlock during AG selection.
      See commit dbd5c8c ("xfs: pass total block res. as total
      xfs_bmapi_write() parameter") for more details on why this is
      necessary. ]
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Just to make it clearer.

Thanks,
Eryu


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-09-01 10:39 ` Eryu Guan
@ 2016-09-01 21:46   ` Dave Chinner
  0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2016-09-01 21:46 UTC (permalink / raw)
  To: Eryu Guan; +Cc: xfs

On Thu, Sep 01, 2016 at 06:39:55PM +0800, Eryu Guan wrote:
> On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> > Hi,
> > 
> > I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> > kernel but not with 4.8-rc2
> > 
> [snip]
> >
> > So it's likely a regression introduced in 4.8-rc3, and my bisect test
> > pointed to commit 0af32fb468b4 ("xfs: fix bogus space reservation in
> > xfs_iomap_write_allocate").
> 
> This might be buried in the report, I've bisected this down to
> 
> commit 0af32fb468b4a4434dd759d68611763658650b59
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Wed Aug 17 08:30:28 2016 +1000
> 
>     xfs: fix bogus space reservation in xfs_iomap_write_allocate

*nod*. I did notice that - it's what I based my previous "this is
what I think is causing the warning" email on. i.e. we're dirtying an
AGF/AGFL to update the freelist, then finding we don't have space in
the AG for the data allocation, so we hit ENOSPC and cancel a dirty
transaction.

I'm not going to back this change out right now, because it does fix
other, much easier to hit issues. I think it has probably uncovered
another "off-by-one" corner case in the "can we use the AG"
calculations, so when I get a chance I'll look through the code
and see what I can find. A faster reproducer would make that a lot
easier, so if you manage to find one, that would be very helpful.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-08-30  2:39 ` Dave Chinner
  2016-08-30 14:48   ` Eryu Guan
@ 2016-09-29  7:54   ` Eryu Guan
  2016-09-29  8:00     ` Eryu Guan
  1 sibling, 1 reply; 8+ messages in thread
From: Eryu Guan @ 2016-09-29  7:54 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, linux-xfs

On Tue, Aug 30, 2016 at 12:39:05PM +1000, Dave Chinner wrote:
> On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> > Hi,
> > 
> > I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> > kernel but not with 4.8-rc2
> .....
> > I attached a script too to reproduce it. Please note that the XFS
> > partition needs about 40G frees space, and it may take hours to finish
> > based on your memory setup on your host.
> 
> Ugh. can you try to narrow the cause so it takes less time to
> reproduce? This is almost certainly one of two things:
> 
> 	1) a ENOSPC issue where an AG is almost-but-not-quite full,
> 	but fixing up the freelist results in there being not enough
> 	blocks left to allocate the data extent; or
> 
> 	2) we've split a delalloc extent so many times that we've
> 	run out of indirect block reservation and we hit ENOSPC as a
> 	result.

[Sorry for getting back to this after such a long time.]

> 
> For the latter, I suspect a test case where we take a large delalloc
> range and use sync_file_range to do single page writeback to "binary
> split" the delalloc range. i.e. start with a 128MB delalloc, then
> sync a 4k block at offset 64MB, then 4k at 32MB, then 16MB, then
> 8MB, ... all the way down to writing the first block in the file,
> and also all the way up to the final block in the file.
> 
> Then write every second 4k block to cause worse case growth of the
> bmbt and hopefully then exhaust the indirect block reservation for
> that delalloc region...

It seems to be reproducible only on certain hosts, and I haven't been
able to work out an efficient & reliable reproducer. I tried to tune
the parameters of the bash-shared-mapping run, but failed to find an
efficient parameter combination. I also tried to write a script
(attached, not sure if it's correct) based on the second case above,
but still cannot reproduce it.

By adding debug logs, I got the following trace; it's
xfs_bmbt_alloc_block() that first returns ENOSPC:

	if (!args.wasdel && args.tp->t_blk_res == 0) {
		error = -ENOSPC;
		goto error0;
	}

[  783.154400] xfs_bmbt_alloc_block: set -ENOSPC and returned
[  783.183011] __xfs_btree_split: cur->bc_ops->alloc_block returned -28
[  783.214758] xfs_btree_split: workqueue result returned -28
[  783.239319] xfs_btree_make_block_unfull: xfs_btree_split returned -28
[  783.268190] xfs_btree_insrec: xfs_btree_make_block_unfull returned -28
[  783.297550] xfs_btree_insert: xfs_btree_insrec returned -28
[  783.322672] xfs_bmap_add_extent_hole_real: case 0 xfs_btree_insert returned -28
[  783.355441] xfs_bmapi_allocate: xfs_bmap_add_extent_hole_real returned -28
[  783.386272] xfs_bmapi_write: xfs_bmapi_allocate returned -28
[  783.411332] xfs_iomap_write_allocate: goto trans_cancel after xfs_bmapi_write, error=-28

Not sure if this helps.

Thanks,
Eryu


* Re: BUG: Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c
  2016-09-29  7:54   ` Eryu Guan
@ 2016-09-29  8:00     ` Eryu Guan
  0 siblings, 0 replies; 8+ messages in thread
From: Eryu Guan @ 2016-09-29  8:00 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, linux-xfs

On Thu, Sep 29, 2016 at 03:54:14PM +0800, Eryu Guan wrote:
> On Tue, Aug 30, 2016 at 12:39:05PM +1000, Dave Chinner wrote:
> > On Mon, Aug 29, 2016 at 06:37:54PM +0800, Eryu Guan wrote:
> > > Hi,
> > > 
> > > I've hit an XFS internal error then filesystem shutdown with 4.8-rc3
> > > kernel but not with 4.8-rc2
> > .....
> > > I attached a script too to reproduce it. Please note that the XFS
> > > partition needs about 40G frees space, and it may take hours to finish
> > > based on your memory setup on your host.
> > 
> > Ugh. can you try to narrow the cause so it takes less time to
> > reproduce? This is almost certainly one of two things:
> > 
> > 	1) a ENOSPC issue where an AG is almost-but-not-quite full,
> > 	but fixing up the freelist results in there being not enough
> > 	blocks left to allocate the data extent; or
> > 
> > 	2) we've split a delalloc extent so many times that we've
> > 	run out of indirect block reservation and we hit ENOSPC as a
> > 	result.
> 
> [Sorry for getting back on this after so long time..]
> 
> > 
> > For the latter, I suspect a test case where we take a large delalloc
> > range and use sync_file_range to do single page writeback to "binary
> > split" the delalloc range. i.e. start with a 128MB delalloc, then
> > sync a 4k block at offset 64MB, then 4k at 32MB, then 16MB, then
> > 8MB, ... all the way down to writing the first block in the file,
> > and also all the way up to the final block in the file.
> > 
> > Then write every second 4k block to cause worse case growth of the
> > bmbt and hopefully then exhaust the indirect block reservation for
> > that delalloc region...
> 
> Seems it's only reproducible on certain hosts, and I haven't been able
> to work out an efficient & reliable reproducer. I tried to tuned the
> parameters of bash-shared-mapping run, but failed to find a efficient
> parameter conbination. Also tried to write a script (attached, not sure
> if it's correct) based on the second case above, but still cannot
> reproduce it.

Test script:

#!/bin/bash
dev=/dev/mapper/systemvg-lv50g
mnt=/mnt/xfs
testfile=/mnt/xfs/testfile

umount $dev
mount $dev $mnt

rm -f $testfile.*

do_test()
{
        local testfile=$1
        echo "pwrite -b 128M 0 128M && sync_range 64M 4k"
        #xfs_io -fc "falloc 0 128M" -c "sync_range 64M 4k" $testfile >/dev/null
        #xfs_io -fc "pwrite $((128*1024*1024 - 4096)) 4096" -c "sync_range 64M 4k" $testfile >/dev/null
        xfs_io -fc "pwrite -b 128M 0 128M" -c "sync_range 64M 4k" $testfile >/dev/null
        step=33554432           # 32M
        off_down=67108864       # 64M
        off_up=67108864         # 64M
        while [ $off_down -gt 0 ]; do
                if [ $step -lt 4096 ]; then
                        step=4096
                fi
                off_down=$((off_down - step))
                off_up=$((off_up + step))
                step=$((step / 2))
                echo "sync_range $off_down 4k && sync_range $off_up 4k"
                xfs_io -c "sync_range $off_down 4k" -c "sync_range $off_up 4k" $testfile
        done

        offset=4096
        while [ $offset -lt 134217728 ]; do
        #       echo "pwrite $offset 4k"
                xfs_io -c "pwrite $offset 4k" $testfile >/dev/null
                offset=$((offset + 8192))
        done
}

for i in `seq 1 10`; do
        do_test $testfile.$i &
done
wait

