* XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
@ 2012-02-29 18:20 Rafał Kupka
2012-03-02 10:43 ` Christoph Hellwig
0 siblings, 1 reply; 12+ messages in thread
From: Rafał Kupka @ 2012-02-29 18:20 UTC (permalink / raw)
To: xfs
Hello,
Spotted on two different machines (one is a KVM guest) during xfsdump
runs combined with some filesystem load.
I can reproduce it by doing
while : ; do xfsdump -f /dev/null -l0 -p1 -J -L "test" -M "test" /; done
simultaneously with some postmark benchmarks (any read/write load would
trigger it).
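For convenience, the recipe can be wrapped in a small harness like the
sketch below; the iteration count, the log file and the RUN=1 guard are
additions for illustration (the original command just loops forever
while postmark runs alongside):

```shell
#!/bin/sh
# Sketch of the reproducer loop. ITERATIONS, reproduce.log and the RUN
# guard are additions for illustration; the original loops indefinitely
# against / while postmark generates read/write load in another shell.
ITERATIONS=${ITERATIONS:-10}
RUN=${RUN:-0}    # set RUN=1 on a scratch VM to actually execute xfsdump

: > reproduce.log
i=0
while [ "$i" -lt "$ITERATIONS" ]; do
    if [ "$RUN" = 1 ]; then
        xfsdump -f /dev/null -l0 -p1 -J -L "test" -M "test" /
    else
        echo "would run: xfsdump -f /dev/null -l0 -p1 -J -L test -M test /" >> reproduce.log
    fi
    i=$((i + 1))
done
echo "queued $i iterations (see reproduce.log)"
```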
Bad kernels:
deb 3.2.4-1~bpo60+1 Debian/squeeze with backports
linus git tree - 164974a8f2a482f1abcb027c6d1a89dd79b14297 (post v3.3-rc5)
Good kernel:
2.6.32 (deb ver 2.6.32-41) Debian/squeeze (both amd64/i686)
linus git tree - v2.6.38
Backtrace (from kernel 164974a8f2a482f1abcb027c6d1a89dd79b14297):
Feb 29 17:01:37 goscie kernel: [ 265.409709] XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
Feb 29 17:01:37 goscie kernel: [ 265.410412] ------------[ cut here ]------------
Feb 29 17:01:37 goscie kernel: [ 265.410749] kernel BUG at fs/xfs/xfs_message.c:101!
Feb 29 17:01:37 goscie kernel: [ 265.411124] invalid opcode: 0000 [#1] SMP
Feb 29 17:01:37 goscie kernel: [ 265.411535] CPU 0
Feb 29 17:01:37 goscie kernel: [ 265.411625] Modules linked in: loop processor i2c_piix4 button i2c_core psmouse serio_raw tpm_tis tpm tpm_bios evdev pcspkr thermal_sys xfs sg ata_generic ata_piix libata scsi_mod virtio_net [last unloaded: scsi_wait_scan]
Feb 29 17:01:37 goscie kernel: [ 265.412034]
Feb 29 17:01:37 goscie kernel: [ 265.412034] Pid: 1465, comm: xfsdump Not tainted 3.3.0-rc5xfsdump1+ #1 Bochs Bochs
Feb 29 17:01:37 goscie kernel: [ 265.412034] RIP: 0010:[<ffffffffa00913aa>] [<ffffffffa00913aa>] assfail+0x1d/0x23 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] RSP: 0018:ffff880009ee5938 EFLAGS: 00010296
Feb 29 17:01:37 goscie kernel: [ 265.412034] RAX: 000000000000006e RBX: ffff8800063ec980 RCX: 00000000099b099b
Feb 29 17:01:37 goscie kernel: [ 265.412034] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
Feb 29 17:01:37 goscie kernel: [ 265.412034] RBP: 0000000000028001 R08: 0000000000000000 R09: ffff88001780d000
Feb 29 17:01:37 goscie kernel: [ 265.412034] R10: 0000000000000000 R11: ffff88001780d000 R12: 0000000000000000
Feb 29 17:01:37 goscie kernel: [ 265.412034] R13: ffffffffa00dff34 R14: ffff880009ee5a10 R15: ffff88001202b980
Feb 29 17:01:37 goscie kernel: [ 265.412034] FS: 0000000000000000(0000) GS:ffff880017c00000(0063) knlGS:00000000f7639700
Feb 29 17:01:37 goscie kernel: [ 265.412034] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Feb 29 17:01:37 goscie kernel: [ 265.412034] CR2: 00000000f73a5000 CR3: 0000000014724000 CR4: 00000000000006f0
Feb 29 17:01:37 goscie kernel: [ 265.412034] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 29 17:01:37 goscie kernel: [ 265.412034] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 29 17:01:37 goscie kernel: [ 265.412034] Process xfsdump (pid: 1465, threadinfo ffff880009ee4000, task ffff88000329f120)
Feb 29 17:01:37 goscie kernel: [ 265.412034] Stack:
Feb 29 17:01:37 goscie kernel: [ 265.412034] 0000000000028001 ffffffffa0085a82 0000000000008000 ffff8800063ec980
Feb 29 17:01:37 goscie kernel: [ 265.412034] 0000000000028001 ffffffffa00860d6 0000025c00000000 0000000000000000
Feb 29 17:01:37 goscie kernel: [ 265.412034] ffff880012084800 0000000000000000 0000000000008000 ffffffffa00dff34
Feb 29 17:01:37 goscie kernel: [ 265.412034] Call Trace:
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa0085a82>] ? _xfs_buf_read+0x4b/0x86 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00860d6>] ? xfs_buf_read+0x8a/0xc0 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00dff34>] ? xfs_trans_read_buf+0x4b/0x537 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00cab65>] ? xfs_imap_to_bp+0x47/0x209 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00ccdf3>] ? xfs_iread+0x59/0x194 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa008ba2a>] ? xfs_iget+0x408/0x5ea [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa0099015>] ? kmem_zone_zalloc+0x13/0x2e [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa0090c2b>] ? xfs_bulkstat_one_int+0x9a/0x319 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00eafec>] ? xfs_bstime_store_compat+0x4d/0x4d [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00eb790>] ? xfs_bulkstat_one_compat+0x17/0x1c [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00909e0>] ? xfs_bulkstat+0x58c/0x73d [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00eb779>] ? xfs_compat_fssetdm_by_handle+0xc7/0xc7 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00ebc99>] ? xfs_compat_ioc_bulkstat+0x162/0x1cf [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00ec0b2>] ? xfs_file_compat_ioctl+0x3ac/0x486 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa00a7981>] ? xfs_bmapi_read+0x1ad/0x31a [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa0088d5f>] ? xfs_file_aio_read+0x24c/0x263 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa008b323>] ? xfs_iunlock+0xea/0x146 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffffa009600b>] ? xfs_free_eofblocks+0xa5/0x217 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffff81142259>] ? compat_sys_ioctl+0x1a5/0xe28
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffff81071d1e>] ? timekeeping_get_ns+0xd/0x2a
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffff81072986>] ? getnstimeofday+0x4d/0x80
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffff81072a06>] ? do_gettimeofday+0x10/0x2f
Feb 29 17:01:37 goscie kernel: [ 265.412034] [<ffffffff813615b6>] ? cstar_dispatch+0x7/0x21
Feb 29 17:01:37 goscie kernel: [ 265.412034] Code: 89 64 24 28 e8 86 fc ff ff 48 83 c4 68 c3 48 89 f1 41 89 d0 48 83 ec 08 48 89 fa 48 c7 c6 8d 7e 0f a0 31 ff 31 c0 e8 8d ff ff ff <0f> 0b eb fe 90 90 45 31 c0 31 c0 48 83 7f 38 00 75 67 e9 bd 00
Feb 29 17:01:37 goscie kernel: [ 265.412034] RIP [<ffffffffa00913aa>] assfail+0x1d/0x23 [xfs]
Feb 29 17:01:37 goscie kernel: [ 265.412034] RSP <ffff880009ee5938>
Feb 29 17:01:37 goscie kernel: [ 265.445134] ---[ end trace 1e67152b46e37c22 ]---
I'm trying to bisect, but it will take a while.
root@goscie:~# xfs_info /
meta-data=/dev/disk/by-uuid/07910403-cd4b-4388-bab2-df7372aa7227 isize=256 agcount=9, agsize=114816 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=983552, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
root@goscie:~# grep xfs /proc/mounts
/dev/disk/by-uuid/07910403-cd4b-4388-bab2-df7372aa7227 / xfs rw,noatime,attr2,noquota 0 0
Regards,
Kupson
PS: Please do CC: me. I'm not subscribed.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-02-29 18:20 XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598 Rafał Kupka
@ 2012-03-02 10:43 ` Christoph Hellwig
2012-03-03 11:37 ` Rafał Kupka
From: Christoph Hellwig @ 2012-03-02 10:43 UTC (permalink / raw)
To: Rafał Kupka; +Cc: xfs
From the trace this looks like we got incorrect results from xfs_imap,
which got some fairly significant changes between the two kernels
you tested. Can you check if 2.6.36 shows this issue? It's the first
kernel that had these big changes in inode lookup.
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-02 10:43 ` Christoph Hellwig
@ 2012-03-03 11:37 ` Rafał Kupka
2012-03-03 11:44 ` Christoph Hellwig
From: Rafał Kupka @ 2012-03-03 11:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Fri, 2012-03-02 at 05:43 -0500, Christoph Hellwig wrote:
Hello,
> From the trace this looks like we got incorrect results from xfs_imap,
> which got some fairly significant changes between the two kernels
> you tested. Can you check if 2.6.36 shows this issue? It's the first
> kernel that had these big changes in inode lookup.
I've tested more kernels; here are the results (Linus git tree, no
additional patches):
- v2.6.36 -- good
- v2.6.38 -- good
- v2.6.39 -- bad, I/O error log entry and locked processes:
Mar 2 20:13:58 goscie kernel: [ 431.392301] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0xe6700 ("xfs_trans_read_buf") error 5 buf count 8192
Mar 2 20:13:58 goscie kernel: [ 431.427181] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
Mar 2 20:13:59 goscie kernel: [ 433.086049] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0xe6700 ("xfs_trans_read_buf") error 5 buf count 8192
Mar 2 20:13:59 goscie kernel: [ 433.088638] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
Mar 2 20:16:47 goscie kernel: [ 600.584162] INFO: task cron:718 blocked for more than 120 seconds.
Mar 2 20:16:47 goscie kernel: [ 600.585520] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 2 20:16:47 goscie kernel: [ 600.587651] cron D ffff8800174d2180 0 718 1 0x00020000
Mar 2 20:16:47 goscie kernel: [ 600.589632] ffff8800174d2180 0000000000000082 ffffffff8100e0d1 ffff88001822c7a0
Mar 2 20:16:47 goscie kernel: [ 600.592651] 0000000000013d40 ffff880013831fd8 ffff880013831fd8 0000000000013d40
Mar 2 20:16:47 goscie kernel: [ 600.595718] ffff8800174d2180 ffff880013830010 ffff88001fc13d40 ffff88001821ed00
Mar 2 20:16:47 goscie kernel: [ 600.598314] Call Trace:
Mar 2 20:16:47 goscie kernel: [ 600.598640] [<ffffffff8100e0d1>] ? sched_clock+0x5/0x8
Mar 2 20:16:47 goscie kernel: [ 600.599099] [<ffffffff8139166c>] ? schedule_timeout+0x2d/0xd7
Mar 2 20:16:47 goscie kernel: [ 600.599503] [<ffffffff81391fab>] ? __down_common+0x92/0xe3
Mar 2 20:16:47 goscie kernel: [ 600.599935] [<ffffffff811d7ba0>] ? _xfs_buf_find+0x1ed/0x2c8
Mar 2 20:16:47 goscie kernel: [ 600.600383] [<ffffffff8105f3bc>] ? down+0x27/0x37
Mar 2 20:16:47 goscie kernel: [ 600.600782] [<ffffffff811d6365>] ? xfs_buf_lock+0x64/0xb0
Mar 2 20:16:47 goscie kernel: [ 600.601179] [<ffffffff811d7ba0>] ? _xfs_buf_find+0x1ed/0x2c8
Mar 2 20:16:47 goscie kernel: [ 600.601581] [<ffffffff811d7cd6>] ? xfs_buf_get+0x5b/0x14f
Mar 2 20:16:47 goscie kernel: [ 600.601986] [<ffffffff811d7de0>] ? xfs_buf_read+0x16/0xbf
Mar 2 20:16:47 goscie kernel: [ 600.602360] [<ffffffff811ce10a>] ? xfs_trans_read_buf+0x52/0x573
Mar 2 20:16:47 goscie kernel: [ 600.602750] [<ffffffff811b65ee>] ? xfs_imap_to_bp+0x47/0x206
Mar 2 20:16:47 goscie kernel: [ 600.603171] [<ffffffff811d6c56>] ? xfs_buf_rele+0xfa/0x17a
Mar 2 20:16:47 goscie kernel: [ 600.603567] [<ffffffff811b9531>] ? xfs_iread+0x59/0x1cc
Mar 2 20:16:47 goscie kernel: [ 600.603983] [<ffffffff811b3727>] ? xfs_iget+0x3e4/0x611
Mar 2 20:16:47 goscie kernel: [ 600.612400] [<ffffffff811d0d80>] ? xfs_lookup+0xaa/0xd9
Mar 2 20:16:47 goscie kernel: [ 600.612757] [<ffffffff811dc6f3>] ? xfs_vn_lookup+0x3d/0x7c
Mar 2 20:16:47 goscie kernel: [ 600.613142] [<ffffffff810fed3f>] ? d_alloc_and_lookup+0x43/0x60
Mar 2 20:16:47 goscie kernel: [ 600.613611] [<ffffffff8110061c>] ? walk_component+0x161/0x30c
Mar 2 20:16:47 goscie kernel: [ 600.613967] [<ffffffff81101683>] ? path_lookupat+0xad/0x33f
Mar 2 20:16:47 goscie kernel: [ 600.614398] [<ffffffff81033b5a>] ? update_cfs_shares+0x7c/0x149
Mar 2 20:16:47 goscie kernel: [ 600.614779] [<ffffffff81101933>] ? do_path_lookup+0x1e/0x9a
Mar 2 20:16:47 goscie kernel: [ 600.615125] [<ffffffff810ff44d>] ? getname_flags+0x14f/0x1c6
Mar 2 20:16:47 goscie kernel: [ 600.615496] [<ffffffff811023df>] ? user_path_at+0x48/0x7a
Mar 2 20:16:47 goscie kernel: [ 600.615857] [<ffffffff8105e7f5>] ? hrtimer_cancel+0xc/0x16
Mar 2 20:16:47 goscie kernel: [ 600.616220] [<ffffffff81391de2>] ? do_nanosleep+0x77/0xae
Mar 2 20:16:47 goscie kernel: [ 600.616583] [<ffffffff811078e7>] ? dput+0x84/0xe8
Mar 2 20:16:47 goscie kernel: [ 600.616909] [<ffffffff810fa00b>] ? vfs_fstatat+0x43/0x70
Mar 2 20:16:47 goscie kernel: [ 600.617247] [<ffffffff8102dbc0>] ? sys32_stat64+0x11/0x29
Mar 2 20:16:47 goscie kernel: [ 600.617621] [<ffffffff81399180>] ? cstar_dispatch+0x7/0x2e
Mar 2 20:16:47 goscie kernel: [ 600.617984] INFO: task xfsdump:1849 blocked for more than 120 seconds.
Mar 2 20:16:47 goscie kernel: [ 600.618346] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 2 20:16:47 goscie kernel: [ 600.618964] xfsdump D ffff8800178936e0 0 1849 1826 0x00020000
Mar 2 20:16:47 goscie kernel: [ 600.619448] ffff8800178936e0 0000000000000086 ffff880000000000 ffffffff8160b020
Mar 2 20:16:47 goscie kernel: [ 600.620276] 0000000000013d40 ffff88000d841fd8 ffff88000d841fd8 0000000000013d40
Mar 2 20:16:47 goscie kernel: [ 600.621068] ffff8800178936e0 ffff88000d840010 ffffea000053c950 0000000000000008
Mar 2 20:16:47 goscie kernel: [ 600.621814] Call Trace:
Mar 2 20:16:47 goscie kernel: [ 600.622155] [<ffffffff8139166c>] ? schedule_timeout+0x2d/0xd7
Mar 2 20:16:47 goscie kernel: [ 600.622509] [<ffffffff811ce10a>] ? xfs_trans_read_buf+0x52/0x573
Mar 2 20:16:47 goscie kernel: [ 600.622888] [<ffffffff8119a361>] ? xfs_btree_check_sblock+0x64/0x15e
Mar 2 20:16:47 goscie kernel: [ 600.623308] [<ffffffff81391fab>] ? __down_common+0x92/0xe3
Mar 2 20:16:47 goscie kernel: [ 600.623650] [<ffffffff811d7ba0>] ? _xfs_buf_find+0x1ed/0x2c8
Mar 2 20:16:47 goscie kernel: [ 600.624065] [<ffffffff8105f3bc>] ? down+0x27/0x37
Mar 2 20:16:47 goscie kernel: [ 600.624398] [<ffffffff811d6365>] ? xfs_buf_lock+0x64/0xb0
Mar 2 20:16:47 goscie kernel: [ 600.624739] [<ffffffff811d7ba0>] ? _xfs_buf_find+0x1ed/0x2c8
Mar 2 20:16:47 goscie kernel: [ 600.625127] [<ffffffff811d7cd6>] ? xfs_buf_get+0x5b/0x14f
Mar 2 20:16:47 goscie kernel: [ 600.625475] [<ffffffff811d7de0>] ? xfs_buf_read+0x16/0xbf
Mar 2 20:16:47 goscie kernel: [ 600.625820] [<ffffffff811ce10a>] ? xfs_trans_read_buf+0x52/0x573
Mar 2 20:16:47 goscie kernel: [ 600.626198] [<ffffffff811b65ee>] ? xfs_imap_to_bp+0x47/0x206
Mar 2 20:16:47 goscie kernel: [ 600.626550] [<ffffffff811b9531>] ? xfs_iread+0x59/0x1cc
Mar 2 20:16:47 goscie kernel: [ 600.626888] [<ffffffff811b3727>] ? xfs_iget+0x3e4/0x611
Mar 2 20:16:47 goscie kernel: [ 600.627277] [<ffffffff811d3715>] ? kmem_alloc+0x28/0x9e
Mar 2 20:16:47 goscie kernel: [ 600.627641] [<ffffffff811bc59b>] ? xfs_bulkstat_one_int+0x9a/0x318
Mar 2 20:16:47 goscie kernel: [ 600.628000] [<ffffffff8117f3b4>] ? xfs_bstime_store_compat+0x4d/0x4d
Mar 2 20:16:47 goscie kernel: [ 600.628433] [<ffffffff8117fb58>] ? xfs_bulkstat_one_compat+0x17/0x1c
Mar 2 20:16:47 goscie kernel: [ 600.628857] [<ffffffff811bc350>] ? xfs_bulkstat+0x58c/0x73d
Mar 2 20:16:47 goscie kernel: [ 600.629220] [<ffffffff8117fb41>] ? xfs_compat_fssetdm_by_handle+0xc7/0xc7
Mar 2 20:16:47 goscie kernel: [ 600.629635] [<ffffffff8118006a>] ? xfs_compat_ioc_bulkstat+0x162/0x1cf
Mar 2 20:16:47 goscie kernel: [ 600.630080] [<ffffffff8103680c>] ? task_rq_lock+0x47/0x7a
Mar 2 20:16:47 goscie kernel: [ 600.630463] [<ffffffff81180462>] ? xfs_file_compat_ioctl+0x38b/0x465
Mar 2 20:16:47 goscie kernel: [ 600.630829] [<ffffffff81030962>] ? __wake_up_common+0x41/0x78
Mar 2 20:16:47 goscie kernel: [ 600.631177] [<ffffffff810366c2>] ? __wake_up+0x35/0x46
Mar 2 20:16:47 goscie kernel: [ 600.631537] [<ffffffff812915c0>] ? pty_write+0x48/0x53
Mar 2 20:16:47 goscie kernel: [ 600.631895] [<ffffffff8103e770>] ? perf_event_task_sched_out+0x48/0x54
Mar 2 20:16:47 goscie kernel: [ 600.632285] [<ffffffff8128cc83>] ? n_tty_write+0x322/0x360
Mar 2 20:16:47 goscie kernel: [ 600.632635] [<ffffffff8112dec3>] ? compat_sys_ioctl+0x1bb/0xfb4
Mar 2 20:16:47 goscie kernel: [ 600.632993] [<ffffffff810086b8>] ? __switch_to+0x1c9/0x288
Mar 2 20:16:47 goscie kernel: [ 600.633340] [<ffffffff810379f3>] ? pick_next_task_fair+0xf9/0x10a
Mar 2 20:16:47 goscie kernel: [ 600.633707] [<ffffffff8102602a>] ? pvclock_clocksource_read+0x46/0xb4
Mar 2 20:16:47 goscie kernel: [ 600.634079] [<ffffffff810622ff>] ? timekeeping_get_ns+0xe/0x2e
Mar 2 20:16:47 goscie kernel: [ 600.634430] [<ffffffff81062975>] ? getnstimeofday+0x4d/0x80
Mar 2 20:16:47 goscie kernel: [ 600.634778] [<ffffffff810629f5>] ? do_gettimeofday+0x10/0x2e
Mar 2 20:16:47 goscie kernel: [ 600.635136] [<ffffffff81399180>] ? cstar_dispatch+0x7/0x2e
- git revision 243b422af9ea9af4ead07a8ad54c90d4f9b6081a -- good
It's one patch before "xfs: preallocation transactions do not need to be synchronous"
- git revision c5850150d0b9ae16840c5d9846f9d5a759996a15:
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: stop using the page cache to back the buffer cache
xfs: register the inode cache shrinker before quotachecks
xfs: xfs_trans_read_buf() should return an error on failure
xfs: introduce inode cluster buffer trylocks for xfs_iflush
vmap: flush vmap aliases when mapping fails
xfs: preallocation transactions do not need to be synchronous
xfsdump hung (only xfsdump?), without any additional log entries
(CONFIG_XFS_DEBUG is set):
Mar 3 12:05:57 goscie kernel: [ 960.816216] INFO: task xfsdump:1519 blocked for more than 120 seconds.
Mar 3 12:05:57 goscie kernel: [ 960.818113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 3 12:05:57 goscie kernel: [ 960.820223] xfsdump D ffff88001e196950 0 1519 1490 0x00020000
Mar 3 12:05:57 goscie kernel: [ 960.821572] ffff88001e196950 0000000000000086 ffff880000000000 ffffffff8160b020
Mar 3 12:05:57 goscie kernel: [ 960.823618] 0000000000013d00 ffff880009903fd8 ffff880009903fd8 0000000000013d00
Mar 3 12:05:57 goscie kernel: [ 960.825707] ffff88001e196950 ffff880009902010 0000000100000000 0000000000000008
Mar 3 12:05:57 goscie kernel: [ 960.827746] Call Trace:
Mar 3 12:05:57 goscie kernel: [ 960.828672] [<ffffffff81391493>] ? schedule_timeout+0x2d/0xd7
Mar 3 12:05:57 goscie kernel: [ 960.829937] [<ffffffff811ce4d2>] ? xfs_trans_read_buf+0x52/0x573
Mar 3 12:05:57 goscie kernel: [ 960.831118] [<ffffffff8119a8a5>] ? xfs_btree_check_sblock+0x64/0x15e
Mar 3 12:05:57 goscie kernel: [ 960.831459] [<ffffffff81391dd2>] ? __down_common+0x92/0xe3
Mar 3 12:05:57 goscie kernel: [ 960.831780] [<ffffffff811d7fab>] ? _xfs_buf_find+0x1ed/0x2c8
Mar 3 12:05:57 goscie kernel: [ 960.832125] [<ffffffff8105f2fc>] ? down+0x27/0x37
Mar 3 12:05:57 goscie kernel: [ 960.832424] [<ffffffff811d6776>] ? xfs_buf_lock+0x7c/0xc8
Mar 3 12:05:57 goscie kernel: [ 960.832739] [<ffffffff811d7fab>] ? _xfs_buf_find+0x1ed/0x2c8
Mar 3 12:05:57 goscie kernel: [ 960.833068] [<ffffffff811d80e1>] ? xfs_buf_get+0x5b/0x14f
Mar 3 12:05:57 goscie kernel: [ 960.833412] [<ffffffff811d81eb>] ? xfs_buf_read+0x16/0xbf
Mar 3 12:05:57 goscie kernel: [ 960.833737] [<ffffffff811ce4d2>] ? xfs_trans_read_buf+0x52/0x573
Mar 3 12:05:57 goscie kernel: [ 960.834071] [<ffffffff811b6b2a>] ? xfs_imap_to_bp+0x47/0x206
Mar 3 12:05:57 goscie kernel: [ 960.834396] [<ffffffff811b9a6d>] ? xfs_iread+0x59/0x1cc
Mar 3 12:05:57 goscie kernel: [ 960.834706] [<ffffffff811b3c63>] ? xfs_iget+0x3e4/0x611
Mar 3 12:05:57 goscie kernel: [ 960.835022] [<ffffffff811d3b15>] ? kmem_alloc+0x60/0x9e
Mar 3 12:05:57 goscie kernel: [ 960.835338] [<ffffffff811bca6b>] ? xfs_bulkstat_one_int+0x9a/0x318
Mar 3 12:05:57 goscie kernel: [ 960.835677] [<ffffffff8117f8d4>] ? xfs_bstime_store_compat+0x4d/0x4d
Mar 3 12:05:57 goscie kernel: [ 960.836029] [<ffffffff81180078>] ? xfs_bulkstat_one_compat+0x17/0x1c
Mar 3 12:05:57 goscie kernel: [ 960.836377] [<ffffffff811bc820>] ? xfs_bulkstat+0x58c/0x73d
Mar 3 12:05:57 goscie kernel: [ 960.836697] [<ffffffff81180061>] ? xfs_compat_fssetdm_by_handle+0xc7/0xc7
Mar 3 12:05:57 goscie kernel: [ 960.837049] [<ffffffff8118058a>] ? xfs_compat_ioc_bulkstat+0x162/0x1cf
Mar 3 12:05:57 goscie kernel: [ 960.837439] [<ffffffff810b1d7a>] ? generic_file_aio_read+0x58d/0x5a2
Mar 3 12:05:57 goscie kernel: [ 960.837790] [<ffffffff81180982>] ? xfs_file_compat_ioctl+0x38b/0x465
Mar 3 12:05:57 goscie kernel: [ 960.838132] [<ffffffff811b3219>] ? xfs_iunlock+0xea/0x147
Mar 3 12:05:57 goscie kernel: [ 960.838467] [<ffffffff811d1221>] ? xfs_free_eofblocks+0xaa/0x222
Mar 3 12:05:57 goscie kernel: [ 960.838810] [<ffffffff8112e413>] ? compat_sys_ioctl+0x1bb/0xfb4
Mar 3 12:05:57 goscie kernel: [ 960.846656] [<ffffffff8102602a>] ? pvclock_clocksource_read+0x46/0xb4
Mar 3 12:05:57 goscie kernel: [ 960.847015] [<ffffffff8106222f>] ? timekeeping_get_ns+0xe/0x2e
Mar 3 12:05:57 goscie kernel: [ 960.847343] [<ffffffff810628a5>] ? getnstimeofday+0x4d/0x80
Mar 3 12:05:57 goscie kernel: [ 960.847664] [<ffffffff81062925>] ? do_gettimeofday+0x10/0x2e
Mar 3 12:05:57 goscie kernel: [ 960.847994] [<ffffffff81398fc0>] ? cstar_dispatch+0x7/0x2e
I don't know if it's the same bug - there is no assertion failure nor an
I/O error log entry...
Regards,
Rafał Kupka
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-03 11:37 ` Rafał Kupka
@ 2012-03-03 11:44 ` Christoph Hellwig
2012-03-05 21:40 ` Rafał Kupka
From: Christoph Hellwig @ 2012-03-03 11:44 UTC (permalink / raw)
To: Rafał Kupka; +Cc: Christoph Hellwig, xfs
On Sat, Mar 03, 2012 at 12:37:07PM +0100, Rafał Kupka wrote:
> On Fri, 2012-03-02 at 05:43 -0500, Christoph Hellwig wrote:
> Hello,
>
> > From the trace this looks like we got incorrect results from xfs_imap,
> > which got some fairly significant changes between the two kernels
> > you tested. Can you check if 2.6.36 shows this issue? It's the first
> > kernel that had these big changes in inode lookup.
>
> I've tested more kernels and here are results (linus git tree, no
> additional patches):
> - v2.6.36 -- good
> - v2.6.38 -- good
> - v2.6.39 -- bad, I/O error log entry and locked processes:
> - git revision 243b422af9ea9af4ead07a8ad54c90d4f9b6081a -- good
> It's one patch before "xfs: preallocation transactions do not need to be synchronous"
>
> - git revision c5850150d0b9ae16840c5d9846f9d5a759996a15:
> Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
>
> * 'for-linus' of git://oss.sgi.com/xfs/xfs:
> xfs: stop using the page cache to back the buffer cache
> xfs: register the inode cache shrinker before quotachecks
> xfs: xfs_trans_read_buf() should return an error on failure
This is the only patch in that area. Give me a bit more time to
figure out how it could be related to your issues.
> Hung xfsdump (only?) without any additional log entries
> (CONFIG_XFS_DEBUG is set):
> I don't know if it's the same bug - there is no assertion nor I/O error
> log entry...
It's the same code path, at least...
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-03 11:44 ` Christoph Hellwig
@ 2012-03-05 21:40 ` Rafał Kupka
2012-03-06 20:30 ` Dave Chinner
From: Rafał Kupka @ 2012-03-05 21:40 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Sat, 2012-03-03 at 06:44 -0500, Christoph Hellwig wrote:
> On Sat, Mar 03, 2012 at 12:37:07PM +0100, Rafał Kupka wrote:
> > On Fri, 2012-03-02 at 05:43 -0500, Christoph Hellwig wrote:
Hi,
I've completed the bisection process. It looks like the kernel up to
704b2907 "xfs: register the inode cache shrinker before quotachecks"
is good, and the first bad commit is:
0e6e847f "xfs: stop using the page cache to back the buffer cache"
I double-checked it - 704b2907 can do 60+ loops of xfsdump without any
trouble, while 0e6e847f breaks in <10 iterations.
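For anyone repeating this, the mechanics of the bisection can be sketched
with a toy repository and `git bisect run`; everything in it (the repo,
the grep-based check) is a stand-in for the real build-and-stress-test
cycle done at each step:

```shell
#!/bin/sh
# Toy illustration of git bisect: five commits, the fourth introduces a
# defect, and `git bisect run` locates it automatically. The repo and
# the grep check are stand-ins; the real bisection meant building and
# stress-testing a kernel at every step.
set -e
here=$PWD
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email bisect@example.com
git config user.name bisector
for i in 1 2 3 4 5; do
    echo "rev $i" > code.txt
    if [ "$i" -ge 4 ]; then echo "BUG" >> code.txt; fi   # defect lands in commit 4
    git add code.txt
    git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" > /dev/null
# exit 0 = good, non-zero = bad; stand-in for "does the xfsdump loop survive?"
git bisect run sh -c '! grep -q BUG code.txt' > /dev/null
git log -1 --format='first bad: %s' refs/bisect/bad | tee "$here/first_bad.txt"
cd "$here"
```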
> > - git revision c5850150d0b9ae16840c5d9846f9d5a759996a15:
> > Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
> >
> > * 'for-linus' of git://oss.sgi.com/xfs/xfs:
> > xfs: stop using the page cache to back the buffer cache
> > xfs: register the inode cache shrinker before quotachecks
> > xfs: xfs_trans_read_buf() should return an error on failure
>
> This is the only patch in that area. Give me a bit more time to
> figure out how it could be related to your issues.
Can I do anything more to help?
Regards,
Rafał Kupka
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-05 21:40 ` Rafał Kupka
@ 2012-03-06 20:30 ` Dave Chinner
2012-03-08 19:10 ` Rafał Kupka
From: Dave Chinner @ 2012-03-06 20:30 UTC (permalink / raw)
To: Rafał Kupka; +Cc: Christoph Hellwig, xfs
On Mon, Mar 05, 2012 at 10:40:12PM +0100, Rafał Kupka wrote:
> On Sat, 2012-03-03 at 06:44 -0500, Christoph Hellwig wrote:
> > On Sat, Mar 03, 2012 at 12:37:07PM +0100, Rafał Kupka wrote:
> > > On Fri, 2012-03-02 at 05:43 -0500, Christoph Hellwig wrote:
> Hi,
>
> I've completed bisection process. It looks like kernel up to
>
> 704b2907 "xfs: register the inode cache shrinker before quotachecks"
>
> is good and first bad commit is:
>
> 0e6e847f "xfs: stop using the page cache to back the buffer cache"
Which changed internals of the buffer cache, but not the external
interfaces.
> I double tested it - 704b2907 can do 60+ loops of xfsdump without any
> trouble and 0e6e847f breaks in <10 iterations.
What is your test case?
Also, it might be worthwhile to capture an event trace (e.g. all the
xfs_buf events) to see what occurs just before the error is
triggered. That might tell us exactly what sequence has occurred
leading up to the error.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-06 20:30 ` Dave Chinner
@ 2012-03-08 19:10 ` Rafał Kupka
2012-03-26 16:05 ` Rafał Kupka
From: Rafał Kupka @ 2012-03-08 19:10 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Wed, 2012-03-07 at 07:30 +1100, Dave Chinner wrote:
> On Mon, Mar 05, 2012 at 10:40:12PM +0100, Rafał Kupka wrote:
Hi,
> > is good and first bad commit is:
> >
> > 0e6e847f "xfs: stop using the page cache to back the buffer cache"
>
> Which changed internals of the buffer cache, but not the external
> interfaces.
>
> > I double tested it - 704b2907 can do 60+ loops of xfsdump without any
> > trouble and 0e6e847f breaks in <10 iterations.
>
> What is your test case?
I run xfsdump (to /dev/null) in a loop and execute a postmark benchmark
at the same time. It's a minimal Debian/Squeeze installation in a KVM
guest, but it happened on another system too: a Xen server with 512 MB
of memory in Dom0.
xfsdump -f /dev/null -l0 -p1 -J -L "test" -M "test" /
and postmark with this config file (also in a loop):
set size 500 1000000
set number 1000
set transactions 1000
run
quit
Typically, after a few iterations, xfsdump gets stuck with the log
messages mentioned in previous e-mails.
An important thing I noticed recently: I'm able to reproduce this bug
only on small-memory KVM guests (384 MB). The very same system started
with 1 GB of memory runs fine for 100+ xfsdump executions.
> Also, it might be worthwhile to capture an event trace (e.g. all the
> xfs_buf events) to see what occurs just before the error is
> triggered. That might tell us exactly what sequence has occurred
> leading up to the error.
I will try to provide this information in a few days.
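A sketch of one way to capture those events with ftrace is below; the
tracefs path and the `xfs:xfs_buf_*` event glob are assumptions about
the setup, the commented-out middle step is where the reproducer would
run, and it needs root:

```shell
#!/bin/sh
# Sketch: record the xfs_buf_* tracepoints around a reproduction run.
# Assumes debugfs is mounted at /sys/kernel/debug and the kernel was
# built with the xfs tracepoints; needs root to write to tracefs.
T=/sys/kernel/debug/tracing
if [ -w "$T/set_event" ]; then
    echo 'xfs:xfs_buf_*' > "$T/set_event"   # enable every xfs_buf event
    echo 1 > "$T/tracing_on"
    # ... run the xfsdump/postmark loop here until the error fires ...
    echo 0 > "$T/tracing_on"
    cat "$T/trace" > xfs_buf_events.txt
    echo > "$T/set_event"                   # disable the events again
    echo saved > trace_status.txt
else
    echo skipped > trace_status.txt         # not root, or no debugfs
fi
cat trace_status.txt
```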
Best Regards,
Rafal Kupka
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-08 19:10 ` Rafał Kupka
@ 2012-03-26 16:05 ` Rafał Kupka
2012-03-27 3:28 ` Dave Chinner
From: Rafał Kupka @ 2012-03-26 16:05 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Thu, 2012-03-08 at 20:10 +0100, Rafał Kupka wrote:
> On Wed, 2012-03-07 at 07:30 +1100, Dave Chinner wrote:
Hi,
I've finally captured the event trace. I hope it helps.
> > Also, it might be worthwhile to capture an event trace (e.g. all the
> > xfs_buf events) to see what occurs just before the error is
> > triggered. That might tell us exactly what sequence has occurred
> > leading up to the error.
>
> I will try to provide this information in a few days.
Kernel version - Linus tree up to:
0e6e847f "xfs: stop using the page cache to back the buffer cache"
It's a uniprocessor virtual machine (KVM) with 384 MB of RAM.
The full event trace is available at https://gist.github.com/2206109
(sorry for its *.perl extension, my mistake).
Kernel log entry:
[ 495.197571] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80 ("xfs_trans_read_buf") error 5 buf count 8192
[ 495.198283] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
[ 501.540477] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80 ("xfs_trans_read_buf") error 5 buf count 8192
[ 501.543047] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
Is it possible that this "I/O error" is caused by memory pressure?
Related trace line:
xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
Events related to "bno 0x13d80":
xfsdump-2498 [000] 24195.782266: xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-2498 [000] 24195.782266: xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf
xfsdump-2498 [000] 24195.782267: xfs_buf_iorequest: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller _xfs_buf_read
xfsdump-2498 [000] 24195.782267: xfs_buf_hold: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
xfsdump-2498 [000] 24195.782270: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
xfsdump-2498 [000] 24195.782985: xfs_buf_unlock: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_buf_relse
xfsdump-2498 [000] 24195.782986: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_trans_read_buf
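The per-buffer view above comes down to filtering the raw trace on the
block number; a minimal sketch of that step, using two lines quoted in
this message plus one made-up control line (bno 0x13f00) as input:

```shell
#!/bin/sh
# Filter an xfs_buf event trace down to one buffer (bno 0x13d80) and
# print just the event names, in order. The first two sample lines are
# copied from the trace above; the 0x13f00 line is a made-up control.
cat > sample_trace.txt <<'EOF'
xfsdump-2498 [000] 24195.782266: xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
xfsdump-2498 [000] 24195.782300: xfs_buf_rele: dev 254:2 bno 0x13f00 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_trans_read_buf
EOF
# field 4 is the event name with a trailing colon; strip it
awk '/ bno 0x13d80 / { sub(/:$/, "", $4); print $4 }' sample_trace.txt > events.txt
cat events.txt
```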
Does this data shed some light on the issue?
Regards,
Rafał Kupka
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-26 16:05 ` Rafał Kupka
@ 2012-03-27 3:28 ` Dave Chinner
2012-03-27 10:17 ` Dave Chinner
0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-03-27 3:28 UTC (permalink / raw)
To: Rafał Kupka; +Cc: Christoph Hellwig, xfs
On Mon, Mar 26, 2012 at 06:05:25PM +0200, Rafał Kupka wrote:
> On Thu, 2012-03-08 at 20:10 +0100, Rafał Kupka wrote:
> > On Wed, 2012-03-07 at 07:30 +1100, Dave Chinner wrote:
>
> Hi,
>
> Finally I've captured this event trace. Hope it will help.
>
> > > Also, it might be worthwhile to capture an event trace (e.g. all the
> > > xfs_buf events) to see what occurs just before the error is
> > > triggered. That might tell us exactly what sequence has occurred
> > > leading up to the error.
> >
> > I will try to provide this information in a few days.
>
> Kernel version - Linus tree up to:
> 0e6e847f "xfs: stop using the page cache to back the buffer cache"
>
> It's a uniprocessor virtual machine (KVM) with 384 MB of RAM.
Ok, so limited memory and no parallelism. Can you send me your
.config so I can have a look at various other config options?
> Full event trace is available at https://gist.github.com/2206109
> (sorry for its *.perl extension, my mistake).
>
> Kernel log entry:
> [ 495.197571] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80 ("xfs_trans_read_buf") error 5 buf count 8192
> [ 495.198283] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> [ 501.540477] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80 ("xfs_trans_read_buf") error 5 buf count 8192
> [ 501.543047] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
>
> Is it possible that this "I/O error" is caused by memory pressure?
>
> Related trace line:
> xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
>
> Events related to "bno 0x13d80":
> xfsdump-2498 [000] 24195.782266: xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
> xfsdump-2498 [000] 24195.782266: xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf
> xfsdump-2498 [000] 24195.782267: xfs_buf_iorequest: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller _xfs_buf_read
> xfsdump-2498 [000] 24195.782267: xfs_buf_hold: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
> xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
> xfsdump-2498 [000] 24195.782270: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
> xfsdump-2498 [000] 24195.782985: xfs_buf_unlock: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_buf_relse
> xfsdump-2498 [000] 24195.782986: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_trans_read_buf
>
> Does this data shed some light on this issue?
Yes - it errored out because the bio that was built from the
buffer was zero length. That means the buffer was not set up
properly in the first place. Hmmm, this is unusual - the two lines
above your traces:
xfs_buf_cond_lock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|ASYNC|TRYLOCK|PAGES caller _xfs_buf_find
xfs_buf_find: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf
The lookup found an uninitialised buffer in the cache. That's
exceedingly curious - the match is done on bp->b_file_offset, and
the bno is only filled out after the buffer has been inserted.
Oh, wow, I think I see the bug. The key is the initial state of that
uninitialised buffer, READ|ASYNC|TRYLOCK|PAGES, versus its state in
the xfs_buf_get() trace, READ|LOCK|DONT_BLOCK. Those two
sets of flags do not match up - the initial set indicates the buffer
was inserted into the cache for readahead, the second set
indicates a blocking read.
You've only got a small amount of memory, so I suspect that a memory
allocation has failed in a specific place to cause this. I'm not
exactly sure how the second thread trips over it and partially
initialises it (there are three initialisation steps to a buffer), but
the initial failure is probably this:
async readahead started
_xfs_buf_find fails
allocate new buffer A
partially initialise A
_xfs_buf_find
fail to find match
insert buffer A into rbtree
returns buffer A
A == A
fails allocate pages to buffer
unlock buffer
release buffer
<uninitialised buffer now sitting in cache. THIS IS WRONG!>
That would lead to a second lookup finding it and using it, but I'm
still not quite sure how the buffer bno gets initialised. I can
construct valid scenarios that will cause it to happen, but none
match the event trace you've posted.
And just looking at the way we clean up this error, this race
condition could lead to a use-after-free if we immediately free the
buffer when we fail to allocate its pages. Messy. I'll have to
think on this a bit - maybe a "broken" state is necessary.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-27 3:28 ` Dave Chinner
@ 2012-03-27 10:17 ` Dave Chinner
2012-03-30 16:02 ` Rafał Kupka
0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2012-03-27 10:17 UTC (permalink / raw)
To: Rafał Kupka; +Cc: Christoph Hellwig, xfs
On Tue, Mar 27, 2012 at 02:28:43PM +1100, Dave Chinner wrote:
> On Mon, Mar 26, 2012 at 06:05:25PM +0200, Rafał Kupka wrote:
> > On Thu, 2012-03-08 at 20:10 +0100, Rafał Kupka wrote:
> > > On Wed, 2012-03-07 at 07:30 +1100, Dave Chinner wrote:
> >
> > Hi,
> >
> > Finally I've captured this event trace. Hope it will help.
> >
> > > > Also, it might be worthwhile to capture an event trace (e.g. all the
> > > > xfs_buf events) to see what occurs just before the error is
> > > > triggered. That might tell us exactly what sequence has occurred
> > > > leading up to the error.
> > >
> > > I will try to provide this information in a few days.
> >
> > Kernel version - Linus tree up to:
> > 0e6e847f "xfs: stop using the page cache to back the buffer cache"
> >
> > It's a uniprocessor virtual machine (KVM) with 384 MB of RAM.
>
> Ok, so limited memory and no parallelism. Can you send me your
> .config so I can have a look at various other config options?
>
> > Full event trace is available at https://gist.github.com/2206109
> > (sorry for its *.perl extension, my mistake).
> >
> > Kernel log entry:
> > [ 495.197571] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80 ("xfs_trans_read_buf") error 5 buf count 8192
> > [ 495.198283] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> > [ 501.540477] XFS (vda2): I/O error occurred: meta-data dev vda2 block 0x13d80 ("xfs_trans_read_buf") error 5 buf count 8192
> > [ 501.543047] XFS (vda2): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> >
> > Is it possible that this "I/O error" is caused by memory pressure?
> >
> > Related trace line:
> > xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
> >
> > Events related to "bno 0x13d80":
> > xfsdump-2498 [000] 24195.782266: xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
> > xfsdump-2498 [000] 24195.782266: xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf
> > xfsdump-2498 [000] 24195.782267: xfs_buf_iorequest: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller _xfs_buf_read
> > xfsdump-2498 [000] 24195.782267: xfs_buf_hold: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
> > xfsdump-2498 [000] 24195.782269: xfs_buf_ioerror: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 error 5 flags READ|TRYLOCK|PAGES caller _xfs_buf_ioapply
> > xfsdump-2498 [000] 24195.782270: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 3 pincount 0 lock 0 flags READ|TRYLOCK|PAGES caller xfs_buf_iorequest
> > xfsdump-2498 [000] 24195.782985: xfs_buf_unlock: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_buf_relse
> > xfsdump-2498 [000] 24195.782986: xfs_buf_rele: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 1 flags READ|TRYLOCK|PAGES caller xfs_trans_read_buf
> >
> > Does this data shed some light on this issue?
>
> Yes - it errored out because the bio that was built from the
> buffer was zero length. That means the buffer was not set up
> properly in the first place. Hmmm, this is unusual - the two lines
> above your traces:
>
> xfs_buf_cond_lock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|ASYNC|TRYLOCK|PAGES caller _xfs_buf_find
> xfs_buf_find: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
> xfs_buf_get: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_read
> xfs_buf_read: dev 254:2 bno 0x13d80 len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_trans_read_buf
>
> The lookup found an uninitialised buffer in the cache. That's
> exceedingly curious - the match is done on bp->b_file_offset, and
> the bno is only filled out after the buffer has been inserted.
>
> Oh, wow, I think I see the bug. The key is the initial state of that
> uninitialised buffer, READ|ASYNC|TRYLOCK|PAGES, versus its state in
> the xfs_buf_get() trace, READ|LOCK|DONT_BLOCK. Those two
> sets of flags do not match up - the initial set indicates the buffer
> was inserted into the cache for readahead, the second set
> indicates a blocking read.
>
> You've only got a small amount of memory, so I suspect that a memory
> allocation has failed in a specific place to cause this. I'm not
> exactly sure how the second thread trips over it and partially
> initialises it (there are three initialisation steps to a buffer), but
> the initial failure is probably this:
>
> async readahead started
> _xfs_buf_find fails
> allocate new buffer A
> partially initialise A
> _xfs_buf_find
> fail to find match
> insert buffer A into rbtree
> returns buffer A
> A == A
> fails allocate pages to buffer
> unlock buffer
> release buffer
>
> <uninitialised buffer now sitting in cache. THIS IS WRONG!>
>
>
> That would lead to a second lookup finding it and using it, but I'm
> still not quite sure how the buffer bno gets initialised. I can
> construct valid scenarios that will cause it to happen, but none
> match the event trace you've posted.
>
> And just looking at the way we clean up this error, this race
> condition could lead to a use-after-free if we immediately free the
> buffer when we fail to allocate its pages. Messy. I'll have to
> think on this a bit - maybe a "broken" state is necessary.
So here's a patch that implements a "broken" state. It's not pretty,
but I think it will fix the race you appear to be hitting.
Can you run your test with this patch while running a trace for the
new xfs_buf_*broken events to verify you are hitting that code and
that it doesn't trigger EIO problems anymore?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
xfs: fix buffer lookup race on allocation failure
From: Dave Chinner <dchinner@redhat.com>
When memory allocation fails to add the page array or the pages to
a buffer during xfs_buf_get(), the buffer is left in the cache in a
partially initialised state. There is enough state left for the next
lookup on that buffer to find the buffer, and for the buffer to then
be used without finishing the initialisation. As a result, when an
attempt to do IO on the buffer occurs, it fails with EIO because
there are no pages attached to the buffer.
We cannot remove the buffer from the cache immediately and free it,
because there may already be a racing lookup blocked on the
buffer lock. The moment we unlock the buffer in order to free it,
the other user is woken and we have a use-after-free situation.
Hence we have to mark the buffer as "broken" and check that after we
have gained the buffer lock on a cache hit lookup. This enables
racing lookups to avoid the broken buffer and drop their references,
allowing the buffer to be freed.
This, however, doesn't solve the problem completely - there may be a
delay in the buffer getting freed (e.g. due to pre-emption). So when we
retry the lookup with a new buffer to insert into the tree and find
the broken buffer again, we drop the buffer lock, sleep for a short
while, and try the lookup again. Once the broken buffer is finally
removed from the cache we will make forward progress.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_buf.c | 33 ++++++++++++++++++++++++++++++++-
fs/xfs/xfs_buf.h | 2 ++
fs/xfs/xfs_trace.h | 2 ++
3 files changed, 36 insertions(+), 1 deletions(-)
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 6819b51..3588460 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -441,6 +441,7 @@ _xfs_buf_find(
ASSERT(!(range_base & (xfs_off_t)btp->bt_smask));
/* get tree root */
+lookup_again:
pag = xfs_perag_get(btp->bt_mount,
xfs_daddr_to_agno(btp->bt_mount, ioff));
@@ -505,6 +506,24 @@ found:
}
/*
+ * If the buffer initialisation was not completed correctly after
+ * insertion into the cache we can't use this buffer and we are pinning
+ * it preventing it from being removed from the cache. Hence just avoid
+ * the buffer if it is broken, allowing it to be reclaimed and removed
+ * from the cache. If we are here with a new buffer to insert into the
+ * cache, we want to try the lookup again so that the expected "insert
+ * on lookup failure" semantics are preserved for this case.
+ */
+ if (bp->b_flags & XBF_BROKEN) {
+ trace_xfs_buf_find_broken(bp, flags, _RET_IP_);
+ xfs_buf_relse(bp);
+ if (!new_bp)
+ return NULL;
+ delay(1);
+ goto lookup_again;
+ }
+
+ /*
* if the buffer is stale, clear all the external state associated with
* it. We need to keep flags such as how we allocated the buffer memory
* intact here.
@@ -552,8 +571,20 @@ xfs_buf_get(
if (bp == new_bp) {
error = xfs_buf_allocate_memory(bp, flags);
- if (error)
+ if (error) {
+ /*
+ * The buffer has been inserted into the cache now, so
+ * will be visible to other callers. Once we unlock the
+ * buffer, someone else can grab it out of the tree.
+ * It's a broken buffer, so we have to ensure that it is
+ * noticed - just freeing the buffer here can result in
+ * use-after-free if someone is already waiting on the
+ * buffer lock.
+ */
+ bp->b_flags |= XBF_BROKEN;
+ trace_xfs_buf_get_broken(bp, flags, _RET_IP_);
goto no_buffer;
+ }
} else
kmem_zone_free(xfs_buf_zone, new_bp);
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index df7ffb0..b4545b9 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -52,6 +52,7 @@ typedef enum {
#define XBF_DONE (1 << 5) /* all pages in the buffer uptodate */
#define XBF_DELWRI (1 << 6) /* buffer has dirty pages */
#define XBF_STALE (1 << 7) /* buffer has been staled, do not find it */
+#define XBF_BROKEN (1 << 8) /* buffer is broken, do not find it */
/* I/O hints for the BIO layer */
#define XBF_SYNCIO (1 << 10)/* treat this buffer as synchronous I/O */
@@ -79,6 +80,7 @@ typedef unsigned int xfs_buf_flags_t;
{ XBF_DONE, "DONE" }, \
{ XBF_DELWRI, "DELWRI" }, \
{ XBF_STALE, "STALE" }, \
+ { XBF_BROKEN, "BROKEN" }, \
{ XBF_SYNCIO, "SYNCIO" }, \
{ XBF_FUA, "FUA" }, \
{ XBF_FLUSH, "FLUSH" }, \
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 270775f..e6f7d75 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -386,7 +386,9 @@ DEFINE_EVENT(xfs_buf_flags_class, name, \
TP_PROTO(struct xfs_buf *bp, unsigned flags, unsigned long caller_ip), \
TP_ARGS(bp, flags, caller_ip))
DEFINE_BUF_FLAGS_EVENT(xfs_buf_find);
+DEFINE_BUF_FLAGS_EVENT(xfs_buf_find_broken);
DEFINE_BUF_FLAGS_EVENT(xfs_buf_get);
+DEFINE_BUF_FLAGS_EVENT(xfs_buf_get_broken);
DEFINE_BUF_FLAGS_EVENT(xfs_buf_read);
TRACE_EVENT(xfs_buf_ioerror,
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-27 10:17 ` Dave Chinner
@ 2012-03-30 16:02 ` Rafał Kupka
2012-04-04 9:19 ` Rafał Kupka
0 siblings, 1 reply; 12+ messages in thread
From: Rafał Kupka @ 2012-03-30 16:02 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Tue, 2012-03-27 at 21:17 +1100, Dave Chinner wrote:
Hello,
My kernel config - http://pastebin.com/QvqFVWUw
Mount options (/proc/mounts):
/dev/vda2 / xfs rw,noatime,attr2,noquota 0 0
> Can you run your test with this patch while running a trace for the
> new xfs_buf_*broken events to verify you are hitting that code and
> that it doesn't trigger EIO problems anymore?
I've applied this patch to the linus/v3.3 kernel. There are no more
I/O errors, but xfsdump still hangs.
Trace recorded just after the hang:
# tracer: nop
#
# entries-in-buffer/entries-written: 78/78 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
xfsdump-1534 [000] .... 28544.651904: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651908: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651910: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651911: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651912: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651914: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651915: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651916: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651918: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651919: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651921: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651923: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651924: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651926: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651927: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651928: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651930: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651931: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651932: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651934: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651935: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651937: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651939: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651940: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651941: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651942: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651943: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651945: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651946: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651947: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651949: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651950: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651951: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651953: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651955: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651956: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651958: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651960: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651962: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651963: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651964: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651965: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.651966: xfs_buf_get_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 1 pincount 0 lock 0 flags READ|READ_AHEAD|ASYNC|TRYLOCK|DONT_BLOCK caller xfs_buf_read
xfsdump-1534 [000] .... 28544.661785: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.661788: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.663330: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.667218: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.671297: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.675416: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.682188: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.683811: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.687418: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.693069: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.695414: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.699315: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.703417: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.707416: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.711416: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.715338: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.719413: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.723412: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.727410: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 28544.731408: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
Later, these lines keep repeating; there are no more xfs_buf_get_broken entries in the trace file:
xfsaild/vda2-156 [000] .... 29791.119322: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 29791.119332: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 29791.123296: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsaild/vda2-156 [000] .... 29791.123306: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|DONT_BLOCK caller xfs_buf_get
xfsaild/vda2-156 [000] .... 29791.127341: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|TRYLOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1534 [000] .... 29791.127351: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
Regards,
Rafal Kupka
* Re: XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598
2012-03-30 16:02 ` Rafał Kupka
@ 2012-04-04 9:19 ` Rafał Kupka
0 siblings, 0 replies; 12+ messages in thread
From: Rafał Kupka @ 2012-04-04 9:19 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Fri, 2012-03-30 at 18:02 +0200, Rafał Kupka wrote:
> On Tue, 2012-03-27 at 21:17 +1100, Dave Chinner wrote:
Hi,
I've collected more data.
> I've applied this patch to the linus/v3.3 kernel. There are no more
> I/O errors, but xfsdump still hangs.
[other details in my previous e-mail]
This new code is executed, but the broken struct xfs_buf is never
reclaimed from the cache:
xfsdump-1455 [000] .... 40888.509350: xfs_perag_get: dev 254:2 agno 1 refcount 441 caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.509355: xfs_perag_put: dev 254:2 agno 1 refcount 440 caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.509356: xfs_buf_trylock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.509357: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1455 [000] .... 40888.509358: xfs_buf_unlock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 1 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller xfs_buf_relse
xfsdump-1455 [000] .... 40888.509359: xfs_buf_rele: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 1 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.513340: xfs_perag_get: dev 254:2 agno 1 refcount 441 caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.513345: xfs_perag_put: dev 254:2 agno 1 refcount 440 caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.513346: xfs_buf_trylock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.513347: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1455 [000] .... 40888.513348: xfs_buf_unlock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 1 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller xfs_buf_relse
xfsdump-1455 [000] .... 40888.513349: xfs_buf_rele: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 1 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.517336: xfs_perag_get: dev 254:2 agno 1 refcount 441 caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.517341: xfs_perag_put: dev 254:2 agno 1 refcount 440 caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.517342: xfs_buf_trylock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller _xfs_buf_find
xfsdump-1455 [000] .... 40888.517343: xfs_buf_find_broken: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 0 flags READ|LOCK|DONT_BLOCK caller xfs_buf_get
xfsdump-1455 [000] .... 40888.517344: xfs_buf_unlock: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 1 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller xfs_buf_relse
xfsdump-1455 [000] .... 40888.517345: xfs_buf_rele: dev 254:2 bno 0xffffffffffffffff len 0x2000 hold 2 pincount 0 lock 1 flags READ|ASYNC|BROKEN|TRYLOCK|PAGES caller _xfs_buf_find
Lurking on the XFS mailing list, I see that a better fix is to never put a
half-initialized buf into the rbtree in the first place:
http://article.gmane.org/gmane.comp.file-systems.xfs.general/44356
I've tried another simple patch. If these broken XBF_READ_AHEAD buffers can
be picked up by non-read-ahead code anyway (for as long as they sit in the
rbtree), maybe it's not reasonable to treat them specially in
xfs_buf_allocate_memory().
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index e24b320..d18c070 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -341,11 +341,13 @@ use_alloc_page:
 retry:
 		page = alloc_page(gfp_mask);
 		if (unlikely(page == NULL)) {
+			/*
 			if (flags & XBF_READ_AHEAD) {
 				bp->b_page_count = i;
 				error = ENOMEM;
 				goto out_free_pages;
 			}
+			*/
 			/*
 			 * This could deadlock.
The patch's whitespace is probably mangled by Evolution, but you can see what
I've done.
XFS patched this way survived my test for a long(er) time without hangs.
The retries counter in /proc/fs/xfs/stat rises slowly, as expected:
"buf 13869637 753868 13946555 17174 8848657 92215 555 146160 90802"
There is one "XFS: possible memory allocation deadlock" message, as you can
see in the kernel logs below.
Yet there is another assertion failure (related, really?):
Apr 4 02:58:22 goscie kernel: [ 5.617480] sshd (892): /proc/892/oom_adj is deprecated, please use /proc/892/oom_score_adj instead.
Apr 4 04:35:14 goscie kernel: [ 5817.096240] XFS: possible memory allocation deadlock in xfs_buf_allocate_memory (mode:0x1200)
Apr 4 04:41:24 goscie kernel: [ 6187.984118] XFS: Assertion failed: ((iflags & BMV_IF_DELALLOC) != 0) || (map[i].br_startblock != DELAYSTARTBLOCK), file: fs/xfs/xfs_bmap.c, line: 5612
Apr 4 04:41:24 goscie kernel: [ 6187.986603] ------------[ cut here ]------------
Apr 4 04:41:24 goscie kernel: [ 6187.987608] kernel BUG at fs/xfs/xfs_message.c:101!
Apr 4 04:41:24 goscie kernel: [ 6187.988050] invalid opcode: 0000 [#1] SMP
Apr 4 04:41:24 goscie kernel: [ 6187.988050] CPU 0
Apr 4 04:41:24 goscie kernel: [ 6187.988050] Modules linked in: loop tpm_tis psmouse i2c_piix4 tpm i2c_core evdev pcspkr tpm_bios serio_raw processor thermal_sys button xfs sg ata_generic ata_piix libata scsi_mod virtio_net [last unloaded: scsi_wait_scan]
Apr 4 04:41:24 goscie kernel: [ 6187.988050]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] Pid: 3019, comm: xfsdump Not tainted 3.3.0xfsdump1-00001-g26e1ccc-dirty #4 Bochs Bochs
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RIP: 0010:[<ffffffffa009043e>] [<ffffffffa009043e>] assfail+0x1d/0x23 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RSP: 0000:ffff880011c71af8 EFLAGS: 00010296
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RAX: 00000000000000a0 RBX: ffff88000317b6a0 RCX: 0000000007b707b7
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RBP: ffff880011c71bb8 R08: 0000000000000000 R09: ffffffff8167d50a
Apr 4 04:41:24 goscie kernel: [ 6187.988050] R10: 0000000000000000 R11: ffff88001780d000 R12: 0000000000000000
Apr 4 04:41:24 goscie kernel: [ 6187.988050] R13: ffff8800153080f0 R14: ffff880003130800 R15: ffff8800031cd840
Apr 4 04:41:24 goscie kernel: [ 6187.988050] FS: 0000000000000000(0000) GS:ffff880017c00000(0063) knlGS:00000000f765c700
Apr 4 04:41:24 goscie kernel: [ 6187.988050] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
Apr 4 04:41:24 goscie kernel: [ 6187.988050] CR2: 00000000f586541c CR3: 0000000012155000 CR4: 00000000000006f0
Apr 4 04:41:24 goscie kernel: [ 6187.988050] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 4 04:41:24 goscie kernel: [ 6187.988050] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 4 04:41:24 goscie kernel: [ 6187.988050] Process xfsdump (pid: 3019, threadinfo ffff880011c70000, task ffff8800120dd750)
Apr 4 04:41:24 goscie kernel: [ 6187.988050] Stack:
Apr 4 04:41:24 goscie kernel: [ 6187.988050] ffff880011c71bb8 ffffffffa00a6fdc ffff8800031cd988 00000005115338c0
Apr 4 04:41:24 goscie kernel: [ 6187.988050] 0000001000000005 0000000000000008 ffff88000317b600 000000000000c000
Apr 4 04:41:24 goscie kernel: [ 6187.988050] ffff880015308000 0000000800000000 ffffffffa008b2d2 000000008115ecf6
Apr 4 04:41:24 goscie kernel: [ 6187.988050] Call Trace:
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa00a6fdc>] ? xfs_getbmap+0x45b/0x5ae [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa008b2d2>] ? xfs_open_by_handle+0x182/0x182 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa008b3f0>] ? xfs_ioc_getbmapx+0x4e/0x85 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa008c8f6>] ? xfs_file_ioctl+0x43f/0x751 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff810be2f2>] ? __lock_page_killable+0x5d/0x63
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff81106cc7>] ? __dentry_open+0x1d6/0x2c8
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff81105ca6>] ? fd_install+0x27/0x4f
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa008b2ad>] ? xfs_open_by_handle+0x15d/0x182 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa00eb0f7>] ? xfs_file_compat_ioctl+0x355/0x486 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa008a3b7>] ? xfs_iunlock+0xea/0x146 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffffa009509f>] ? xfs_free_eofblocks+0xa5/0x217 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff81142365>] ? compat_sys_ioctl+0x1a5/0xe28
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff81071d92>] ? timekeeping_get_ns+0xd/0x2a
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff810729fa>] ? getnstimeofday+0x4d/0x80
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff81072a7a>] ? do_gettimeofday+0x10/0x2f
Apr 4 04:41:24 goscie kernel: [ 6187.988050] [<ffffffff81361a36>] ? cstar_dispatch+0x7/0x21
Apr 4 04:41:24 goscie kernel: [ 6187.988050] Code: 89 64 24 28 e8 86 fc ff ff 48 83 c4 68 c3 48 89 f1 41 89 d0 48 83 ec 08 48 89 fa 48 c7 c6 7b 6f 0f a0 31 ff 31 c0 e8 8d ff ff ff <0f> 0b eb fe 90 90 45 31 c0 31 c0 48 83 7f 38 00 75 67 e9 bd 00
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RIP [<ffffffffa009043e>] assfail+0x1d/0x23 [xfs]
Apr 4 04:41:24 goscie kernel: [ 6187.988050] RSP <ffff880011c71af8>
Apr 4 04:41:24 goscie kernel: [ 6188.005126] ---[ end trace 00a3f5114a46a7b5 ]---
Should I investigate further, or is this related to my (broken) patch?
Best Regards,
Rafal Kupka
Thread overview: 12+ messages
2012-02-29 18:20 XFS: Assertion failed: bp->b_bn != XFS_BUF_DADDR_NULL, file: fs/xfs/xfs_buf.c, line: 598 Rafał Kupka
2012-03-02 10:43 ` Christoph Hellwig
2012-03-03 11:37 ` Rafał Kupka
2012-03-03 11:44 ` Christoph Hellwig
2012-03-05 21:40 ` Rafał Kupka
2012-03-06 20:30 ` Dave Chinner
2012-03-08 19:10 ` Rafał Kupka
2012-03-26 16:05 ` Rafał Kupka
2012-03-27 3:28 ` Dave Chinner
2012-03-27 10:17 ` Dave Chinner
2012-03-30 16:02 ` Rafał Kupka
2012-04-04 9:19 ` Rafał Kupka