RFC: what to do when trans_roll fails and we bheld buffers? (crash in xfs/141)

* RFC: what to do when trans_roll fails and we bheld buffers? (crash in xfs/141)
@ 2019-04-23 16:49 Darrick J. Wong
  2019-04-23 18:08 ` Brian Foster
  0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2019-04-23 16:49 UTC (permalink / raw)
  To: xfs; +Cc: Brian Foster, Allison Collins

Hi all,

I saw a crash in xfs/141 last night while testing xfs for-next on a V4
filesystem.  I think the problem here is that:

 1. xfs_attr_set calls xfs_attr_set_args.

 2. xfs_attr_set_args calls xfs_attr_shortform_to_leaf and sets @leaf_bp.

 3. _set_args calls xfs_defer_finish.

 4. xfs_defer_finish calls xfs_defer_trans_roll calls xfs_trans_roll.

 5. xfs_trans_roll sets @args.trans to the new transaction created by
    xfs_trans_dup.

 6. xfs_trans_roll calls __xfs_trans_commit on the old transaction and
    fails because we injected a log error.

 7. _trans_roll passes the error code up to _defer_finish.

 8. _defer_finish passes the error code up to _attr_set without joining
    anything (in particular leaf_bp which we bhold'd so that we would
    retain the buffer lock) to the new transaction.  Note that
    @args.trans is still set to the new transaction; nobody's cancelled it.

 8. xfs_attr_set tries to bail out through out_release_leaf.  The
    leaf_bp is non-null and args.trans is non-null, so we call
    xfs_trans_brelse to release leaf_bp from args.trans

 9. leaf_bp isn't joined to any transaction at all, but we passed in a
    non-null args.trans so _trans_brelse hits the "bp->b_transp == tp"
    assert and then crashes trying to assert on the buffer log item
    which is null because leaf_bp isn't joined to any transaction.

I think the underlying problem here is that we haven't precisely defined
what's supposed to happen when the _trans_roll in _defer_trans_roll
errors out.  Looking at the three callers of trans_bhold (attr setting,
allocating disk space for a new dquot, and online repair) the handling
of this situation is not consistent.

(Embarassingly, the first of those three was RVB'd by me and the latter
two were written by me :P)

Should we attach the bholded buffers to the new transaction even if the
old transaction failed to commit so that callers can xfs_trans_brelse
the buffer and it all works out (like what the attr code does)?

Or should we cancel the new transaction and set *tpp to NULL, which will
also solve this problem for the attr case since args.trans will be null
after the old commit fails?  But then we run into the problem that now
every caller has to be prepared to deal with the transaction being null
if something went wrong.  All of those callers ought to be calling
xfs_trans_cancel if that's the case, so we might be able to get away
with adding a if (!tp) return; at the top of that function.

Or are the callers of _defer_finish/_trans_roll simply supposed to know
that the buffer might not be joined if the roll fails and deal with it
appropriately?

Extra complication: if the rolling succeeds but one of the
defer_ops->finish_item() calls fails then the buffer /will/ be joined to
the transaction being passed out.

I /think/ the easiest answer here is that we ought to join the buffers
to the new transaction even if committing the old one fails.  I also
suspect the out_release code in xrep_roll_ag_trans is wrong, but that's
my problem to fix. :)

--D

run fstests xfs/141 at 2019-04-23 00:01:13
XFS (sdc): Mounting V4 Filesystem
XFS (sdc): Ending clean mount
XFS (sdb): Mounting V4 Filesystem
XFS (sdb): Ending clean mount
XFS (sdb): Quotacheck needed: Please wait.
XFS (sdb): Quotacheck: Done.
XFS (sdb): Injecting error (false) at file fs/xfs/xfs_log.c, line 1924, on filesystem "sdb"
XFS (sdb): Intentionally corrupted log record at LSN 0x100000e0e. Shutdown imminent.
XFS (sdb): metadata I/O error in "xlog_iodone" at daddr 0x23762e len 64 error 0
XFS (sdb): xfs_do_force_shutdown(0x2) called from line 1272 of file fs/xfs/xfs_log.c. Return address = ffffffffa064e48b
XFS: Assertion failed: bp->b_transp == tp, file: fs/xfs/xfs_trans_buf.c, line: 385
potentially unexpected fatal signal 6.
CPU: 1 PID: 1154887 Comm: fsstress Tainted: G        W         5.1.0-rc5-djw #rc5
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
RIP: 0033:0x7fdb91cd9e97
Code: c2 b8 ea 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3b 41 89 c0 41 ba 08 00 00 00 31 d2 4c 89 ce bf 02 00 00 00 b8 0e 00 00 00 0f 05 <48> 8b 8c 24 08 01 00 00 64 48 33 0c 25 28 00 00 00 44 89 c0 75 1f
RSP: 002b:00007ffdb0d0f080 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
RAX: 0000000000000000 RBX: 0000000003aed64a RCX: 00007fdb91cd9e97
RDX: 0000000000000000 RSI: 00007ffdb0d0f080 RDI: 0000000000000002
RBP: 0000000000458923 R08: 0000000000000000 R09: 00007ffdb0d0f080
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000005
R13: 000000000000031e R14: 00007ffdb0d0f3d0 R15: 000055d212696560
FS:  00007fdb928cdb80 GS:  0000000000000000
potentially unexpected fatal signal 6.
CPU: 1 PID: 1154888 Comm: fsstress Tainted: G        W         5.1.0-rc5-djw #rc5
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
RIP: 0033:0x7fdb91cd9e97
Code: c2 b8 ea 00 00 00 0f 05 48 3d 00 f0 ff ff 77 3b 41 89 c0 41 ba 08 00 00 00 31 d2 4c 89 ce bf 02 00 00 00 b8 0e 00 00 00 0f 05 <48> 8b 8c 24 08 01 00 00 64 48 33 0c 25 28 00 00 00 44 89 c0 75 1f
RSP: 002b:00007ffdb0d0eaf0 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007fdb91cd9e97
RDX: 0000000000000000 RSI: 00007ffdb0d0eaf0 RDI: 0000000000000002
RBP: 000055d2128a41c0 R08: 0000000000000000 R09: 00007ffdb0d0eaf0
R10: 0000000000000008 R11: 0000000000000246 R12: 000000000007a120
R13: 0000000051eb851f R14: 00007ffdb0d0f3d0 R15: 000055d21269ec60
FS:  00007fdb928cdb80 GS:  0000000000000000
XFS (sdb): Log I/O Error Detected. Shutting down filesystem
XFS (sdb): Please unmount the filesystem and rectify the problem(s)
WARNING: CPU: 0 PID: 1154885 at fs/xfs/xfs_message.c:104 assfail+0x27/0x2a [xfs]
Modules linked in: dm_snapshot xfs dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c ext2 ext4 mbcache jbd2 dm_flakey ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set_hash_net xt_set ip_set_hash_mac ip_set nfnetlink bfq mq_deadline ip6table_filter ip6_tables iptable_filter dax_pmem_compat device_dax dax_pmem_core nd_pmem sch_fq_codel ip_tables x_tables nfsv4 af_packet [last unloaded: scsi_debug]
CPU: 0 PID: 1154885 Comm: fsstress Tainted: G        W         5.1.0-rc5-djw #rc5
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
RIP: 0010:assfail+0x27/0x2a [xfs]
Code: 0f 0b c3 66 66 66 66 90 48 89 f1 41 89 d0 48 c7 c6 50 7c 6a a0 48 89 fa 31 ff e8 a6 f8 ff ff 80 3d 45 94 0a 00 00 74 02 0f 0b <0f> 0b c3 48 8b b3 a8 02 00 00 48 c7 c7 e8 80 6a a0 c6 05 66 8f 0a
RSP: 0018:ffffc9000194bbb0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa0699d13
RBP: ffff88803c45cb00 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000000a R11: f000000000000000 R12: ffff888060e41b80
R13: 0000000000000010 R14: 000000000000005d R15: 0000000000000000
FS:  00007fdb928cdb80(0000) GS:ffff88803e800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdb9195c000 CR3: 00000000770bd000 CR4: 00000000000006b0
Call Trace:
 xfs_trans_brelse+0x185/0x210 [xfs]
 xfs_attr_set+0x2ce/0x300 [xfs]
 xfs_xattr_set+0x54/0xa0 [xfs]
 __vfs_setxattr+0x66/0x80
 __vfs_setxattr_noperm+0x54/0xf0
 vfs_setxattr+0x81/0xa0
 setxattr+0x146/0x1d0
 path_setxattr+0xd0/0xf0
 __x64_sys_setxattr+0x27/0x30
 do_syscall_64+0x50/0x160
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fdb91db6839
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffdb0d0f318 EFLAGS: 00000246 ORIG_RAX: 00000000000000bc
RAX: ffffffffffffffda RBX: 000000000000005d RCX: 00007fdb91db6839
RDX: 000055d212bd8b60 RSI: 00007ffdb0d0f360 RDI: 000055d212bf59f0
RBP: 0000000000000001 R08: 0000000000000001 R09: 00007ffdb0d0f3d0
R10: 000000000000005d R11: 0000000000000246 R12: 0000000000000001
R13: 000000000000031a R14: 00007ffdb0d0f360 R15: 000055d212bd8b60
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>]           (null)
hardirqs last disabled at (0): [<ffffffff81057804>] copy_process+0x8d4/0x2220
softirqs last  enabled at (0): [<ffffffff81057804>] copy_process+0x8d4/0x2220
softirqs last disabled at (0): [<0000000000000000>]           (null)
---[ end trace f2eeafdb05fd11a1 ]---
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
#PF error: [normal kernel read fault]
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 1154885 Comm: fsstress Tainted: G        W         5.1.0-rc5-djw #rc5
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
RIP: 0010:xfs_trans_brelse+0x48/0x210 [xfs]
Code: 85 4a 01 00 00 48 85 ed 0f 84 8a 00 00 00 66 66 66 66 90 65 8b 05 a8 85 9a 5f 89 c0 48 0f a3 05 06 d4 aa e1 0f 82 ee 00 00 00 <81> 7b 38 3c 12 00 00 0f 85 52 01 00 00 8b 83 a8 00 00 00 85 c0 0f
RSP: 0018:ffffc9000194bbb8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 000000000000000a RDI: ffffffffa0699d13
RBP: ffff88803c45cb00 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000000a R11: f000000000000000 R12: ffff888060e41b80
R13: 0000000000000010 R14: 000000000000005d R15: 0000000000000000
FS:  00007fdb928cdb80(0000) GS:ffff88803e800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 00000000770bd000 CR4: 00000000000006b0
Call Trace:
 xfs_attr_set+0x2ce/0x300 [xfs]
 xfs_xattr_set+0x54/0xa0 [xfs]
 __vfs_setxattr+0x66/0x80
 __vfs_setxattr_noperm+0x54/0xf0
 vfs_setxattr+0x81/0xa0
 setxattr+0x146/0x1d0
 path_setxattr+0xd0/0xf0
 __x64_sys_setxattr+0x27/0x30
 do_syscall_64+0x50/0x160
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fdb91db6839
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffdb0d0f318 EFLAGS: 00000246 ORIG_RAX: 00000000000000bc
RAX: ffffffffffffffda RBX: 000000000000005d RCX: 00007fdb91db6839
RDX: 000055d212bd8b60 RSI: 00007ffdb0d0f360 RDI: 000055d212bf59f0
RBP: 0000000000000001 R08: 0000000000000001 R09: 00007ffdb0d0f3d0
R10: 000000000000005d R11: 0000000000000246 R12: 0000000000000001
R13: 000000000000031a R14: 00007ffdb0d0f360 R15: 000055d212bd8b60
Modules linked in: dm_snapshot xfs dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c ext2 ext4 mbcache jbd2 dm_flakey ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp ip_set_hash_net xt_set ip_set_hash_mac ip_set nfnetlink bfq mq_deadline ip6table_filter ip6_tables iptable_filter dax_pmem_compat device_dax dax_pmem_core nd_pmem sch_fq_codel ip_tables x_tables nfsv4 af_packet [last unloaded: scsi_debug]
Dumping ftrace buffer:
   (ftrace buffer empty)
CR2: 0000000000000038
---[ end trace f2eeafdb05fd11a2 ]---
RIP: 0010:xfs_trans_brelse+0x48/0x210 [xfs]
Code: 85 4a 01 00 00 48 85 ed 0f 84 8a 00 00 00 66 66 66 66 90 65 8b 05 a8 85 9a 5f 89 c0 48 0f a3 05 06 d4 aa e1 0f 82 ee 00 00 00 <81> 7b 38 3c 12 00 00 0f 85 52 01 00 00 8b 83 a8 00 00 00 85 c0 0f
RSP: 0018:ffffc9000194bbb8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 000000000000000a RDI: ffffffffa0699d13
RBP: ffff88803c45cb00 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000000a R11: f000000000000000 R12: ffff888060e41b80
R13: 0000000000000010 R14: 000000000000005d R15: 0000000000000000
FS:  00007fdb928cdb80(0000) GS:ffff88803e800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 00000000770bd000 CR4: 00000000000006b0

^ permalink raw reply	[flat|nested] 7+ messages in thread