From: Scott Mayhew <smayhew@redhat.com>
To: trond.myklebust@hammerspace.com, anna.schumaker@netapp.com
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH 0/1] Fix nfs4_slot use-after-free by FREE_STATEID
Date: Mon, 1 Nov 2021 16:06:22 -0400 [thread overview]
Message-ID: <20211101200623.2635785-1-smayhew@redhat.com> (raw)
My attempts at reproducing the nfsd soft-lockups during callback
processing have been hampered by frequent client panics while testing.
My reproducer is as follows:
1. Use netem to add some latency to the connection (without this, the
panics do not occur):
# tc qdisc add dev tun0 root netem delay 14ms
2. Run nfstest_delegation:
# ./nfstest_delegation --server smayhew-rhel8 --export /export/xfs/smayhew/nfstest --nfsversion 4.2 --sec=krb5 --nconnect 16
The client will typically panic before completing the test run.
This is from a vmcore without SLUB debugging enabled:
crash> bt
PID: 1623 TASK: ffff93fb40294a40 CPU: 0 COMMAND: "kworker/u4:2"
#0 [ffffa39f80003d68] machine_kexec at ffffffffac45f898
#1 [ffffa39f80003db8] __crash_kexec at ffffffffac59d02d
#2 [ffffa39f80003e80] panic at ffffffffacddb718
#3 [ffffa39f80003f00] watchdog_timer_fn.cold at ffffffffacde5645
#4 [ffffa39f80003f28] __hrtimer_run_queues at ffffffffac57a53a
#5 [ffffa39f80003f88] hrtimer_interrupt at ffffffffac57b1ec
#6 [ffffa39f80003fd8] __sysvec_apic_timer_interrupt at ffffffffac45687c
#7 [ffffa39f80003ff0] sysvec_apic_timer_interrupt at fffffffface24b7d
--- <IRQ stack> ---
#8 [ffffa39f80663cb8] sysvec_apic_timer_interrupt at fffffffface24b7d
[exception RIP: unknown or invalid address]
RIP: ffff93fb46a0a974 RSP: ffff93fa46924540 RFLAGS: 00001000
RAX: ffff93fb41cf5600 RBX: 0000000000000000 RCX: ffff93fb424c0cd0
RDX: ffff93fb424c0cd8 RSI: ffffffffc0e3cde0 RDI: 00000000000001aa
RBP: fffffffface24b1b R8: ffff93fa46924540 R9: 0000000000000000
R10: ffffffffad000cc2 R11: 0000000000000000 R12: fffffffface23394
R13: 0000000000000000 R14: ffff93fb401c8200 R15: ffff93fb40067730
ORIG_RAX: 0000000000000007 CS: a58cc9ea96350 SS: 0001
WARNING: possibly bogus exception frame
#9 [ffffa39f80663d78] native_queued_spin_lock_slowpath at ffffffffac54664d
#10 [ffffa39f80663da0] _raw_spin_lock at fffffffface329da
#11 [ffffa39f80663da8] rpc_wake_up_first_on_wq at ffffffffc0467341 [sunrpc]
#12 [ffffa39f80663dd8] nfs41_wake_and_assign_slot at ffffffffc0e3d559 [nfsv4]
#13 [ffffa39f80663de0] nfs41_release_slot at ffffffffc0e13f77 [nfsv4]
#14 [ffffa39f80663e18] nfs41_free_stateid_done at ffffffffc0e1c018 [nfsv4]
#15 [ffffa39f80663e30] rpc_exit_task at ffffffffc045d3b8 [sunrpc]
#16 [ffffa39f80663e40] __rpc_execute at ffffffffc046708e [sunrpc]
#17 [ffffa39f80663e70] rpc_async_schedule at ffffffffc04672a9 [sunrpc]
#18 [ffffa39f80663e88] process_one_work at ffffffffac4fee8b
#19 [ffffa39f80663ed0] worker_thread at ffffffffac4ff083
#20 [ffffa39f80663f10] kthread at ffffffffac505dcf
#21 [ffffa39f80663f50] ret_from_fork at ffffffffac4034f2
Find the rpc_task so we can find the nfs4_slot:
crash> rpc_task.tk_calldata ffff93fb51899e00
tk_calldata = 0xffff93fa472b24e0
crash> nfs_free_stateid_data.res 0xffff93fa472b24e0
res = {
seq_res = {
sr_slot = 0xffff93fa46924540,
sr_timestamp = 0x10017679e,
sr_status = 0x1,
sr_status_flags = 0x0,
sr_highest_slotid = 0x41991000,
sr_target_highest_slotid = 0xffff93fa
},
status = 0x42de63c0
}
Note the slab has been corrupted ("invalid freepionter" message):
crash> kmem 0xffff93fa46924540
CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME
ffff93fb40042500 64 13992 14848 232 4k kmalloc-64
SLAB MEMORY NODE TOTAL ALLOCATED FREE
fffff4adc01a4900 ffff93fa46924000 0 64 24 40
FREE / [ALLOCATED]
kmem: kmalloc-64: slab: fffff4adc01a4900 invalid freepointer: 7a1b527446924a20
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffff4adc01a4900 6924000 ffff93fb40042500 ffff93fa46924a40 1 fffffc0000200 slab
This is from a vmcore captured with "slub_debug=PU,kmalloc-64":
crash> bt
PID: 8 TASK: ffff8b7d80233180 CPU: 0 COMMAND: "kworker/u4:0"
#0 [ffffa7208004bac0] machine_kexec at ffffffffac05f898
#1 [ffffa7208004bb10] __crash_kexec at ffffffffac19d02d
#2 [ffffa7208004bbd8] crash_kexec at ffffffffac19e234
#3 [ffffa7208004bbe0] oops_end at ffffffffac022b47
#4 [ffffa7208004bc00] exc_general_protection at ffffffffaca22d09
#5 [ffffa7208004bca0] asm_exc_general_protection at ffffffffacc00a0e
[exception RIP: nfs4_xdr_dec_free_stateid+0x54]
RIP: ffffffffc0e93fb4 RSP: ffffa7208004bd50 RFLAGS: 00010282
RAX: 6b6b6b6b6b6b6b6b RBX: ffff8b7d8551cc90 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8b7d89a99f18 RDI: ffff8b7d89a99ef0
RBP: ffffa7208004bdd8 R8: 0000000000000000 R9: ffff8b7c84e5e900
R10: ffff8b7c82bb8e20 R11: ffff8b7d8994d240 R12: 0000000000000000
R13: ffff8b7d8551c300 R14: ffffa7208004bdd8 R15: ffff8b7cb2ea5a80
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#6 [ffffa7208004bd98] gss_unwrap_resp at ffffffffc0640965 [auth_rpcgss]
#7 [ffffa7208004bdd0] call_decode at ffffffffc04c4336 [sunrpc]
#8 [ffffa7208004be40] __rpc_execute at ffffffffc04dd08e [sunrpc]
#9 [ffffa7208004be70] rpc_async_schedule at ffffffffc04dd2a9 [sunrpc]
#10 [ffffa7208004be88] process_one_work at ffffffffac0fee8b
#11 [ffffa7208004bed0] worker_thread at ffffffffac0ff083
#12 [ffffa7208004bf10] kthread at ffffffffac105dcf
#13 [ffffa7208004bf50] ret_from_fork at ffffffffac0034f2
crash> rpc_task.tk_calldata ffff8b7d813c1800
tk_calldata = 0xffff8b7d8551cc60
crash> nfs_free_stateid_data.res 0xffff8b7d8551cc60
res = {
seq_res = {
sr_slot = 0xffff8b7d89994d80,
sr_timestamp = 0xffff6ac6,
sr_status = 0x1,
sr_status_flags = 0x0,
sr_highest_slotid = 0x81986c00,
sr_target_highest_slotid = 0xffff8b7d
},
status = 0x8551c120
}
Note the slot has been overwritten with POISON_FREE:
crash> nfs4_slot 0xffff8b7d89994d80
struct nfs4_slot {
table = 0x6b6b6b6b6b6b6b6b,
next = 0x6b6b6b6b6b6b6b6b,
generation = 0x6b6b6b6b6b6b6b6b,
slot_nr = 0x6b6b6b6b,
seq_nr = 0x6b6b6b6b,
seq_nr_last_acked = 0x6b6b6b6b,
seq_nr_highest_sent = 0x6b6b6b6b,
privileged = 0x1,
seq_done = 0x1
}
Let's find the tracking info for who last freed the object:
crash> kmem 0xffff8b7d89994d80
CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME
ffff8b7d80042500 64 12444 12600 600 8k kmalloc-64
SLAB MEMORY NODE TOTAL ALLOCATED FREE
ffffceb8c4266500 ffff8b7d89994000 0 21 18 3
FREE / [ALLOCATED]
ffff8b7d89994d80
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffceb8c4266500 109994000 ffff8b7d80042500 ffff8b7d89994c00 1 17ffffc0010200 slab,head
(See freepointer_outside_object() in mm/slub.c)
crash> struct kmem_cache.offset,inuse ffff8b7d80042500
offset = 0x40
inuse = 0x40
(object + s->inuse + sizeof(void *) + sizeof(struct track))
crash> px 0xffff8b7d89994d80+0x40+0x8+0x98
$5 = 0xffff8b7d89994e60
crash> struct track 0xffff8b7d89994e60
struct track {
addr = 0xffffffffc0eadb6f,
addrs = {0xffffffffac3250b1, 0xffffffffc0eadb6f, 0xffffffffc0eabb94, 0xffffffffc0eabfc4, 0xffffffffc0e137e4, 0xffffffffc0e1f62d, 0xffffffffac35f776, 0xffffffffac384081, 0xffffffffac10320f, 0xffffffffac173c92, 0xffffffffac173d29, 0xffffffffaca25822, 0xffffffffaca222f8, 0xffffffffacc0007c, 0x0, 0xffffffffacc0007c},
cpu = 0x0,
pid = 0x69e,
when = 0xffff6b23
}
crash> kmem 0xffffffffc0eadb6f
ffffffffc0eadb6f (T) nfs4_destroy_session+0x7f [nfsv4] /export/xfs/smayhew/linux/fs/nfs/nfs4session.c: 583
VMAP_AREA VM_STRUCT ADDRESS RANGE SIZE
ffff8b7d8995a380 ffff8b7d83cf5500 ffffffffc0e7e000 - ffffffffc0f64000 942080
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffceb8c0ca7a80 329ea000 0 0 1 fffffc0000000
crash> bt 0x69e
PID: 1694 TASK: ffff8b7d88f4b180 CPU: 0 COMMAND: "umount.nfs4"
#0 [ffffa72080657be0] __schedule at ffffffffaca2d730
#1 [ffffa72080657c58] schedule at ffffffffaca2dab4
#2 [ffffa72080657c70] rpc_wait_bit_killable at ffffffffc04d2fde [sunrpc]
#3 [ffffa72080657c88] __wait_on_bit at ffffffffaca2deea
#4 [ffffa72080657cc0] out_of_line_wait_on_bit at ffffffffaca2dfe2
#5 [ffffa72080657d10] __rpc_execute at ffffffffc04dd0f2 [sunrpc]
#6 [ffffa72080657d40] rpc_execute at ffffffffc04dd50f [sunrpc]
#7 [ffffa72080657d60] rpc_run_task at ffffffffc04c47f6 [sunrpc]
#8 [ffffa72080657da0] rpc_call_sync at ffffffffc04c4f01 [sunrpc]
#9 [ffffa72080657e00] nfs4_destroy_clientid at ffffffffc0e8febf [nfsv4]
#10 [ffffa72080657e50] nfs4_free_client at ffffffffc0eabfc4 [nfsv4]
#11 [ffffa72080657e60] nfs_free_server at ffffffffc0e137e4 [nfs]
#12 [ffffa72080657e70] nfs_kill_super at ffffffffc0e1f62d [nfs]
#13 [ffffa72080657e90] deactivate_locked_super at ffffffffac35f776
#14 [ffffa72080657ea8] cleanup_mnt at ffffffffac384081
#15 [ffffa72080657ed0] task_work_run at ffffffffac10320f
#16 [ffffa72080657ef0] exit_to_user_mode_loop at ffffffffac173c92
#17 [ffffa72080657f10] exit_to_user_mode_prepare at ffffffffac173d29
#18 [ffffa72080657f28] syscall_exit_to_user_mode at ffffffffaca25822
#19 [ffffa72080657f38] do_syscall_64 at ffffffffaca222f8
#20 [ffffa72080657f50] entry_SYSCALL_64_after_hwframe at ffffffffacc0007c
RIP: 00007f64910ef74b RSP: 00007ffcc98b9088 RFLAGS: 00000202
RAX: 0000000000000000 RBX: 000055801513abe0 RCX: 00007f64910ef74b
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000558015139640
RBP: 00005580151382c0 R8: 0000000000000001 R9: 0000000000000000
R10: 000055801513aad0 R11: 0000000000000202 R12: 0000000000000001
R13: 0000558015139640 R14: 00005580151383d0 R15: 0000558015138410
ORIG_RAX: 00000000000000a6 CS: 0033 SS: 002b
So... the application ran 'umount -f' which called
nfs_kill_super
nfs_free_server
nfs_put_client
nfs4_free_client
nfs4_shutdown_client
nfs41_shutdown_client
nfs4_destroy_session
nfs4_destroy_session_slot_tables
nfs4_shutdown_slot_table
nfs4_release_slot_table
nfs4_shrink_slot_table <--- all slots freed
nfs_free_client
rpc_shutdown_client
rpc_killall_tasks <--- tasks will call rpc_exit_task which will call rpc_call_done. For v4.1+ this will call nfs41_sequence_done which will call nfs41_sequence_process and nfs41_sequence_free_slot.
Scott Mayhew (1):
nfs4: take a reference on the nfs_client when running FREE_STATEID
fs/nfs/nfs4proc.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--
2.31.1
next reply other threads:[~2021-11-01 20:06 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-01 20:06 Scott Mayhew [this message]
2021-11-01 20:06 ` [PATCH 1/1] nfs4: take a reference on the nfs_client when running FREE_STATEID Scott Mayhew
2021-11-02 16:28 ` Trond Myklebust
2021-11-02 20:11 ` Scott Mayhew
2021-11-02 20:51 ` Trond Myklebust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211101200623.2635785-1-smayhew@redhat.com \
--to=smayhew@redhat.com \
--cc=anna.schumaker@netapp.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.