* nfsv4.1 deadlock between evict and nfs_fhget when drain session
From: zhangxiaoxu (A) @ 2021-06-01  8:35 UTC
  To: trond.myklebust, Anna Schumaker, Linux NFS Mailing List,
	zhangxiaoxu, zhangyi (F)

Hello,

We're seeing a deadlock on NFSv4.1.

The deadlock sequence may be as follows:
  - task 1: prunes the icache and marks inode_A and inode_B as freeing, then evicts inode_A first, but waits for inode_A's delegation to be returned to the server
  - task 2: opens a file and has already got the fh from the server; it waits for inode_B, which has the same file handle, to finish being freed
  - task 3: the state manager is draining the session, but one slot is still held by task 2
  - task 4: runs the delegreturn rpc_task, but the session is draining, so the delegreturn sleeps in rpc. Task 1 is therefore blocked.
The result is a deadlock.
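
The dependency chain above can be checked with a small wait-for graph. This is only a user-space sketch in Python, not kernel code: the node names and the `find_cycle` helper are made up for illustration, and each edge encodes one "waits for" relationship exactly as described in the list above.

```python
# User-space model of the wait-for relationships in this report; not kernel code.
# An edge "a -> b" means "a cannot make progress until b does".
WAITS_FOR = {
    # evict of inode_A waits for the delegreturn to complete
    "task1_evict": ["task4_delegreturn"],
    # delegreturn sleeps in rpc until the session drain is over
    "task4_delegreturn": ["task3_state_manager"],
    # the drain waits for the slot still held by the open
    "task3_state_manager": ["task2_open"],
    # the open waits for inode_B, which task 1 has marked as freeing
    "task2_open": ["task1_evict"],
}


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Found a back edge: the cycle is the tail of the current path.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


print(find_cycle(WAITS_FOR))
```

Removing any single edge from `WAITS_FOR` makes `find_cycle` return `None`, which is one way to reason about which of the four waits would have to change to break the deadlock.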

Commit 244fcd2f9a90 ("NFS: Ensure we time out if a delegreturn does not complete") already ensures that the delegreturn
task can time out once it has got a slot from the session. But it can't time out if the task sleeps in rpc while the session is draining.

I think commit 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode") introduced this problem.
But if we revert it, there may be another deadlock, because task 1 may be waiting for inode_A's writeback to complete.
If we make delegreturn privileged in rpc, the result is the same as above.

I think task 2 should free the slot as soon as possible once its rpc task completes.
But ae55e59da0e4 ("pnfs: Don't release the sequence slot until we've processed layoutget on open") made the slot be freed later.
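
To show why the slot release point matters, here is a toy model of a session slot table, again as a user-space Python sketch. The `SlotTable` class and its method names are hypothetical and do not correspond to the kernel's data structures. It assumes only what is stated in this mail: the drain cannot finish while a slot is held, and a new request cannot get a slot while the session is draining. Privileged tasks are not modelled.

```python
class SlotTable:
    """Toy session slot table: the drain finishes only once no slot is held."""

    def __init__(self, nslots):
        self.free = set(range(nslots))
        self.held = {}          # slot id -> holder name
        self.draining = False

    def acquire(self, holder):
        # While draining, new requests get no slot (in the report, the
        # delegreturn sleeps in rpc at this point).
        if self.draining or not self.free:
            return None
        slot = self.free.pop()
        self.held[slot] = holder
        return slot

    def release(self, slot):
        del self.held[slot]
        self.free.add(slot)

    def begin_drain(self):
        self.draining = True

    def drained(self):
        return self.draining and not self.held


table = SlotTable(nslots=2)
open_slot = table.acquire("task2_open")        # open keeps its slot after the reply
table.begin_drain()                            # state manager starts draining
print("drained while open holds slot:", table.drained())
print("delegreturn slot:", table.acquire("task4_delegreturn"))
table.release(open_slot)                       # open finally releases its slot
print("drained after release:", table.drained())
```

In this model the drain completes as soon as the open drops its slot, so the later the release happens, the longer both the state manager and the delegreturn stay blocked.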

Any ideas about this problem are welcome.

Stacks of the problem:

# task1:
rpc_wait_bit_killable
__rpc_wait_for_completion_task
_nfs4_proc_delegreturn
nfs4_proc_delegreturn
nfs_do_return_delegation
nfs_inode_return_delegation_noreclaim
nfs4_evict_inode
evict
dispose_list
prune_icache_sb
super_cache_scan
do_shrink_slab
shrink_slab
shrink_node
kswapd
kthread
ret_from_fork

# task2:
__wait_on_freeing_inode
find_inode
ilookup5_nowait
ilookup5
iget5_locked
nfs_fhget
_nfs4_opendata_to_nfs4_state
nfs4_do_open
nfs4_atomic_open
nfs_atomic_open
path_openat
do_filp_open
do_sys_open
__x64_sys_open
do_syscall_64
entry_SYSCALL_64_after_hwframe

# task3:
nfs4_drain_slot_tbl
nfs4_begin_drain_session
nfs4_run_state_manager
kthread
ret_from_fork
