Hang/soft lockup in d_invalidate with simultaneous calls

* Hang/soft lockup in d_invalidate with simultaneous calls
@ 2017-05-16  0:05 Khazhismel Kumykov
  2017-05-17 21:58 ` Khazhismel Kumykov
  0 siblings, 1 reply; 10+ messages in thread
From: Khazhismel Kumykov @ 2017-05-16  0:05 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 2245 bytes --]

Hi,

I'm seeing behavior in d_invalidate, if multiple threads call d_invalidate on
the same tree at the same, behavior time blows up and all the calls hang with
large enough trees/enough simultaneous callers. (e.g. a directory w/ 100k
entries in d_subdir, and 5 or so threads calling d_invalidate was able to hang
my test VMs)

This seems to be due to thrashing on the dentry locks in multiple threads
tightly looping calling d_walk waiting for a shrink_dentry_list call (also
thrashing the locks) to complete. (d_invalidate loops calling d_walk so long as
any dentry in the tree is in a dcache shrink list).

There was a similar report recently "soft lockup in d_invalidate()" that
proposed in the d_invalidate d_walk to ignore dentries marked as in a shrink
list already, which does prevent this hang/lockup in this case as well, although
I'm not sure it doesn't violate the contract for d_invalidate. (Although the
entries in a shrink list should be dropped eventually, not necessarily by the
time d_invalidate returns).

If someone more familiar with the dcache could provide input I would appreciate.

A reliable repro on mainline is:
 - create a mountpoint with DCACHE_OP_REVALIDATE, e.g. fuse passthrough
 - create a directory and populate with ~100k files with content to
populate dcache
 - create some background processes opening/reading files in this folder (5
      while true; cat $file was enough to get an indefinite hang for me)
 - cause the directory to need to be invalidated (e.g., make_bad_inode on the
    directory)

This results in the background processes all entering d_invalidate and hanging,
while with just one process in d_invalidate (e.g., stat'ing a file in the dir)
things go pretty quickly as expected.


(The proposed patch from other thread)

diff --git a/fs/dcache.c b/fs/dcache.c
index 7b8feb6..3a3b0f37 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1364,7 +1364,7 @@ static enum d_walk_ret select_collect(void *_data,
struct dentry *dentry)
                 goto out;

         if (dentry->d_flags & DCACHE_SHRINK_LIST) {
-               data->found++;
+               goto out;
         } else {
                 if (dentry->d_flags & DCACHE_LRU_LIST)
                         d_lru_del(dentry);


khazhy

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4843 bytes --]

^ permalink raw reply related	[flat|nested] 10+ messages in thread