linux-fsdevel.vger.kernel.org archive mirror
* dcache endless loop in d_invalidate
@ 2018-10-16 11:15 Martin Schwidefsky
  2018-10-25 11:43 ` Martin Schwidefsky
  0 siblings, 1 reply; 2+ messages in thread
From: Martin Schwidefsky @ 2018-10-16 11:15 UTC
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel

Hi Al,

I am currently looking into a customer dump and found what looks like
an issue in the dcache code. And I think the following commit of yours
has something to do with it:

commit fe91522a7ba82ca1a51b07e19954b3825e4aaa22
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat May 3 00:02:25 2014 -0400

    don't remove from shrink list in select_collect()

    If we find something already on a shrink list, just increment
    data->found and do nothing else.  Loops in shrink_dcache_parent() and
    check_submounts_and_drop() will do the right thing - everything we
    did put into our list will be evicted and if there had been nothing,
    but data->found got non-zero, well, we have somebody else shrinking
    those guys; just try again.

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

The dump I got is based on kernel v4.4 but the affected dcache functions
look identical to the upstream version. Here is what I found in the dump:

A lot of "rcu_sched kthread starved for <xxx> jiffies!" messages
Only one CPU, currently running process "run-crons" task 0x65a8008
It just called check_and_drop from d_walk, full backchain:

    PSW.addr   check_and_drop at 30a0e8
    %r14       d_walk at 308202
 #0 [35b87b88] d_invalidate at 3096e8
 #1 [35b87bd8] proc_flush_task at 37190c
 #2 [35b87c58] release_task at 13f202
 #3 [35b87cc8] wait_task_zombie at 13fc36
 #4 [35b87d50] wait_consider_task at 140150
 #5 [35b87dc0] do_wait at 1403de
 #6 [35b87e18] sys_wait4 at 14181e
 #7 [35b87ea8] system_call at 659ec4

The task's runtime is
  sum_exec_runtime 26813717162347 # nsec = 26813 seconds,
  utime = 3991252 # cputime = 974 microseconds,
  stime = 99132516783832 # cputime = 24202 seconds,
Task 0x65a8008 has TIF_NEED_RESCHED set
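
The conversions behind the runtime figures above can be checked with a few
lines of C. This is only a sketch: it assumes the dump comes from s390 and
that cputime values there count units of 2^-12 microseconds (neither is
stated in the mail), and the toy_ helper names are hypothetical. Under that
assumption stime comes to about 24202 seconds, while the utime value comes
to about 974 microseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one microsecond equals 2^12 cputime units (s390 CPU timer). */
#define TOY_CPUTIME_SHIFT 12
static_assert(TOY_CPUTIME_SHIFT < 64, "shift must fit in a 64-bit value");

/* Hypothetical helper: convert a cputime value to whole microseconds. */
static uint64_t toy_cputime_to_usecs(uint64_t cputime)
{
    return cputime >> TOY_CPUTIME_SHIFT;
}

/* Hypothetical helper: convert nanoseconds to whole seconds. */
static uint64_t toy_nsecs_to_secs(uint64_t nsecs)
{
    return nsecs / 1000000000ULL;
}
```

Feeding sum_exec_runtime to toy_nsecs_to_secs() and stime to
toy_cputime_to_usecs() shows that nearly all of the task's runtime was
spent in the kernel.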

d_walk() just called check_and_drop() via the finish() function pointer;
check_and_drop() will return and d_walk() will return as well.
This looks like an endless loop in d_invalidate().

The (struct dentry *) dentry in d_invalidate() is at 0x3cb15858
The struct detach_data data in d_invalidate() is at 0x35b87c28

dentry tree starting @ 0x3cb15858 has two entries in d_subdirs:
0x3cb15858  d_name.name: "11898"
        0xb940d3d8 d_name.name: "cmdline"
        0xb940dd98 d_name.name: "status"

crash> px *(struct dentry *) 0x3cb15858 | grep d_flags
  d_flags = 0x2000cc,

crash> px *(struct dentry *) 0xb940d3d8 | grep d_flags
  d_flags = 0x48048c,  # DCACHE_SHRINK_LIST is set

crash> px *(struct dentry *) 0xb940dd98 | grep d_flags
  d_flags = 0x48048c,  # DCACHE_SHRINK_LIST is set

crash> px *(struct detach_data *) 0x35b87c28
$29 = {
  select = {
    start = 0x3cb15858,
    dispose = {
      next = 0x35b87c30,
      prev = 0x35b87c30
    },
    found = 0x2
  },
  mountpoint = 0x0
}

select_collect(), called from detach_and_collect(), will increment
data.select.found in the struct detach_data @ 0x35b87c28 but will not
add any dentries to the dispose list. The shrink_dentry_list() call in
d_invalidate() will do nothing as the dispose list is empty. The two
dentries 0xb940d3d8 and 0xb940dd98 are still there. After d_walk() returns,
d_invalidate() finds data.mountpoint == NULL and data.select.found == 2,
so it will start the loop again without progress.

As this is a single-CPU system without kernel preemption, there is nobody
else that will do the shrinking of those dcache entries.
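
The retry behaviour described above can be sketched as a small userspace
model. This is not kernel code: the toy_ names, the flag values and the pass
cap are hypothetical stand-ins, chosen only to show that a walk which counts
children on a foreign shrink list, but never collects them, makes no
progress when nobody else runs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SHRINK_LIST 0x1u  /* stands in for DCACHE_SHRINK_LIST */
#define TOY_GONE        0x2u  /* child has been evicted */

struct toy_dentry {
    unsigned int flags;
};

/* One walk over the children: a child already on somebody else's shrink
 * list is only counted; any other live child is counted and evicted. */
static void toy_select(struct toy_dentry *kids, size_t n,
                       size_t *found, size_t *collected)
{
    *found = 0;
    *collected = 0;
    for (size_t i = 0; i < n; i++) {
        if (kids[i].flags & TOY_GONE)
            continue;
        (*found)++;
        if (!(kids[i].flags & TOY_SHRINK_LIST)) {
            kids[i].flags |= TOY_GONE;  /* our own dispose list handles it */
            (*collected)++;
        }
    }
}

/* Retry until a walk finds nothing. Returns the number of passes used,
 * or -1 if max_passes walks were not enough (the model's "endless loop"). */
static int toy_invalidate(struct toy_dentry *kids, size_t n, int max_passes)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        size_t found, collected;

        toy_select(kids, n, &found, &collected);
        assert(collected <= found);
        if (found == 0)
            return pass;
        /* found != 0: go around again, even if nothing was collected */
    }
    return -1;
}
```

With two children that both carry TOY_SHRINK_LIST, toy_invalidate() uses up
every pass it is given; with two unflagged children it finishes on the
second pass.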

In short, this if-statement in select_collect:

        if (dentry->d_flags & DCACHE_SHRINK_LIST) {
                data->found++;
        }

with the assumption that "somebody else" will do the shrinking seems broken.

Do you agree?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

* Re: dcache endless loop in d_invalidate
  2018-10-16 11:15 dcache endless loop in d_invalidate Martin Schwidefsky
@ 2018-10-25 11:43 ` Martin Schwidefsky
  0 siblings, 0 replies; 2+ messages in thread
From: Martin Schwidefsky @ 2018-10-25 11:43 UTC
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel

On Tue, 16 Oct 2018 13:15:28 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> In short, this if-statement in select_collect:
> 
>         if (dentry->d_flags & DCACHE_SHRINK_LIST) {
>                 data->found++;
>         }
> 
> with the assumption that "somebody else" will do the shrinking seems broken.
> 
> Do you agree?

If I am not mistaken, this problem should be fixed by upstream commit
4fb4887140 "restore cond_resched() in shrink_dcache_parent()"
which goes on top of
ff17fa561a "d_invalidate(): unhash immediately"

Due to the cond_resched() the task that set DCACHE_SHRINK_LIST for the
remaining two dcache entries will be scheduled eventually. This will
allow the task waiting for the deletion of these dcache entries 
to continue, although some CPU cycles may get wasted.
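
The effect of the added cond_resched() can be illustrated with a toy
userspace model. It is a sketch under stated assumptions, not the upstream
patch: the toy_ names are hypothetical, and the "other task" is modelled as
a plain function that runs only when the looping task yields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SHRINK_LIST 0x1u  /* stands in for DCACHE_SHRINK_LIST */
#define TOY_GONE        0x2u  /* child has been evicted */

struct toy_dentry {
    unsigned int flags;
};

/* Hypothetical other task: once it gets the CPU, it evicts every child
 * that sits on its shrink list. */
static void toy_other_shrinker(struct toy_dentry *kids, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (kids[i].flags & TOY_SHRINK_LIST)
            kids[i].flags = TOY_GONE;
}

/* Number of children that have not been evicted yet. */
static size_t toy_count_live(const struct toy_dentry *kids, size_t n)
{
    size_t live = 0;

    for (size_t i = 0; i < n; i++)
        if (!(kids[i].flags & TOY_GONE))
            live++;
    return live;
}

/* Retry loop; with yield set, each pass gives the other task a chance to
 * run, which is the role cond_resched() plays in this model.
 * Returns the number of passes used, or -1 if max_passes ran out. */
static int toy_invalidate(struct toy_dentry *kids, size_t n,
                          bool yield, int max_passes)
{
    assert(max_passes > 0);
    for (int pass = 1; pass <= max_passes; pass++) {
        if (toy_count_live(kids, n) == 0)
            return pass;
        if (yield)
            toy_other_shrinker(kids, n);
    }
    return -1;
}
```

Without the yield the model spins until the pass cap is reached; with it,
one pass is wasted before the other task has cleaned up, which corresponds
to the wasted CPU cycles mentioned above.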

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
