From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: dcache endless loop in d_invalidate
Date: Tue, 16 Oct 2018 13:15:28 +0200
Message-ID: <20181016131528.6aac4876@mschwideX1>

Hi Al,

I am currently looking into a customer dump and found what looks like
an issue in the dcache code. And I think the following commit of yours
has something to do with it:

commit fe91522a7ba82ca1a51b07e19954b3825e4aaa22
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat May 3 00:02:25 2014 -0400

    don't remove from shrink list in select_collect()

    If we find something already on a shrink list, just increment
    data->found and do nothing else.  Loops in shrink_dcache_parent() and
    check_submounts_and_drop() will do the right thing - everything we
    did put into our list will be evicted and if there had been nothing,
    but data->found got non-zero, well, we have somebody else shrinking
    those guys; just try again.

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

The dump I got is based on kernel v4.4 but the affected dcache functions
look identical to the upstream version. Here is what I found in the dump:

There are a lot of "rcu_sched kthread starved for <xxx> jiffies!" messages.
The system has only one CPU, which is currently running the process
"run-crons", task 0x65a8008. That task just called check_and_drop() from
d_walk(); the full backchain:

    PSW.addr   check_and_drop at 30a0e8
    %r14       d_walk at 308202
 #0 [35b87b88] d_invalidate at 3096e8
 #1 [35b87bd8] proc_flush_task at 37190c
 #2 [35b87c58] release_task at 13f202
 #3 [35b87cc8] wait_task_zombie at 13fc36
 #4 [35b87d50] wait_consider_task at 140150
 #5 [35b87dc0] do_wait at 1403de
 #6 [35b87e18] sys_wait4 at 14181e
 #7 [35b87ea8] system_call at 659ec4

The task's runtime is
  sum_exec_runtime 26813717162347 # nsec = 26813 seconds,
  utime = 3991252 # cputime = 974 seconds,
  stime = 99132516783832 # cputime = 24202 seconds,
Task 0x65a8008 has TIF_NEED_RESCHED set

d_walk() just called check_and_drop() via the finish() function pointer;
check_and_drop() will return and d_walk() will return as well.
This looks like an endless loop in d_invalidate().

The (struct dentry *) dentry in d_invalidate() is at 0x3cb15858
The struct detach_data data in d_invalidate() is at 0x35b87c28

dentry tree starting @ 0x3cb15858 has two entries in d_subdirs:
0x3cb15858  d_name.name: "11898"
        0xb940d3d8 d_name.name: "cmdline"
        0xb940dd98 d_name.name: "status"

crash> px *(struct dentry *) 0x3cb15858 | grep d_flags
  d_flags = 0x2000cc,

crash> px *(struct dentry *) 0xb940d3d8 | grep d_flags
  d_flags = 0x48048c,  # DCACHE_SHRINK_LIST is set

crash> px *(struct dentry *) 0xb940dd98 | grep d_flags
  d_flags = 0x48048c,  # DCACHE_SHRINK_LIST is set
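
For anyone who wants to redo the flag check by hand, here is a minimal
user-space sketch. The value 0x400 for DCACHE_SHRINK_LIST is an assumption
taken from my reading of include/linux/dcache.h and should be verified
against the source of the dumped kernel; on_shrink_list() is a hypothetical
helper, not a kernel function.

```c
#include <stdbool.h>

/* Assumed value of DCACHE_SHRINK_LIST; check include/linux/dcache.h
 * of the kernel the dump was taken from. */
#define DCACHE_SHRINK_LIST 0x00000400u

/* Hypothetical helper: true if d_flags says the dentry sits on a shrink list. */
static bool on_shrink_list(unsigned int d_flags)
{
	return (d_flags & DCACHE_SHRINK_LIST) != 0;
}
```

With that assumed value, on_shrink_list() is true for 0x48048c (both
children) and false for 0x2000cc (the parent "11898").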

crash> px *(struct detach_data *) 0x35b87c28
$29 = {
  select = {
    start = 0x3cb15858,
    dispose = {
      next = 0x35b87c30,
      prev = 0x35b87c30
    },
    found = 0x2
  },
  mountpoint = 0x0
}
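
The dispose list above is empty: next and prev both point back at the list
head itself, which sits 8 bytes into the structure. A small sketch of that
check follows. The struct layouts are user-space stand-ins that assume
64-bit pointers and the member order shown in the crash output; they are
not copied from the kernel source, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel types, with addresses stored as plain integers
 * so that values from the dump can be plugged in directly. */
struct list_head   { uint64_t next, prev; };
struct select_data { uint64_t start; struct list_head dispose; int found; };
struct detach_data { struct select_data select; uint64_t mountpoint; };

/* Address of data.select.dispose, given the address of data. */
static uint64_t dispose_addr(uint64_t data_addr)
{
	return data_addr + offsetof(struct detach_data, select)
			 + offsetof(struct select_data, dispose);
}

/* Same idea as the kernel's list_empty(): head->next points at head. */
static bool dump_list_empty(uint64_t head_addr, const struct list_head *head)
{
	return head->next == head_addr;
}
```

For the dump, dispose_addr(0x35b87c28) gives 0x35b87c30, which matches both
next and prev, so the list is empty although found is 2.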

select_collect(), called from detach_and_collect(), will increment
data.select.found in the struct detach_data @ 0x35b87c28 but will not
add any dentries to the dispose list. The shrink_dentry_list() call in
d_invalidate() will do nothing as the dispose list is empty. The two
dentries 0xb940d3d8 and 0xb940dd98 are still there. After d_walk() returns,
d_invalidate() finds data.mountpoint == NULL and data.select.found == 2,
so it will start the loop again without having made any progress.

As this is a single CPU system without kernel preemption there is nobody
else that will do the shrinking of those dcache entries.
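
To make the lack of progress concrete, here is a toy user-space model of
the retry loop. It is a heavily simplified sketch, not the kernel code:
struct toy_dentry, walk_once() and toy_invalidate() are hypothetical names,
the flag value 0x400 is assumed, locking and mountpoints are left out, and
the loop is capped at max_passes so that it terminates.

```c
#include <stddef.h>

#define DCACHE_SHRINK_LIST 0x00000400u	/* assumed value, see dcache.h */

struct toy_dentry {
	unsigned int d_flags;
	int alive;		/* 1 while the dentry still exists */
};

/* One d_walk() pass with select_collect()-like behaviour.
 * Returns the number of dentries "found". */
static int walk_once(struct toy_dentry *kids, size_t n)
{
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		if (!kids[i].alive)
			continue;
		found++;
		/* Already on somebody else's shrink list: only count it. */
		if (kids[i].d_flags & DCACHE_SHRINK_LIST)
			continue;
		/* Otherwise: stands in for adding the dentry to our dispose
		 * list and evicting it via shrink_dentry_list(). */
		kids[i].alive = 0;
	}
	return found;
}

/* Returns the pass on which nothing was found any more, or -1 if
 * max_passes went by without getting there. */
static int toy_invalidate(struct toy_dentry *kids, size_t n, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		if (walk_once(kids, n) == 0)
			return pass;
		/* Nobody else runs in between: single CPU, no preemption. */
	}
	return -1;
}
```

With two children whose d_flags are 0x48048c the model never finishes, no
matter how large max_passes is; with the shrink-list bit cleared it is done
on the second pass.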

In short, this if-statement in select_collect:

        if (dentry->d_flags & DCACHE_SHRINK_LIST) {
                data->found++;
        }

with the assumption that "somebody else" will do the shrinking seems broken.

Do you agree?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

