From: Michal Hocko <mhocko@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: Glauber Costa <glommer@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92
Date: Mon, 15 Jul 2013 11:14:28 +0200
Message-ID: <20130715091428.GA26199@dhcp22.suse.cz>
In-Reply-To: <20130704163643.GF7833@dhcp22.suse.cz>

On Thu 04-07-13 18:36:43, Michal Hocko wrote:
> On Wed 03-07-13 21:24:03, Dave Chinner wrote:
> > On Tue, Jul 02, 2013 at 02:44:27PM +0200, Michal Hocko wrote:
> > > On Tue 02-07-13 22:19:47, Dave Chinner wrote:
> > > [...]
> > > > Ok, so it's been leaked from a dispose list somehow. Thanks for the
> > > > info, Michal, it's time to go look at the code....
> > > 
> > > OK, just in case we will need it, I am keeping the machine in this state
> > > for now. So we still can play with crash and check all the juicy
> > > internals.
> > 
> > My current suspect is the LRU_RETRY code. I don't think what it is
> > doing is at all valid - list_for_each_safe() is not safe if you drop
> > the lock that protects the list. i.e. there is nothing that protects
> > the stored next pointer from being removed from the list by someone
> > else. Hence what I think is occurring is this:
> > 
> > 
> > thread 1			thread 2
> > lock(lru)
> > list_for_each_safe(lru)		lock(lru)
> >   isolate			......
> >     lock(i_lock)
> >     has buffers
> >       __iget
> >       unlock(i_lock)
> >       unlock(lru)
> >       .....			(gets lru lock)
> >       				list_for_each_safe(lru)
> > 				  walks all the inodes
> > 				  finds inode being isolated by other thread
> > 				  isolate
> > 				    i_count > 0
> > 				      list_del_init(i_lru)
> > 				      return LRU_REMOVED;
> > 				   moves to next inode, inode that
> > 				   other thread has stored as next
> > 				   isolate
> > 				     i_state |= I_FREEING
> > 				     list_move(dispose_list)
> > 				     return LRU_REMOVED
> > 				 ....
> > 				 unlock(lru)
> >       lock(lru)
> >       return LRU_RETRY;
> >   if (!first_pass)
> >     ....
> >   --nr_to_scan
> >   (loop again using next, which has already been removed from the
> >   LRU by the other thread!)
> >   isolate
> >     lock(i_lock)
> >     if (i_state & ~I_REFERENCED)
> >       list_del_init(i_lru)	<<<<< inode is on dispose list!
> > 				<<<<< inode is now isolated, with I_FREEING set
> >       return LRU_REMOVED;
> > 
> > That fits the corpse left on your machine, Michal. One thread has
> > moved the inode to a dispose list, the other thread thinks it is
> > still on the LRU and should be removed, and removes it.
> > 
> > This also explains the lru item count going negative - the same item
> > is being removed from the lru twice. So it seems like all the
> > problems you've been seeing are caused by this one problem....
> > 
> > Patch below that should fix this.
> 
> Good news! The test has been running since morning and it has neither
> hung nor crashed. So this really looks like the right fix. It will also
> run over the weekend to be 100% sure. But I guess it is safe to say
> 
> Tested-by: Michal Hocko <mhocko@suse.cz>

And I can finally confirm this after testing over the weekend on ext3.

Thanks a lot for your help Dave!
-- 
Michal Hocko
SUSE Labs
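
For readers following the quoted analysis, here is a minimal userspace
sketch of the stale "next" pointer problem it describes. It is not the
kernel code and not the patch mentioned above. The list helpers are
hand-written stand-ins that only borrow the names of the kernel's
list_head helpers, and `stale_next_demo` and `struct demo_result` are
hypothetical names made up for this illustration. The two threads are
simulated sequentially, so the lock itself is only marked by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *n)
{
	n->prev = n->next = n;
}

static void list_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static bool list_contains(const struct node *head, const struct node *n)
{
	for (const struct node *p = head->next; p != head; p = p->next)
		if (p == n)
			return true;
	return false;
}

/* Hypothetical result record for this illustration. */
struct demo_result {
	bool next_on_lru;         /* cached next still on the LRU at retry time? */
	bool next_on_dispose;     /* cached next on the dispose list at retry time? */
	bool dispose_empty_after; /* dispose list emptied by the stale removal? */
	long nr_items;            /* LRU item counter at the end */
};

static struct demo_result stale_next_demo(void)
{
	struct node lru, dispose, a, b;
	struct demo_result r;
	long nr_items = 0;

	list_init(&lru);
	list_init(&dispose);
	list_add_tail(&a, &lru); nr_items++;
	list_add_tail(&b, &lru); nr_items++;

	/* Walker 1: a "safe" iteration caches the successor before visiting cur. */
	struct node *cur = lru.next;
	struct node *next = cur->next;
	assert(cur == &a && next == &b);

	/* Walker 1 drops the lock here; walker 2 takes it and walks the list. */
	list_del_init(&a); nr_items--;   /* a is busy: removed from the LRU */
	list_del_init(&b);               /* b is freeable: moved to dispose */
	list_add_tail(&b, &dispose); nr_items--;

	/* Walker 1 retakes the lock and resumes from its cached next pointer. */
	r.next_on_lru = list_contains(&lru, next);
	r.next_on_dispose = list_contains(&dispose, next);

	/* Believing next is on the LRU, walker 1 removes it a second time. */
	list_del_init(next); nr_items--;

	r.dispose_empty_after = (dispose.next == &dispose);
	r.nr_items = nr_items;
	return r;
}
```

Calling `stale_next_demo()` from a small `main` shows both symptoms from
the thread in one place: the second removal pulls the item off the
dispose list rather than the LRU, and the item counter is decremented
three times for two items.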
